Stream-Static Basic

Description

Suitable for when you have a streaming table that you need to join to one or many additional static tables to derive your desired target data set.

Use when:

  • You have a single streaming table driving the data flow and want to join to one or more other tables.

  • You only need to reflect changes when the driving streaming table updates.

  • The source tables do not share common business keys.

  • You only need to perform basic single row transforms.

Layers:

  • Silver

  • Gold (no complex transforms or aggregations)

Models:

  • 3NF such as ODS, Inmon and Enterprise Models

  • Data Vault

  • Dimensional: dimensions and basic transactional facts

Data Flow Components:

Stream Static - Basic

No.

Component

Description

M / O

1

Input View

Input view created over the streaming source table that will ultimately static join to one or more additional source tables, in the next view below. This view can optionally read from CDF if the source table is CDF enabled.

M

2

View

A view that defines SQL, joining the input view above to one or more additional source tables.

M

3

Append or Change Flow

An Append Flow (for transactional or fact based target tables) or an SCD1/2 Change Flow that loads the data into the final target table.

M

4

Target Table

A streaming table, the schema of which is specified in the dataflowspec. This table is the final target table for the given flow.

M

* M / O: Mandatory or Optional.

Feature Support

Supported

Not Supported

  • Append Only & SCD 1/2

  • Basic transforms such as: - Data type conversion - Concatenation - Single row calculations - Formatting - Cleansing & Data Quality Rules - Conditionals and calculations (single row) across multiple source tables - Joins

  • Complex transforms such as aggregations

  • Window By

Considerations and Limitations

Important

  • Updates in joined tables will not be reflected until a row with matching keys comes through on the driving streaming table.

Sample

  • Bundle: dlt_framework/src/samples/silver_sample

  • Sample: dlt_framework/src/samples/silver_sample/src/dataflows/stream_static_p6

Example Data Flow

Important

  • The Day 3 load below demonstrates a limitation of this pattern. Updates in any of the static tables will only be loaded once corresponding rows (with the same PK’s) in the primary streaming table are updated. This is resolved by the Streaming Data Warehouse pattern below.

Note this behavior may be perfectly acceptable in a given scenario, which is why this pattern remains relevant, as it is a simpler implementation than the Streaming Data Warehouse pattern

Day 1 Load

  • Source Tables (Append-Only)

    CUSTOMER

    customer_id

    first_name

    last_name

    email

    load_timestamp

    1

    John

    Doe

    john.doe@example.com

    2023-01-01 10:00

    2

    Jane

    Smith

    jane.smith@example.com

    2023-01-01 10:00

    CUSTOMER_ADDRESS

    customer_id

    city

    state

    load_timestamp

    1

    Melbourne

    VIC

    2023-01-01 10:00

    2

    Melbourne

    VIC

    2023-01-01 10:00

    4

    Hobart

    TAS

    2023-01-01 10:00

  • Target Table

    • Append-Only Scenario

    customer_id

    first_name

    last_name

    full_name

    email

    city

    state

    load_timestamp

    1

    John

    Doe

    John Doe

    john.doe@example.com

    Melbourne

    VIC

    2023-01-01 10:00

    2

    Jane

    Smith

    Jane Smith

    jane.smith@example.com

    Melbourne

    VIC

    2023-01-01 10:00

    • SCD1 Scenario

    customer_id

    first_name

    last_name

    full_name

    email

    city

    state

    1

    John

    Doe

    John Doe

    john.doe@example.com

    Melbourne

    VIC

    2

    Jane

    Smith

    Jane Smith

    jane.smith@example.com

    Melbourne

    VIC

    • SCD2 Scenario

    customer_id

    first_name

    last_name

    full_name

    email

    city

    state

    _START_AT

    _END_AT

    1

    John

    Doe

    John Doe

    john.doe@example.com

    Melbourne

    VIC

    2023-01-01 10:00

    NULL

    2

    Jane

    Smith

    Jane Smith

    jane.smith@example.com

    Melbourne

    VIC

    2023-01-01 10:00

    NULL

Day 2 Load

  • Source Tables (Append-Only)

    CUSTOMER

    customer_id first_name last_name email load_timestamp
    1 John Doe john.doe@example.com 2023-01-01 10:00
    2 Jane Smith jane.smith@example.com 2023-01-01 10:00
    1 John Doe jdoe@example.com 2023-01-02 10:00
    3 Alice Green alice.green@example.com 2023-01-02 10:00
    4 Joe Bloggs joe.bloggs@example.com 2023-01-02 10:00

    CUSTOMER_ADDRESS

    customer_id city state load_timestamp
    1 Melbourne VIC 2023-01-01 10:00
    2 Melbourne VIC 2023-01-01 10:00
    4 Hobart TAS 2023-01-01 10:00
    2 Perth WA 2023-01-02 10:00
    3 Sydney NSW 2023-01-02 10:00

Target Table

  • Append-Only Scenario

    customer_id first_name last_name full_name email city state load_timestamp
    1 John Doe John Doe john.doe@example.com Melbourne VIC 2023-01-01 10:00
    2 Jane Smith Jane Smith jane.smith@example.com Melbourne VIC 2023-01-01 10:00
    1 John Doe John Doe jdoe@example.com Melbourne VIC 2023-01-02 10:00
    2 Jane Smith Jane Smith jane.smith@example.com Perth WA 2023-01-02 10:00
    3 Alice Green alice.green@example.com alice.green@example.com Sydney NSW 2023-01-02 10:00
    4 Joe Bloggs Joe Bloggs joe.bloggs@example.com Hobart TAS 2023-01-02 10:00
  • SCD1 Scenario

    customer_id first_name last_name full_name email city state
    1 John Doe John Doe jdoe@example.com Melbourne VIC
    2 Jane Smith Jane Smith jane.smith@example.com Perth WA
    3 Alice Green alice.green@example.com alice.green@example.com Sydney NSW
    4 Joe Bloggs Joe Bloggs joe.bloggs@example.com Hobart TAS
  • SCD2 Scenario

    customer_id first_name last_name full_name email city state _START_AT _END_AT
    1 John Doe John Doe jdoe@example.com Melbourne VIC 2023-01-02 10:00 NULL
    1 John Doe John Doe john.doe@example.com Melbourne VIC 2023-01-01 10:00 2023-01-02 10:00
    2 Jane Smith Jane Smith jane.smith@example.com Perth WA 2023-01-02 10:00 NULL
    2 Jane Smith Jane Smith jane.smith@example.com Melbourne VIC 2023-01-01 10:00 2023-01-02 10:00
    3 Alice Green Alice Green alice.green@example.com Sydney NSW 2023-01-01 10:00 NULL
    4 Joe Bloggs Joe Bloggs joe.bloggs@example.com Hobart TAS 2023-01-02 10:00 NULL

    Note

    • Customer 1’s change of location to Brisbane did not propagate as there was no corresponding new row in the primary Customer table on this Day.

    • This change in location will only be reflected once a new row for customer 1 is loaded into the Customer source table.

Day 3 Load

  • Source Tables (Append-Only)

    CUSTOMER

    customer_id first_name last_name email load_timestamp
    1 John Doe john.doe@example.com 2023-01-01 10:00
    2 Jane Smith jane.smith@example.com 2023-01-01 10:00
    1 John Doe jdoe@example.com 2023-01-02 10:00
    3 Alice Green alice.green@example.com 2023-01-02 10:00
    4 Joe Bloggs joe.bloggs@example.com 2023-01-02 10:00

    CUSTOMER_ADDRESS

    customer_id city state load_timestamp
    1 Melbourne VIC 2023-01-01 10:00
    2 Melbourne VIC 2023-01-01 10:00
    4 Hobart TAS 2023-01-01 10:00
    2 Perth WA 2023-01-02 10:00
    3 Sydney NSW 2023-01-02 10:00
    1 Brisbane QLD 2023-01-03 10:00
  • Target Table

    • Append-Only Scenario

      customer_id first_name last_name full_name email city state load_timestamp
      1 John Doe John Doe john.doe@example.com Melbourne VIC 2023-01-01 10:00
      2 Jane Smith Jane Smith jane.smith@example.com Melbourne VIC 2023-01-01 10:00
      1 John Doe John Doe jdoe@example.com Melbourne VIC 2023-01-02 10:00
      2 Jane Smith Jane Smith jane.smith@example.com Perth WA 2023-01-02 10:00
      3 Alice Green alice.green@example.com alice.green@example.com Sydney NSW 2023-01-02 10:00
      4 Joe Bloggs Joe Bloggs joe.bloggs@example.com Hobart TAS 2023-01-02 10:00
    • SCD1 Scenario

      customer_id first_name last_name full_name email city state
      1 John Doe John Doe jdoe@example.com Melbourne VIC
      2 Jane Smith Jane Smith jane.smith@example.com Perth WA
      3 Alice Green alice.green@example.com alice.green@example.com Sydney NSW
      4 Joe Bloggs Joe Bloggs joe.bloggs@example.com Hobart TAS
    • SCD2 Scenario

      customer_id first_name last_name full_name email city state _START_AT _END_AT
      1 John Doe John Doe jdoe@example.com Melbourne VIC 2023-01-02 10:00 NULL
      1 John Doe John Doe john.doe@example.com Melbourne VIC 2023-01-01 10:00 2023-01-02 10:00
      2 Jane Smith Jane Smith jane.smith@example.com Perth WA 2023-01-02 10:00 NULL
      2 Jane Smith Jane Smith jane.smith@example.com Melbourne VIC 2023-01-01 10:00 2023-01-02 10:00
      3 Alice Green Alice Green alice.green@example.com Sydney NSW 2023-01-01 10:00 NULL
      4 Joe Bloggs Joe Bloggs joe.bloggs@example.com Hobart TAS 2023-01-02 10:00 NULL

      Note

      • Customer 1’s change of location to Brisbane did not propagate as there was no corresponding new row in the primary Customer table on this Day.

      • This change in location will only be reflected once a new row for customer 1 is loaded into the Customer source table.

Day 4 Load

  • Source Tables (Append-Only)

    CUSTOMER

    customer_id first_name last_name email load_timestamp
    1 John Doe john.doe@example.com 2023-01-01 10:00
    2 Jane Smith jane.smith@example.com 2023-01-01 10:00
    1 John Doe jdoe@example.com 2023-01-02 10:00
    3 Alice Green alice.green@example.com 2023-01-02 10:00
    4 Joe Bloggs joe.bloggs@example.com 2023-01-02 10:00
    1 John Doe john.doe@another.example.com 2023-01-04 10:00

    CUSTOMER_ADDRESS

    customer_id city state load_timestamp
    1 Melbourne VIC 2023-01-01 10:00
    2 Melbourne VIC 2023-01-01 10:00
    4 Hobart TAS 2023-01-01 10:00
    2 Perth WA 2023-01-02 10:00
    3 Sydney NSW 2023-01-02 10:00
    1 Brisbane QLD 2023-01-03 10:00
  • Target Table

    • Append-Only Scenario

      customer_id first_name last_name full_name email city state load_timestamp
      1 John Doe John Doe john.doe@example.com Melbourne VIC 2023-01-01 10:00
      2 Jane Smith Jane Smith jane.smith@example.com Melbourne VIC 2023-01-01 10:00
      1 John Doe John Doe jdoe@example.com Melbourne VIC 2023-01-02 10:00
      2 Jane Smith Jane Smith jane.smith@example.com Perth WA 2023-01-02 10:00
      3 Alice Green alice.green@example.com alice.green@example.com Sydney NSW 2023-01-02 10:00
      4 Joe Bloggs Joe Bloggs joe.bloggs@example.com Hobart TAS 2023-01-02 10:00
      1 John Doe John Doe john.doe@another.example.com Brisbane QLD 2023-01-04 10:00
    • SCD1 Scenario

      customer_id first_name last_name full_name email city state
      1 John Doe John Doe john.doe@another.example.com Brisbane QLD
      2 Jane Smith Jane Smith jane.smith@example.com Perth WA
      3 Alice Green alice.green@example.com alice.green@example.com Sydney NSW
      4 Joe Bloggs Joe Bloggs joe.bloggs@example.com Hobart TAS
    • SCD2 Scenario

      customer_id first_name last_name full_name email city state _START_AT _END_AT
      1 John Doe John Doe john.doe@another.example.com Brisbane QLD 2023-01-04 10:00 NULL
      1 John Doe John Doe jdoe@example.com Melbourne VIC 2023-01-02 10:00 2023-01-04 10:00
      1 John Doe John Doe john.doe@example.com Melbourne VIC 2023-01-01 10:00 2023-01-02 10:00
      2 Jane Smith Jane Smith jane.smith@example.com Perth WA 2023-01-02 10:00 NULL
      2 Jane Smith Jane Smith jane.smith@example.com Melbourne VIC 2023-01-01 10:00 2023-01-02 10:00
      3 Alice Green Alice Green alice.green@example.com Sydney NSW 2023-01-01 10:00 NULL
      4 Joe Bloggs Joe Bloggs joe.bloggs@example.com Hobart TAS 2023-01-02 10:00 NULL

      Note

      Both the change of address on day 3 and the change of email on day 4 come through on this load.