Snowflake SCD2

Overview

The Snowflake SCD2 is a Read-type Snap that enables you to execute multiple queries as a single atomic unit.


Snowflake SCD2 Configuration

  • Read-type Snap
  • Works in Ultra Tasks. However, we recommend that you do not use this Snap in an Ultra Pipeline.

Prerequisites

Read and write access to the Snowflake instance.

The target table should have the following three columns for field historization to work:
  • Column to demarcate whether a row is a current row or not. For example, "CURRENT_ROW". For the current row, the value would be true or 1. For the historical row, the value would be false or 0.
  • Column to denote the starting date of the current row. For example, "START_DATE".
  • Column to denote when the row was historized. For example, "END_DATE". For the active row, it is null. For a historical row, it holds the date until which the row was effective.

You must have minimum permissions on the database to execute Snowflake Snaps. To understand if you already have them, you must retrieve the current set of permissions. The following commands enable you to retrieve those permissions:

SHOW GRANTS ON DATABASE <database_name>
SHOW GRANTS ON SCHEMA <schema_name>
SHOW GRANTS TO USER <user_name>

Security Prerequisites

You must have the following permissions in your Snowflake account to execute this Snap:
  • Usage (DB and Schema): Privilege to use the database, role, and schema.
  • Create table: Privilege to create a temporary table within this schema.

Learn more about Snowflake privileges: Access Control Privileges.

Internal SQL Commands

This Snap uses the SELECT command internally. It enables querying the database to retrieve a set of rows.

Known Issues

Because of performance issues, all Snowflake Snaps now ignore the Manage Queued Queries option "Cancel queued queries when pipeline is stopped or if it fails", even when it is selected. The Snaps behave as though the default option, "Continue to execute queued queries when the Pipeline is stopped or if it fails", were selected.

Snap views

View Description Examples of upstream and downstream Snaps
Input A document in the input view should contain a data map of key-value entries. The input data must contain data in the Natural Key (primary key) and Cause-historization fields.
Output A document in the output view contains a data map of key-value entries for all fields of a row in the target Snowflake table.
Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:

  • Stop Pipeline Execution: Stops the current pipeline execution when an error occurs.
  • Discard Error Data and Continue: Ignores the error, discards that record, and continues with the remaining records.
  • Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.

Snap settings

Legend:
  • Expression icon: Allows using JavaScript syntax to access SnapLogic Expressions to set field values dynamically (if enabled). If disabled, you can provide a static value. Learn more.
  • SnapGPT: Generates SnapLogic Expressions based on natural language using SnapGPT. Learn more.
  • Suggestion icon: Populates a list of values dynamically based on your Snap configuration. You can select only one attribute at a time using the icon. Type into the field if it supports a comma-separated list of values.
  • Upload: Uploads files. Learn more.
Learn more about the icons in the Snap settings dialog.
Field / Field set Type Description
Label String Required. Specify a unique name for the Snap. Change it to something more descriptive, especially if the pipeline contains more than one instance of the same Snap.
Schema name String/Expression Required. Specify the database schema name. If it is not defined, the suggestions for Table Name list the tables of all schemas. This property is suggestible and retrieves the available database schemas when you request suggestions.

Default value: N/A

Example: TestSchema

Table Name String/Expression Specify the name of the table in the instance. The table name is suggestible and requires an account setting.
Note: The target table should have the following three columns for field historization to work:
  • Column to demarcate whether a row is a current row or not. For example, "CURRENT_ROW". For the current row, the value would be true or 1. For the historical row, the value would be false or 0.
  • Column to denote the starting date of the current row. For example, "START_DATE".
  • Column to denote when the row was historized. For example, "END_DATE". For the active row, it is null. For a historical row, it holds the date until which the row was effective.
Use the ALTER TABLE command to add these columns to your target table if they are not present.

Default value: N/A

Example: TestTable
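
The note above says to add the three historization columns with ALTER TABLE if they are missing. As a minimal sketch, the following Python helper builds those statements as strings. The column names come from the examples on this page. The data types (BOOLEAN, TIMESTAMP_NTZ) and the helper name build_alter_statements are assumptions for illustration only, so adjust them to your own table design before running the statements in Snowflake.

```python
# Hypothetical helper: builds ALTER TABLE statements for the three SCD2 columns.
# The column types are assumptions; choose types that match your own table design.
SCD2_COLUMNS = {
    "CURRENT_ROW": "BOOLEAN",       # true/1 for the current row, false/0 for historical rows
    "START_DATE": "TIMESTAMP_NTZ",  # when the row became the current row
    "END_DATE": "TIMESTAMP_NTZ",    # null while the row is still current
}


def build_alter_statements(schema, table, existing_columns):
    """Return one ALTER TABLE statement per SCD2 column missing from the table."""
    existing = {name.upper() for name in existing_columns}
    return [
        f"ALTER TABLE {schema}.{table} ADD COLUMN {column} {data_type}"
        for column, data_type in SCD2_COLUMNS.items()
        if column not in existing
    ]


# The table already has a start_date column, so only two statements are generated.
statements = build_alter_statements("TestSchema", "TestTable", ["id", "start_date"])
for statement in statements:
    print(statement)
```

The helper only produces text. Review the statements and run them with whatever Snowflake client you normally use.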

Natural key String/Expression Specify the names of fields that identify a unique row in the target table. The identity key cannot be used as the Natural key, because a current row and its historical rows share the same natural key value but each has a different identity key value.

Default value: N/A

Example: id (Each record has to have a unique value)

Cause-historization fields String/Expression Specify the names of fields where any change in value causes the historization of an existing row and the insertion of a new current row.

Default value: N/A

Example: gold bullion rate

SCD fields The historical and updated information for the Cause-historization field. Click + to add SCD fields. By default, there are four rows in this fieldset:
  • Current row
  • Historical row
  • Start date of the current row
  • End date of historical row
Meaning Dropdown list Specifies the table columns that are to be updated for implementing the SCD2 type transformation.
Default value:
  • Current row
  • Historical row

Example: Historical row

Field String/Expression Specify the fields in the table that will contain the historical information. Below are the values that must be configured for each row:
  • Current row: The name of the column in the target table that holds the flag for the historized field. For example, "CURRENT_ROW".
  • Historical row: The name of the column in the target table that holds the flag for the historized field. It has to be the same as the value configured for the Current row field. For example, "CURRENT_ROW".
  • Start date of current row: The name of the column in the target table for denoting the start date for the current row. For example, "START_DATE".
  • End date of historical row: The name of the column in the target table for denoting the end date for the historical row. For example, "END_DATE".
Note: By default, the start and end date for both Current row and Historical row are null. After the Snap is executed, the start date for the updated row data automatically becomes the end date for the earlier version of the data (Historical row).

Default value: N/A

Example: CURRENT_ROW

Value String/Expression Specify the value to be assigned to the current or historical row. For date-related rows, the default is Date.now(). The Value field should be configured as follows:
  • Current row: 1
  • Historical row: 0
Default value:
  • Current row and Historical row: N/A
  • Start date of current row, and End date of historical row: Date.now()

Example: Historical row
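
To make the SCD fields settings concrete, here is an in-memory illustration of the SCD2 pattern they describe, assuming the example column names CURRENT_ROW, START_DATE, and END_DATE with the values 1 and 0. It is a sketch of the pattern, not the Snap's actual implementation. The function name apply_scd2 and the sample data are hypothetical.

```python
from datetime import datetime


def apply_scd2(rows, incoming, natural_key, cause_fields, now):
    """Return the rows an SCD2 update would write for one incoming record."""
    # Find the current version of the record that has the same natural key.
    current = next(
        (r for r in rows
         if r[natural_key] == incoming[natural_key] and r["CURRENT_ROW"] == 1),
        None,
    )
    new_row = {**incoming, "CURRENT_ROW": 1, "START_DATE": now, "END_DATE": None}
    if current is None:
        return [new_row]  # first version of this natural key
    if all(current[f] == incoming[f] for f in cause_fields):
        return []  # this sketch writes nothing when no cause-historization field changed
    # The start date of the new row becomes the end date of the historized row.
    historized = {**current, "CURRENT_ROW": 0, "END_DATE": now}
    return [historized, new_row]


existing = [{"id": 1, "rate": 100, "CURRENT_ROW": 1,
             "START_DATE": datetime(2024, 1, 1), "END_DATE": None}]
now = datetime(2024, 6, 1)
result = apply_scd2(existing, {"id": 1, "rate": 120}, "id", ["rate"], now)
for row in result:
    print(row)
```

In this sample, the change in rate yields two rows: the earlier version flagged 0 with END_DATE set, and the new version flagged 1 with START_DATE set to the same timestamp.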

Ignore unchanged rows Checkbox Specifies whether the Snap must ignore writing unchanged rows from the source table to the target table. If you enable this option, the Snap generates a corresponding document in the target only if a Cause-historization field in the source row has changed. Otherwise, the Snap does not generate a corresponding document in the target.

Default status: Deselected
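
The decision described for Ignore unchanged rows can be sketched as a small predicate. This is an illustration of the documented behavior only. The function name should_emit and the sample rows are hypothetical, and the sketch assumes that a row counts as changed when any Cause-historization field differs.

```python
def should_emit(current_row, incoming_row, cause_fields, ignore_unchanged):
    """Decide whether an incoming row produces an output document."""
    # A row counts as changed when any cause-historization field differs (assumption).
    changed = any(current_row.get(f) != incoming_row.get(f) for f in cause_fields)
    # With the option enabled, only changed rows produce a document.
    return changed or not ignore_unchanged


current = {"id": 1, "rate": 100}
same = {"id": 1, "rate": 100}
different = {"id": 1, "rate": 120}

print(should_emit(current, same, ["rate"], ignore_unchanged=True))
print(should_emit(current, same, ["rate"], ignore_unchanged=False))
print(should_emit(current, different, ["rate"], ignore_unchanged=True))
```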

Number of retries Integer/Expression Specify the maximum number of retry attempts when the Snap fails to read.

Minimum Value: 0

Default value: 0

Example: 3

Retry interval (seconds) Integer/Expression Specifies the minimum number of seconds the Snap must wait before each retry attempt.

Minimum Value: 1

Default value: 1

Example: 3
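
How Number of retries and Retry interval (seconds) interact can be sketched as a simple loop. This is a generic retry pattern under stated assumptions, not the Snap's internal code: read_with_retries and flaky_read are hypothetical names, and the sketch waits exactly the configured interval, whereas the setting only specifies a minimum wait.

```python
import time


def read_with_retries(read, number_of_retries=0, retry_interval=1, sleep=time.sleep):
    """Call read(); on failure, retry up to number_of_retries times."""
    attempts = 0
    while True:
        try:
            return read()
        except Exception:
            if attempts >= number_of_retries:
                raise  # retries exhausted: surface the last error
            attempts += 1
            sleep(retry_interval)  # wait before the next attempt


calls = {"count": 0}


def flaky_read():
    """Hypothetical read that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "rows"


waits = []  # record the waits instead of sleeping, so the demo runs instantly
result = read_with_retries(flaky_read, number_of_retries=3, retry_interval=3, sleep=waits.append)
print(result, calls["count"], waits)
```

With Number of retries set to 0 (the default), the first failure is raised immediately.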

Auto Historization Query Use this field set to specify the fields that are used to historize table data. Historization follows the sort order specified, so make sure that each field is sortable. You can add multiple fields here; historization occurs when even one of the fields changes.
Field String/Expression Specify the name of the field. This is a suggestible field and suggests all the fields in the target table.
Note: If this field has null values in the incoming records, then the value in the Snowflake table is treated as the current value and the incoming record is historized.

Default value: N/A

Example: Invoice_Number

Sort Order Dropdown list The order in which the selected field is to be historized. Available options are:
  • Ascending Order: The higher value is classified as the current event. For example, date of transaction, age, or height.
  • Descending Order: The lower value is classified as the current event. For example, rank.

Default value: Ascending Order

Example: Descending Order
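
The Sort Order rule, together with the note about null values in incoming records, can be sketched as a comparison function. The function name is_incoming_current is hypothetical. The handling of equal values (treated as not newer) is an assumption, because this page does not describe it, and the sketch assumes that the value already in the table is not null.

```python
def is_incoming_current(existing_value, incoming_value, sort_order="Ascending Order"):
    """Return True if the incoming value should become the current one."""
    if incoming_value is None:
        # Per the note: the table value stays current and the incoming record is historized.
        return False
    if sort_order == "Ascending Order":
        return incoming_value > existing_value  # higher value is current; ties are an assumption
    if sort_order == "Descending Order":
        return incoming_value < existing_value  # lower value is current; ties are an assumption
    raise ValueError(f"Unknown sort order: {sort_order}")


print(is_incoming_current(10, 20))                     # ascending: 20 is higher than 10
print(is_incoming_current(5, 2, "Descending Order"))   # descending: 2 is lower than 5
print(is_incoming_current(10, None))                   # null incoming value
```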

Input Date Format Dropdown list The property has the following two options:
  • Select Continue to execute the snap with the given input Date format if you want the Snap to continue with the current date format. This option is selected by default.
  • Select Auto Convert the format to Snowflake default format if you want the Snap to convert the provided date format to the default Snowflake date format. To learn about the date formats supported by Snowflake, see Snowflake date formats.

Default value: Continue to execute the snap with given input Date format

Example: Auto Convert the format to Snowflake default format

Manage Queued Queries Dropdown list Select an option to determine whether the Snap should continue or cancel the execution of the queued Snowflake Execute SQL queries when you stop the pipeline.
Note: If you select Cancel queued queries when the pipeline is stopped or if it fails, then the read queries under execution are canceled, whereas the write type of queries under execution are not canceled. Snowflake internally determines which queries are safe to be canceled and cancels those queries.

Default value: Continue to execute queued queries when the pipeline is stopped or if it fails

Example: Cancel queued queries when the pipeline is stopped or if it fails

Snap execution Dropdown list Choose one of the three modes in which the Snap executes.
Available options are:
  • Validate & Execute. Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during pipeline runtime.
  • Execute only. Performs full execution of the Snap during pipeline execution without generating preview data.
  • Disabled. Disables the Snap and all Snaps that are downstream from it.