Detect and reconcile changes

In the context of data integration and orchestration, the term schema drift describes a period when the state of the source differs from that of the target. After an initial load, during synchronization, SnapLogic AutoSync detects changes to the source schema. When possible, AutoSync propagates those changes to the destination.

The types of schema drift that AutoSync can reconcile include:

  • Column name changes.
  • Column additions and deletions.
  • For all destinations, some data type changes.
  • For Salesforce sources, changes to the column precision, scale, or character limit when the new value is larger than the old one.
  • For Salesforce sources, the option to include deleted records.

When a table exists in the target, but has been removed from the source, it causes a failure on the next run of the data pipeline. The AutoSync error message provides a Handle missing tables link. Click it to open a dialog from which you can remove the deleted table from the data pipeline. The next run should be successful.


Failed to load error message

How AutoSync reconciles schema drift depends on the type of change and the functionality supported by the endpoints. The following sections provide more details.

Column additions and deletions

After the initial load, AutoSync detects additions and deletions of columns in the source tables:

  • For an added column, AutoSync adds it to the target.
  • For a deleted column, AutoSync retains the column and its values in the target. If a record is later added to the source table, AutoSync adds the row, and inserts a null value in the deleted column.
Tip: To completely remove a deleted column from the target, remove it manually.

Addition and deletion examples

AutoSync detects additions and deletions made to the source since the last load. You or a colleague can also change target data, but AutoSync never updates the source, synchronization is one way. Be cautious with manual changes to the target, because it can have unexpected results. The following scenarios illustrate some of these.

Scenario 1: No change in source, table, or records deleted in target, full load. A user manually removes a table from the target but doesn't remove the corresponding source object or table. The next time the data pipeline runs, AutoSync adds the table back to the target because it's still in the source. The same occurs if a user deletes the records from a table in the target but doesn't delete those records from the source. The next time the data pipeline runs, AutoSync adds the records back because they still exist in the source.

Scenario 2: No change in source, records deleted in target, incremental load. A user deletes some records from a table in the target and doesn't make any changes to the source. If the table in the target still has at least one record, on the next run AutoSync does nothing because there was no change in the source. The deleted records aren't added back to the target.

Scenario 3: Changes to source and target, incremental load. A user adds a record to a table in the source and then deletes a record from the corresponding table in the target. The table in the target still has at least one record. The next time the data pipeline runs, the record added to the source also appears in the target, but the record deleted from the target isn't added back.

Changes to data types

For all destinations, AutoSync can propagate some data type changes from the source. AutoSync does not propagate changes that would result in data loss, such as changing an integer 100 to an integer 95. If the change isn't supported, AutoSync logs a message and skips column updates.

Since destinations don't always support all source data types, AutoSync converts data to a compatible type. To do this, it first converts data types to a common model, then from the common model to the destination data type. A subset of the results of source to target mappings are documented in How AutoSync maps data. Likewise, when detecting a change in a source data type, AutoSync propagates the change using a compatible type in the destination. AutoSync uses the corresponding data type that requires the least amount of storage.

For some data types, the only compatible type is a string. For example, if a date in the source is changed to time AutoSync changes the destination type to string. The values of a date can't be converted to time values. In this example, the string type is compatible with the original and the changed types and takes up the least space.

The following table shows supported type changes for generic data types:

Table 1. Supported data type changes
Source original type Source changed type Notes
Any type string
int int The change must be to a larger size. For example, int(8) to int(32), not int(32) to int(8).
int decimal
float(4) float(8)
char string
date datetime
date timestamp
datetime timestamp
float timestamp
time string
timestamp string
binary string

Using these generic types, you can extrapolate most results in the destination by consulting the documented endpoint-specific mappings. However, if the changed type is not convertible to the documented type, AutoSync converts to a string, which is compatible with the original and the changed data type. For example, the following table shows how AutoSync propagates changes to data types in SQL Server to Snowflake. Note that FLOAT and DECIMAL in SQL Server must be converted to a string and become a VARCHAR type in the destination.

Table 2. SQL Server data type changes propagated to a Snowlake destination
Source original Source changed Destination original Destination changed
SMALLINT INT NUMBER(38, 0) NUMBER(38, 0)
VARCHAR(50) NVARCHAR(100) VARCHAR(50) VARCHAR(100)
DATE DATETIME DATE TIMESTAMP_NTZ
BIGINT FLOAT NUMBER(38, 0) VARCHAR(100)
FLOAT DECIMAL(8, 4) FLOAT VARCHAR(MAX)
REAL NUMERIC(10, 5) FLOAT VARCHAR(MAX)
TEXT NVARCHAR(MAX) VARCHAR(100) VARCHAR(MAX)
TIME(3) TIME TIME TIME

Salesforce option to include deleted records

Salesforce objects have an isDeleted field that indicates whether a record has been moved to the recycle bin for deletion. Eventually, Salesforce empties the recycle bin and permanently deletes the records. You might want to have records that have been marked for deletion loaded to the target to keep a history of deletions.

To add records marked for deletion to the target, AutoSync configuration gives you the option to Include deleted records:

  • If unchecked, AutoSync only loads records with a value of False in the isDeleted field.
  • If checked, AutoSync loads all records, including those with a value of True in the isDeleted field.
Tip: To prevent deleted records from permanently taking up space in the target, you can create a script to remove and archive them.

The following table summarizes the behavior of the Include deleted records option:

State of the Include deleted records option First full load After synchronization
Unchecked No deleted records loaded to target No deleted records loaded to target
Checked Deleted records loaded to target Records deleted since last load loaded to target
Checked and run at least once, then unchecked Deleted records loaded to target And after unchecking the option, records deleted since last load are not loaded to target
Unchecked and run at least once, then checked No deleted records loaded to target And after checking the option, records deleted since last load are loaded to target
Tip: To change a data pipeline to include deleted records and include all marked for deletion, not just those deleted since the last synchronization, you can drop the target table in the destination. The next time AutoSync synchronizes, it does a full reload.

Learn more about Salesforce connection configuration.

Changes to column scale. precision, or character limit in Salesforce

The following Salesforce usage is prerequisite to understanding AutoSync reconciliation:

  • Precision is the total digits allowed in a numeric field.
  • Scale is the number of digits to the right of the decimal point in a numeric field.
  • Text fields have a character limit and Salesforce uses 3 bytes to store each character. The corresponding character limit in the target will be 3 times the Salesforce limit.

Data pipelines with Salesforce as a source and Amazon Redshift, Snowflake, or Google BigQuery as a target can reconcile increases to scale or precision for numeric fields and increases to character limits for text fields. AutoSync cannot reconcile reductions in scale, precision, or character length because it cannot transform existing data in the target to have fewer digits, fewer decimal places, or characters.

If you increase the scale, precision, or character limit of a field in Salesforce, AutoSync will modify the corresponding target column. Existing data in the target remains the same. AutoSync synchronizes new or changed data using the updated constraints.

Important: AutoSync can only modify the scale, precision, or character limit to the largest size supported by the target. If a constraint in Salesforce is raised to a larger value than the target supports, AutoSync increases the target constraint to the maximum size. Be aware that this can result in data loss because AutoSync must truncate data that exceeds the size of the target constraint.

The following examples show how AutoSync modifies the target after changes to the Salesforce constraints on text and numeric fields:

Original field Modified in source AutoSync modifies target
string 20 character limit string 50 character limit The target type that corresponds to string with a 150 character limit. For example, in Snowflake VARCHAR(150)
integer 10 digit limit, no decimal decimal 14 digit limit, 2 decimal places The target type that corresponds to a numeric type with a 14 digit limit, 2 decimal places. For example, in Snowflake, number(14,2)