Azure Synapse SQL Bulk Load

Overview

You can use this Snap to perform a bulk load operation from an external location to the target table. The input data is first written to a temporary data file and is loaded from the data file into the target table.


Azure Synapse SQL Bulk Load Snap Overview

  • This is a Write-type Snap.
  • Does not support Ultra Tasks

Prerequisites

Limitations

When the table name contains single quote characters, the No table found error is displayed. This is because of the JDBC driver limitation.

Known issues

None.

Snap views

Type Description Examples of upstream and downstream Snaps
Input

A document containing the data to be bulk loaded into the table.

Output

Document containing the success message of the bulk load operation.

Learn more about Error handling.

Snap settings

Legend:
  • Expression icon (): Allows using pipeline parameters to set field values dynamically (if enabled). SnapLogic Expressions are not supported. If disabled, you can provide a static value.
  • SnapGPT (): Generates SnapLogic Expressions based on natural language using SnapGPT. Learn more.
  • Suggestion icon (): Populates a list of values dynamically based on your Snap configuration. You can select only one attribute at a time using the icon. Type into the field if it supports a comma-separated list of values.
  • Upload : Uploads files. Learn more.
Learn more about the icons in the Snap settings dialog.
Field/Field set Description
Label

String

Required.Specify a unique name for the Snap.

Default value: Azure Synapse SQL - Bulk Load

Example: Bulk_Load

Schema Name

String/Expression

Specify the database schema name. In case it is not defined, then the suggestion for the table name retrieves all tables names of all schemas. The property is suggestible and retrieves available database schemas during suggest values.

Default value: None

Example: SYS

Table Name

String/Expression

Required.Specify the table name or select the table from the suggestion list to load the incoming data into.

Default value: None

Example: users

Data Source

Dropdown list

Select either of the following sources from where the data must load:

  • Input view: The data is loaded from the upstream Snap.
  • External storage: The data is loaded from an external source.
Important: When the Data Source is Input View:
  • The ROWTERMINATOR is added by default for Copy arguments.
  • The incoming data is written to a temporary file in the Snaplex and then uploaded to the staged location (Azure storage) using the Azure API. After uploading the files to Azure storage, the Snap runs the COPY INTO command to load the data from files to table. The temporary files are deleted after the execution (failed or successful) is completed. Hence, if you are running huge data Pipelines, we recommend you to configure your Snaplex instance disk space accordingly.

Default value: Input View

Example: External Source

Create table if not present

Checkbox

Appears only when you select Input view for Data Source.

Select this checkbox to create the target table if it does not exist. Otherwise, the system displays the "table not found" error.

Important:

If a target table does not exist when the Snap tries to do the bulk load, and if you select this checkbox, the Snap creates the table with the columns and data types. If you want a table to be created with the same schema as a source table, you can include a second input view for this Snap. This view can be used to pass metadata about the table, effectively allowing you to replicate a table from one database to another.

The table metadata document that is read in by the second input view contains a dump of the JDBC DatabaseMetaData class. The document can be manipulated to affect the CREATE TABLE statement that is generated by this Snap. For example, to rename the name column to full_name, you can use a Mapper Snap that sets the path $.columns.Name.COLUMN_NAME to full_name.

The Snap does not automatically fix the errors encountered during table creation, because it may require user intervention to resolve correctly. For example, if the source table contains a column with a type that does not have a direct mapping in the target database, Snap fails to execute. In such a case, add a Mapper Snap to change the metadata document to explicitly set the values required to produce a valid CREATE TABLE statement.

Default value: Deselected

Purge files

Checkbox

Appears only when you select External storage for Data Source.

Select this checkbox if you want to purge the data files automatically from the external storage after the data is loaded successfully.

Warning:
  • The Snap automatically deletes the staged files when Data Source is Input view.
  • Use file pattern and purge files combination cautiously as it might delete the data which is irreversible.

Default value: Deselected

Example: Selected

Table Column Settings

Use this field set to define columns to map the source data to the columns in the target table. The source data can be either from the input view or external storage.

Column Name

String/Expression/ Suggestion

Specify the column name in the target table.

Default value: N/A

Example: $first_name

Default Value

String/Expression

Specify the default value that replaces null value if any in the input file.

Default value: N/A

Example: new

Source Column Position

String/Expression

Appears only when you select External storage for Data Source.

Specify the position of the column for the source file within a row.

Default value: None

Example: Second

Add Quotes

Checkbox

Select this checkbox to enclose default value in quotes.

Default value: Deselected

File List

Appears only when you select External source for Data Source.

File

String/Expression

Appears only when you select External source for Data Source.

Use this fieldset to specify the list of files that must be loaded to the target table. Specify the file to be loaded to the target table.

File Name Pattern

String/Expression

Appears only when you select External source for Data Source.

Specify an expression or string that indicates the absolute path of the file names to match the files in the external location.

File Format Type

Dropdown list

Appears only when you select External source for Data Source.

Choose one of the following file format types to load data into the target table or unload data from the target table:

  • CSV
  • PARQUET
  • ORC
Note: When the file format type is PARQUET or ORC and you specify the FILE_FORMAT in the Copy Arguments fieldset, the Snap overrides the FILE TYPE.
Copy Arguments
With

String/Expression

Use this fieldset to configure the list of arguments that you want the Snap to generate for the copy command. Specify the list of arguments to be used when loading the data. For example, the argument With MAXERRORS=1000 enables the Snap to ignore 1000 record errors and continues with the execution of the Snap and terminates the operation after exceeding the 1000 errors.

When the Data Source is Input View, this Snap suggests few copy arguments that allows you to configure the arguments. The Snap uses default values for the following properties (as shown below) to prepare the external stage file:

  • MAXERRORS=0
  • FIELDQUOTE is quote character (")
  • FIELDTERMINATOR is a (,)
  • ROW TERMINATOR is \r\n
Warning: Changes to the Snap's above-mentioned default values may result in failures with the bulk operation.

Default value: None

Example: ENCODING={'UTF8'|'UTF18'}

Parallel Transfer Options

This property enables faster loading of data.

Max Concurrency

Integer

Required.Specify the maximum number of parallel requests that are to be issued at a given time as part of a single parallel transfer.

Default value: 5

Example: 10

Block Size (MB)

Integer

Required.Specify the size of data that must be chunked to transfer at a time.

Default value: 2

Example: 6

Snap execution

Dropdown list
Choose one of the three modes in which the Snap executes. Available options are:
  • Validate & Execute: Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during pipeline runtime.
  • Execute only: Performs full execution of the Snap during pipeline execution without generating preview data.
  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Default value: Execute only

Example: Validate & execute