PostgreSQL Bulk Load

Overview

Use this Snap to bulk load data into PostgreSQL tables. If the target table does not exist in PostgreSQL, you also have the provision to create the table. This Snap supports both binary and document inputs. You can configure the input view in the Views tab.



  • This is a Write-type Snap.
  • Does not support Ultra Tasks

Prerequisites

None.

Support and Limitations

  • If the target table does not exist and the input is binary, the Snap does not fetch the table's schema from the input. The table's schema must be passed to the Snap through the second input view.
  • Known Issue: If database metadata from an upstream Snap contains geography column data such as modifiers, those modifiers may not be written to the target table.

    Workaround: To write geographic data to the target table, create the table using the PostgreSQL Execute Snap.

Snap views

Type Description Examples of upstream and downstream Snaps
Input The data to be loaded into the PostgreSQL table. This Snap supports a maximum of two input streams which is useful in sending table schema to the Snap if the target table does not exist. You can mix and match binary and document input views based on the input type.
Note: If the schema is from a different database, there is no guarantee that all the data types will be properly handled.
Output The Snap outputs one document specifying the status with the records count that are being inserted into the table.
Learn more about Error handling.

Snap settings

Note: Learn about the common controls in the Snap settings dialog.
Field/Field set Type Description
Label

String

Required. Specify a unique name for the Snap. Modify this to be more appropriate, especially if more than one of the same Snaps is in the pipeline.

Default value: PostgreSQL Bulk Load

Example: Load Employee Data
Schema name

String/Expression/ Suggestion

Optional. Specify the database schema name in PostgreSQL. In case it is not defined, then the suggestion for the Table name retrieves all table names of all schemas. The property is suggestible and will retrieve available database schemas during suggest values.

Default value: N/A

Example: Employee
Table name

String/Expression/ Suggestion

Required. Specify the PostgreSQL table on which to execute the bulk load operation.

Default value: N/A

Example: Employee.PersonnelInformation
Create table if not present

Checkbox

Select this checkbox to automatically create the target table if it does not exist.
  • If a second input view is configured for the Snap and it contains a document with schema (metadata) from the source table, the Snap creates the new (target) table using the same schema (metadata). However, if the schema comes from a different database, the Snap might fail with the Unable to create table: "<table_name>" error due to data type incompatibility.
  • In the absence of a second input view, the Snap creates a table based on the data types of the columns generated from the first row of the input document (first input view).
  • Due to implementation details, a newly created table is not visible to subsequent database Snaps during runtime validation. If you want to immediately use the newly updated data you must use a child Pipeline that is invoked through a Pipeline Execute Snap.
  • If database metadata from an upstream Snap contains geography column data such as modifiers, those modifiers may not be written to the target table. For example, if the column metadata contains a polygon modifier, that modifier will not be written to the target table.

    Workaround: Create the table using the PostgreSQL Execute Snap.

Default status: Deselected

Columns Optional. Specify the columns to be loaded/copied. If you do not configure this field set, then all columns in the input data stream are loaded into the target table. You can also set how the null values should be handled for each column.
Note:
  • Column order is critical when using the binary input view.
  • Force Non-Null and Force Null can both be selected simultaneously. In such cases, the Snap converts quoted null strings to null values and unquoted null strings to empty strings.
Column

String/Suggestion

Specify the name of the column/field. This is a suggestible field and lists all field names in the input.

Default value: N/A

Example: emp_name
Force Non-Null

Checkbox

Select this checkbox to not match the selected column's value against the null string. The Snap reads empty null strings as zero-length strings rather than nulls, even when they are not quoted.

Default status: Deselected

Force Null

Checkbox

Select this checkbox to match the selected column value against the null string, even if it has been quoted. If a match is found, the field value is set to NULL.

By default, where the null string is empty, the Snap converts a quoted empty string into NULL.

Default status: Deselected

Format

Dropdown list

Choose the data format to be written. This field is applicable only when using the binary input view. Available options:
  • Text - tab-separated-values (default)
  • CSV - comma-separated-values
  • Binary - PostgreSQL specific format

Default value: Text

Example: CSV
Header Provided

Checkbox

Select this checkbox to include the input data has a header. Applicable only when using binary input view and CSV option in the Format field.

Default status: Deselected

Encoding

Dropdown list

Choose the encoding to be used. This is limited to the encodings supported by the PostgreSQL server. The available options are:
  • ABC
  • ALT
  • BIG5
  • EUC_CN
  • EUC_JIS_2004
  • EUC_JP
  • EUC_KR
  • EUC_TW
  • GB18030
  • GBK
  • ISO88591
  • ISO885910
  • ISO885913
  • ISO885914
  • ISO885915
  • ISO885916
  • ISO88592
  • ISO88593
  • ISO88594
  • ISO88599
  • ISO_8859_5
  • ISO_8859_6
  • ISO_8859_7
  • ISO_8859_8
  • JOHAB
  • KOI8
  • KOI8R
  • KOI8U
  • LATIN1
  • LATIN2
  • LATIN3
  • LATIN4
  • LATIN5
  • LATIN7
  • LATIN9
  • Mskanji
  • SHIFT_JIS_2004
  • SJIS
  • SQL_ASCII
  • ShiftJIS
  • TCVN
  • TCVN5712
  • UHC
  • UTF8
  • Unicode
  • VSCII
  • WIN
  • WIN1250
  • WIN1251
  • WIN1252
  • WIN1253
  • WIN1254
  • WIN1255
  • WIN1256
  • WIN1257
  • WIN1258
  • WIN866
  • WIN874
  • WIN932
  • WIN936
  • WIN949
  • WIN950
  • Windows932
  • Windows936
  • Windows949
  • Windows950

Default value: UTF8

Example: Unicode
Snap Execution

Dropdown list

Choose one of the three modes in which the Snap executes. Available options are:

  • Validate & Execute: Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during pipeline runtime.
  • Execute only: Performs full execution of the Snap during pipeline execution without generating preview data.
  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Default value: Execute only

Example: Validate & Execute

Temporary files

During execution, data processing on Snaplex nodes occurs principally in-memory as streaming and is unencrypted. When processing larger datasets that exceed the available compute memory, the Snap writes unencrypted pipeline data to local storage to optimize the performance. These temporary files are deleted when the pipeline execution completes. You can configure the temporary data's location in the Global properties table of the Snaplex node properties, which can also help avoid pipeline errors because of the unavailability of space. Learn more about Temporary Folder in Configuration Options.