CSV Parser

Overview

You can use this Snap to read CSV binary data from its input view, parse it, and then write it to its output view as CSV document data.


Prerequisites

None.

Snap views

View Description Examples of upstream and downstream Snaps
Input This Snap has at most two binary input views, where it gets the CSV binary data to be parsed.

If there are two input views, the Snap reads the CSV binary data to be parsed from the first input view and the CSV metadata from the second input view. The metadata must be in CSV format with two lines: the first line is the CSV header, and the second line lists the data types. Supported data types are 'string', 'integer', 'float', and 'boolean'. The default data type is 'string', so empty data type fields are treated as 'string'.

An example of CSV metadata is:

Last name,First name,age,commute_km,isDriving

string, ,integer,float,boolean
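
The following Python sketch illustrates how two-line metadata like the example above could be interpreted. It is not the Snap's implementation: the helper names `parse_metadata` and `convert_row` are hypothetical, and the rule that only the text "true" maps to a true boolean is an assumption made for illustration.

```python
import csv
import io


def parse_metadata(metadata_text):
    """Return (column name, data type) pairs from two-line CSV metadata."""
    rows = list(csv.reader(io.StringIO(metadata_text)))
    header, types = rows[0], rows[1]
    # An empty (or whitespace-only) type field falls back to 'string'.
    return [(name, dtype.strip() or "string") for name, dtype in zip(header, types)]


# Assumed conversions for the four supported data types.
CONVERTERS = {
    "string": str,
    "integer": int,
    "float": float,
    "boolean": lambda value: value.strip().lower() == "true",
}


def convert_row(row, columns):
    """Convert one parsed CSV row into a typed record."""
    return {
        name: CONVERTERS[dtype](value)
        for (name, dtype), value in zip(columns, row)
    }


metadata = (
    "Last name,First name,age,commute_km,isDriving\n"
    "string, ,integer,float,boolean\n"
)
columns = parse_metadata(metadata)
record = convert_row(["Doe", "Jane", "34", "12.5", "true"], columns)
print(columns)
print(record)
```

Here the blank type for "First name" resolves to 'string', matching the default described above.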

Example of an upstream Snap: File Reader
Output This Snap has exactly one document output view, where it provides the CSV document data stream.
Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:

  • Stop Pipeline Execution Stops the current pipeline execution when an error occurs.
  • Discard Error Data and Continue Ignores the error, discards that record, and continues with the remaining records.
  • Route Error Data to Error View Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.

Snap settings

Legend:
  • Expression icon: Allows using pipeline parameters to set field values dynamically (if enabled). SnapLogic Expressions are not supported. If disabled, you can provide a static value.
  • SnapGPT: Generates SnapLogic Expressions based on natural language using SnapGPT.
  • Suggestion icon: Populates a list of values dynamically based on your Snap configuration. You can select only one attribute at a time using the icon. Type into the field if it supports a comma-separated list of values.
  • Upload: Uploads files.
Learn more about the icons in the Snap settings dialog.
Field / Field set Type Description
Label String

Required. Specify a unique name for the Snap. Modify this to be more appropriate, especially if more than one of the same Snaps is in the pipeline.

Default value: CSV Parser

Example: CSV_Parser_temp
Quote character String
Specify the character to be used for a quote. As of 4.3.2, this property can be an expression, which is evaluated with the values from the Pipeline parameters.
Note: Only a single character is allowed as a quote character.

Default value: "

Example: "
Delimiter String/Expression

Specify the string or the character to be used as a delimiter in formatting the delimited data. Any combination of characters may be used, adhering to the following guidelines.

The input must be submitted with any control characters escaped. For example, \t (tab), \n (new line), or \\ (single backslash) must be escaped accordingly. Unicode characters should be specified using the Unicode escape sequence \uXXXX, where each X represents a hexadecimal digit (0-9, a-f) with all four hexadecimal digits defined.
Important: When using a single backslash (\) as a delimiter, it does not need to be escaped (\\). However, if you are using a multi-character delimiter that contains one or more backslashes (\), you must escape all backslashes (\\).

Default value: ,

Example:
  • \t
  • \u0001
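
As a rough illustration of what the escaped forms above denote, the Python sketch below turns an escaped delimiter string into the actual characters. The function `decode_delimiter` is hypothetical and only mirrors the guidelines in this section; it is not how the Snap itself processes the field.

```python
def decode_delimiter(raw):
    """Translate an escaped delimiter setting into the literal delimiter."""
    # A lone backslash is accepted as-is and does not need escaping.
    if raw == "\\":
        return raw
    # Otherwise interpret escapes such as \t, \n, \\ and \uXXXX.
    # This sketch assumes the setting itself is typed in ASCII.
    return raw.encode("ascii").decode("unicode_escape")


print(repr(decode_delimiter("\\t")))      # tab character
print(repr(decode_delimiter("\\u0001")))  # U+0001 control character
print(repr(decode_delimiter("\\\\|")))    # multi-character delimiter: backslash + pipe
```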
Escape Character String

Specify the escape character that is to be used when parsing rows. Only single characters are supported. As of 4.3.2, this property can be an expression, which is evaluated with the values from the pipeline parameters. Leave this property empty if no escape character is used in the input CSV data.

Default value: \

Example: \
Skip lines Integer/Expression Required. Specify the number of lines that are to be skipped in the input data before the Snap starts parsing it. This example explains how to skip lines.

Default value: 0

Example: 5
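
The effect of skipping lines can be sketched in Python as follows. This assumes the setting counts physical lines at the start of the input; the function name `read_after_skipping` and the sample data are made up for illustration.

```python
import csv
import io
import itertools


def read_after_skipping(text, skip_lines):
    """Drop the first skip_lines lines, then parse the rest as CSV with a header."""
    lines = io.StringIO(text)
    remaining = itertools.islice(lines, skip_lines, None)
    return list(csv.DictReader(remaining))


raw = "Exported report\n\nname,age\nAda,36\n"
rows = read_after_skipping(raw, 2)
print(rows)
```

With a skip count of 2, the title line and the blank line are ignored, and parsing starts at the header row.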

Contains header Checkbox

Select this checkbox if the input data contains a CSV header.

Default status: Selected

Column names Use this field set to specify the column header names, which is a composite table property.
Note: This property is ignored if the second input view is used for the CSV metadata.
Important: You must either select Contains header or specify a Column name in order for validation on the pipeline to work.
Header String

Specify the headers to be used as the CSV header when you deselect the Contains header property.

Example:
  • Last name
  • First name
  • Street
  • City
  • State
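
As an analogy for supplying column names when the data has no header row, the Python standard library's `csv.DictReader` accepts an explicit `fieldnames` list. The sample row below is invented for illustration.

```python
import csv
import io

# Column names supplied by configuration because the data has no header row.
column_names = ["Last name", "First name", "Street", "City", "State"]

data = "Doe,Jane,1 Main St,Springfield,IL\n"
rows = list(csv.DictReader(io.StringIO(data), fieldnames=column_names))
print(rows[0])
```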
Validate headers Checkbox Select this checkbox to validate the headers in the input data against the Column names table property. If this option is selected, the Snap throws an exception when they do not match exactly.

Default status: Deselected

Header size error policy Dropdown list Select an option to define how the Snap handles records whose number of values does not match the number of header columns in the CSV file. This error condition occurs when a record has fewer or more values than there are header columns. The available options are:
  • Trim record to fit header: If a record has more values than header columns, the Snap trims the record to match the header columns and sends it to the output view. If a record has fewer values than header columns, the Snap sends the values as-is to the output view with blank spaces.
  • Fail if record is larger than header: The Snap sends the document to the output view if the number of values in the record matches the header columns. Otherwise, when the record has either more or fewer values than the header columns, the Snap writes it to the error view.
  • Both: The Snap sends the trimmed records to the output view and also sends the records that have more or fewer values than the header columns to the error view.

Default value: Both

Example: Trim record to fit header
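
The three policies can be modeled roughly as in the Python sketch below. It is an interpretation of the descriptions above, not the Snap's code: the policy keys ("trim", "fail", "both"), the function `apply_policy`, the shape of the error record, and the choice to fill missing values with empty strings are all assumptions.

```python
def apply_policy(header, values, policy):
    """Return (output_record, error_record); either may be None."""
    n = len(header)
    matches = len(values) == n
    # Drop extra values and pad missing ones with empty strings.
    padded = list(values[:n]) + [""] * (n - len(values))
    fitted = dict(zip(header, padded))
    error = None if matches else {
        "reason": "column count mismatch",
        "values": list(values),
    }

    if policy == "trim":
        # Always emit the fitted record; never report an error.
        return fitted, None
    if policy == "fail":
        # Emit only matching records; route mismatches to the error view.
        return (fitted, None) if matches else (None, error)
    if policy == "both":
        # Emit the fitted record and also report the mismatch.
        return fitted, error
    raise ValueError(f"unknown policy: {policy}")


header = ["a", "b"]
for policy in ("trim", "fail", "both"):
    print(policy, apply_policy(header, ["1", "2", "3"], policy))
```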

Character set Dropdown list Select an option to specify the character set in which input CSV data is encoded. The available options are:
  • Auto BOM detect: The Snap attempts to detect a BOM (Byte Order Mark) in the input CSV data. If no BOM is found, the Java runtime's default character set is used.
  • UTF-8
  • UTF-16LE
  • UTF-16BE
  • ISO-LATIN-1: This character set is also called ISO-8859-1 and generally intended for Western European languages.

Default value: Auto BOM detect

Example: UTF-8
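
To show what BOM detection involves, here is a minimal Python sketch. It checks only the UTF-8 and UTF-16 byte order marks, and it uses the platform's preferred encoding as a stand-in for the Java runtime's default character set; the function `decode_with_bom_detection` and its `fallback` parameter are hypothetical.

```python
import codecs
import locale

# Byte order marks for the encodings listed above.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_with_bom_detection(data, fallback=None):
    """Decode bytes using the BOM if present, else a fallback character set."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            # Strip the BOM before decoding the remaining bytes.
            return data[len(bom):].decode(encoding)
    # No BOM found: fall back to a default character set.
    return data.decode(fallback or locale.getpreferredencoding(False))


sample = codecs.BOM_UTF16_LE + "name,age\n".encode("utf-16-le")
print(repr(decode_with_bom_detection(sample)))
```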

Ignore empty data Checkbox Select this checkbox to send the document to error view if the input is empty.

If you do not select this checkbox, the Snap produces an empty output document when the input CSV data is empty (either an empty binary stream or a binary stream with CSV headers only). This can be useful if the downstream Snaps must execute whether or not the input CSV data is empty.

Default status: Selected

Preserve Surrounding Spaces Checkbox Select this checkbox to preserve the surrounding spaces for the values that are non-quoted.
  • If you enable the expression icon when the checkbox is selected, then the value of this setting is set to true.
  • If you enable the expression icon when the checkbox is not selected, then the value of this setting is set to false.
Note: This setting is applicable only for unquoted data.

For example, if you are using data with a delimiter as follows:

NAME|AGE|GENDER

AA|12| F

BB| 23| M

If you deselect this checkbox, the spaces surrounding the unquoted values (here, the spaces before F, 23, and M) are removed.

Default status: Deselected
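
The difference between the two states of this checkbox can be sketched in Python for unquoted, pipe-delimited data like the sample above. The function `split_row` is hypothetical and handles only unquoted values.

```python
def split_row(line, delimiter="|", preserve_spaces=False):
    """Split one unquoted row, optionally keeping surrounding spaces."""
    fields = line.split(delimiter)
    if preserve_spaces:
        return fields
    # Remove leading and trailing spaces from every value.
    return [field.strip() for field in fields]


row = "BB| 23| M"
print(split_row(row, preserve_spaces=True))
print(split_row(row, preserve_spaces=False))
```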

Snap execution Dropdown list
Choose one of the three modes in which the Snap executes. Available options are:
  • Validate & Execute: Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during pipeline runtime.
  • Execute only: Performs full execution of the Snap during pipeline execution without generating preview data.
  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Default value: Validate & Execute

Example: Execute only

Examples