Excel Parser

This Snap reads binary data in XLS and XLSX formats from the specified sheet and provides the output document data.
Note: This Snap supports XLS version 97 and above.

Limitations

  • Does not read password-protected Excel files and fails with the error: Failure: Failed to parse input data.
  • Does not support Excel 5.0/95 format XLS files and fails with the error: Input was neither an XLS stream nor an XLSX stream.
  • Does not support files greater than 4GB or files that have a high compression ratio (you may see an error that says "Zip bomb detected!". A high compression ratio is likely to happen when workbooks contain formatted cells with no values or excessive unused custom styles. Consider removing empty cells or removing unused custom styles and saving the workbook again. Learn more about inspecting and deleting a custom style. If you continue to have issues, try saving the Excel worksheet as a CSV file.

Snap views

View Description Examples of upstream and downstream Snaps
Input This Snap has exactly one binary input view, where it gets the Excel binary data to be parsed. File Reader
Output This Snap has exactly one document output view, where it provides the Excel document data stream. Mapper
Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:

  • Stop Pipeline Execution Stops the current pipeline execution when an error occurs.
  • Discard Error Data and Continue Ignores the error, discards that record, and continues with the remaining records.
  • Route Error Data to Error View Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.

Snap settings

Legend:
  • Expression icon (): Allows using pipeline parameters to set field values dynamically (if enabled). SnapLogic Expressions are not supported. If disabled, you can provide a static value.
  • SnapGPT (): Generates SnapLogic Expressions based on natural language using SnapGPT. Learn more.
  • Suggestion icon (): Populates a list of values dynamically based on your Snap configuration. You can select only one attribute at a time using the icon. Type into the field if it supports a comma-separated list of values.
  • Upload : Uploads files. Learn more.
Learn more about the icons in the Snap settings dialog.
Field / Field set Type Description
Label String

Required. Specify a unique name for the Snap. Modify this to be more appropriate, especially if more than one of the same Snaps is in the pipeline.

Default value: Excel Parser

Example: Excel Parser
Sheet index String/Expression The name of the sheet that must be parsed.
Note:
  • It can use pipeline parameters or it can be hardcoded. It cannot use fields from the upstream Snap.
  • It overwrites the Sheet index field.
  • If the name of the sheet is a number (for example, 2018) use the Sheet name field.

Default value: 0

Example: 0 or Sheet1
Sheet name String/Expression Required. The name of the sheet that must be parsed.
  • It can use pipeline parameters or it can be hardcoded. It cannot use fields from the upstream Snap.
  • It overwrites the Sheet index field.

Default value: N/A

Example: PriceList
Start row String/Expression Required. Specifies the row number to start reading from the spreadsheet.
Note: This cannot be set to 1 if the Contains headers property is selected.

Default value: N/A

Example: 1
End row String/Expression Required. The row number to stop reading in the spreadsheet.
  • If selected, the Snap converts the value to a numeric-type cell.
  • If not selected, the Snap sets the value to a string-type cell.

Default value: N/A

Example: 100
Contains headers Checkbox This option specifies whether the input data contains headers. If this option is not selected, the Snap uses the Column names property for the header information when parsing Excel data.

Default status: deselected

Header row String/Expression
Specifies the row number in which headers are located in the input data. This property is ignored if Contains headers is false.
  • If Contains headers is true and Start row is smaller than or equals to Header row, the Start row is reset to Header row+1.

  • If Header row is empty, it is regarded as a default value of 1.

Default value: 1

Example: 10
Evaluate formulas Checkbox

When selected, the Snap checks if there are cell formulas to be evaluated and displays the results instead of the raw formulas.

Default status: Deselected

Round dates Checkbox

Select the checkbox to round numeric Excel date values to the closest second. You can enable the option when you have numeric Excel date values that have milliseconds and want to round off the milliseconds to the closest second.

For example, 1 Jan 2020 14:29:29.600 is rounded to 1 Jan 2020 14:29:30. 1 Jan 2020 14:29:59.700 is rounded to 1 Jan 2020 14:30:00.

Note: This Snap supports Internal Excel date format or date formatting characters, for example, the default date format for US dates is “m/d/yyyy” (1/27/2016). It does not support the date format with timezone such as UTC at the end, for example, 2020-09-15 00:00:00 UTC.

Default status: Not selected

Custom Locale String/Expression

Select a user-defined locale that the Snap should use to format numbers, instead of the default locale.

For example, if the default locale that is configured in the Snaplex uses "," (comma) to separate the integer and fractional parts, you can select any of the user-defined locale that uses "." (dot) to format numbers.

If you leave this field empty, then the default locale that is configured on the Snaplex is used. Click the Suggestion icon to view all the available locales on the Snaplex.

Default value: None

Example: English (Jamaica)
Cell formatting Dropdown list

Lets you determine whether to format numeral outputs. Valid options are:

  • Formatted: Determines the cell format used for numeric types. For example, Snap converts real numbers to two decimal places when formatted for currency.
  • Unformatted: Supports the unformatted output of numerals.

Default value: Formatted

Column names String/Expression Conditional. Specify a header to map the columns in the Excel sheet. If the Contains header property is not selected and column names are not set, the column names are taken from the Excel data, such as A and B.

Default value: N/A

Example: Item, Name, Description
Insert null columns Checkbox Required. This option specifies whether to insert nulls when rows have missing column data.

Default status: Unselected

Number of blank headers after the last header String/Expression Specify the number of additional headers that must be added.
Note: The Insert null columns property must be selected for this property to work.

Default value: 0

Snap execution Dropdown list
Choose one of the three modes in which the Snap executes. Available options are:
  • Validate & Execute: Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during pipeline runtime.
  • Execute only: Performs full execution of the Snap during pipeline execution without generating preview data.
  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Default value: Validate and Execute

Example: Execute only

Temporary Files

During execution, data processing on Snaplex nodes occurs principally in-memory as streaming and is unencrypted. When larger datasets are processed that exceeds the available compute memory, the Snap writes Pipeline data to local storage as unencrypted to optimize the performance. These temporary files are deleted when the Snap/Pipeline execution completes. You can configure the temporary data's location in the Global properties table of the Snaplex's node properties, which can also help avoid Pipeline errors due to the unavailability of space. For more information, see Configuration Options.