XML Parser

Overview

You can use this Snap to parse incoming XML data into SnapLogic document objects. The supported schema language is W3C XML Schema 1.0.

Parse-type Snap
Works in Ultra Tasks

Limitations

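The XML Parser Snap does not support mixed content, that is, elements that contain text alongside child elements (and possibly attributes). For example, the following XML data is not supported because the letter element mixes text with child elements: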

<letter>
       Dear Mr. <name>John Smith</name>.
       Your order <orderid>1032</orderid>
       will be shipped on <shipdate>2001-07-13</shipdate>.
</letter>
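
To check an input for mixed content before sending it to the Snap, a small pre-check can help. The following is a minimal sketch using Python's standard library; the function name `has_mixed_content` is hypothetical, and the check only approximates the limitation described above (it is not the Snap's own logic).

```python
import xml.etree.ElementTree as ET


def has_mixed_content(xml_text):
    """Return True if any element has child elements and also non-whitespace text."""
    root = ET.fromstring(xml_text)
    for node in root.iter():
        children = list(node)
        if not children:
            continue  # leaf elements may freely contain text
        # Text before the first child is stored on the parent's .text
        if (node.text or "").strip():
            return True
        # Text after each child is stored on that child's .tail
        if any((child.tail or "").strip() for child in children):
            return True
    return False


mixed = "<letter>Dear Mr. <name>John Smith</name>.</letter>"
clean = "<letter> <name>John Smith</name> <orderid>1032</orderid> </letter>"
print(has_mixed_content(mixed), has_mixed_content(clean))
```

Whitespace between elements is ignored, so indented XML without inline text is not flagged.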

Snap views


View	Description	Examples of upstream and downstream Snaps
Input	The input must be XML-formatted data in binary form that is UTF-8 encoded (non-UTF-8 encoded data may result in errors). It must also be properly structured XML without mixed-content elements for the XML Parser to process it correctly. Example of valid input: `<letter> <name>John Smith</name> <orderid>1032</orderid> <shipdate>2001-07-13</shipdate> </letter>`	File Reader, XSLT
Output	Each XML element is converted into a corresponding field in the output document. The output maintains the hierarchical structure of the original XML. It can be processed by any downstream Snap that accepts document input.	XML Generator
Error	Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab. Stop Pipeline Execution: Stops the current pipeline execution when an error occurs. Discard Error Data and Continue: Ignores the error, discards that record, and continues with the remaining records. Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution. Learn more about Error handling in Pipelines.
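
The element-to-field mapping described for the output view can be illustrated with a rough sketch. This is not the Snap's implementation: the function names are hypothetical, attributes are ignored, and the way repeated elements are collected into a list is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET


def element_to_value(element):
    """Convert an element into a string (leaf) or a nested dict (has children)."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = element_to_value(child)
        if child.tag not in result:
            result[child.tag] = value
        elif isinstance(result[child.tag], list):
            result[child.tag].append(value)
        else:
            # Assumption: repeated sibling elements become a list
            result[child.tag] = [result[child.tag], value]
    return result


def xml_to_document(xml_bytes):
    """Parse UTF-8 encoded XML bytes into a nested dict keyed by the root tag."""
    root = ET.fromstring(xml_bytes)
    return {root.tag: element_to_value(root)}


data = b"<letter><name>John Smith</name><orderid>1032</orderid></letter>"
print(xml_to_document(data))
```

All leaf values stay strings in this sketch, which matches the idea that type conversion is a separate, schema-driven step.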

Snap settings

Legend:

Expression icon: Allows using pipeline parameters to set field values dynamically (if enabled). SnapLogic Expressions are not supported. If disabled, you can provide a static value.
SnapGPT: Generates SnapLogic Expressions based on natural language using SnapGPT. Learn more.
Suggestion icon: Populates a list of values dynamically based on your Snap configuration. You can select only one attribute at a time using the icon. Type into the field if it supports a comma-separated list of values.
Upload: Uploads files. Learn more.

Learn more about the icons in the Snap settings dialog.


Field / Field set	Type	Description
Label	String	Required. Specify a unique name for the Snap. Modify this to be more appropriate, especially if more than one of the same Snaps is in the pipeline. Default value: XML Parser Example: XML Parser
Inbound schema	String/Expression	URL of the XSD schema definition file for the incoming data. The currently supported URL protocols are SLDB, HDFS, and S3. Important: If you enter an Inbound schema, you must select the Validate XML and Match data types properties to derive the output as per the defined schema. Default value: None Example: sldb:///foo/bar/customer.xsd
Validate XML	Checkbox	Required. Appears when you enable expression for Inbound schema. If selected, the incoming data is validated against the provided XSD schema definition. Note: If you enter an Inbound schema, you must select the Validate XML and Match data types properties to derive the output as per the defined schema. Default status: Deselected
Match data types	Checkbox	Select this checkbox to convert the output document data types to the data types specified in the Inbound schema property. Note: The only supported XSD design is Russian Doll. Salami Slice, Venetian Blind, and Garden of Eden design XSD files are not supported. Supported data types: xs:string, xs:int, xs:integer, xs:long, xs:short, xs:byte, xs:float, xs:double, xs:decimal, and xs:boolean. If you enter an Inbound schema, you must select the Validate XML and Match data types properties to derive the output as per the defined schema. Default status: Deselected
Splitter	String	Specify the XPath expression used to split the incoming XML document into multiple smaller documents. Note: This expression must be of the form `a/b/c/d` or `ns1:a/ns2:b/ns3:c/ns4:d`, where the prefixes ns1 to ns4 can be the same or different. Learn more. Default value: None. Example: d:catalog/d:book
Namespace Context	Optional. Namespace context for the expression provided in the Splitter property. Namespaces are typically defined in the format of xmlns prefix:URI
Prefix	String	Prefixes included in the expression provided in the Splitter property.
URI	String	URIs associated with the prefixes.
Optimization	Dropdown list	Select the parameter that you want to optimize during Snap execution. Available options: None: Continues with standard memory consumption and speed. Memory: Leads to lower memory consumption and slower execution. Speed: Leads to higher memory consumption and faster execution. Default value: None.
Snap execution	Dropdown list	Choose one of the three modes in which the Snap executes. Available options are: Validate & Execute: Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during pipeline runtime. Execute only: Performs full execution of the Snap during pipeline execution without generating preview data. Disabled: Disables the Snap and all Snaps that are downstream from it. Default value: Execute only Example: Validate & Execute
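
To illustrate what converting values to the data types listed under Match data types might look like, here is a minimal sketch. The `CONVERTERS` table and the `convert` function are hypothetical and do not reflect how the Snap performs the conversion; range checks for types such as xs:byte or xs:short are omitted.

```python
from decimal import Decimal


def parse_boolean(text):
    """Parse the XSD boolean lexical forms: true, false, 1, 0."""
    value = text.strip()
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    raise ValueError(f"not a valid xs:boolean: {text!r}")


# Hypothetical mapping from the supported XSD types to Python converters
CONVERTERS = {
    "xs:string": str,
    "xs:int": int,
    "xs:integer": int,
    "xs:long": int,
    "xs:short": int,
    "xs:byte": int,
    "xs:float": float,
    "xs:double": float,
    "xs:decimal": Decimal,
    "xs:boolean": parse_boolean,
}


def convert(text, xsd_type):
    """Convert the text of an element to the Python type matching xsd_type."""
    return CONVERTERS[xsd_type](text)


print(convert("1032", "xs:int"), convert("true", "xs:boolean"))
```

Without such a conversion step, every parsed value would remain a string.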

Splitters

Splitter expression without prefix

Example: breakfast_menu/food

The default namespace can be accessed by giving a unique prefix in the Splitter expression, followed by a colon and the tag name. Provide the corresponding namespace URI in the Prefix/URI table. Before using this prefix, ensure that it is not already used in the XML.
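
The idea of binding a prefix of your choice to a default namespace can be sketched with Python's standard library. The sample XML, the namespace URI `http://example.com/menu`, and the prefix `m` are made up for illustration, and `select` is a hypothetical helper; the Snap's own path handling may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: every element is in a default (unprefixed) namespace
XML = """<breakfast_menu xmlns="http://example.com/menu">
  <food><name>Waffles</name></food>
  <food><name>Toast</name></food>
</breakfast_menu>"""


def select(xml_text, path, namespace_context=None):
    """Evaluate a root-anchored path such as 'a/b' against the document."""
    # ElementTree paths are relative to an element's children, so wrap the
    # root in a temporary parent to let the first step match the root itself.
    wrapper = ET.Element("wrapper")
    wrapper.append(ET.fromstring(xml_text))
    return wrapper.findall(path, namespace_context)


# The plain path finds nothing because the elements are namespaced
unprefixed = select(XML, "breakfast_menu/food")
# A chosen prefix bound to the default namespace URI finds both elements
prefixed = select(XML, "m:breakfast_menu/m:food", {"m": "http://example.com/menu"})
print(len(unprefixed), len(prefixed))
```

The prefix only needs to be consistent between the expression and the mapping; it does not have to appear in the XML itself.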

If the XML data is of the form:

The output will be:

Splitter expression with prefix

Example: d:catalog/d:book

For the Splitter expression "d:catalog/d:book", the output contains two output documents, one for each book tag in the XML file. If the Splitter expression contains prefixes, they must be defined in the Namespace Context.


Prefix	URI
d	http://www.develop.com/student

In the Settings, enter d:catalog/d:book in the Splitter field, d in the Prefix field, and http://www.develop.com/student in the URI field to get the output view containing the data with the prefix 'd'.
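
As a rough illustration of splitting with a Prefix/URI table, the sketch below produces one record per matched element. The sample catalog and its `d:title` child are hypothetical, as are the function names, and the flat dict per record is a simplification rather than the Snap's actual output format.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input using the prefix and URI from the table above
XML = """<d:catalog xmlns:d="http://www.develop.com/student">
  <d:book><d:title>XML Basics</d:title></d:book>
  <d:book><d:title>Schema Design</d:title></d:book>
</d:catalog>"""


def qualified_name(tag, uri_to_prefix):
    """Turn ElementTree's '{uri}local' tag form back into 'prefix:local'."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        prefix = uri_to_prefix.get(uri)
        return f"{prefix}:{local}" if prefix else local
    return tag


def split_documents(xml_text, splitter, prefix_uri_rows):
    """Return one dict per element matched by the splitter expression."""
    namespace_context = dict(prefix_uri_rows)
    uri_to_prefix = {uri: prefix for prefix, uri in namespace_context.items()}
    # Wrap the root so the first step of the expression can match it
    wrapper = ET.Element("wrapper")
    wrapper.append(ET.fromstring(xml_text))
    documents = []
    for record in wrapper.findall(splitter, namespace_context):
        documents.append({
            qualified_name(child.tag, uri_to_prefix): (child.text or "").strip()
            for child in record
        })
    return documents


rows = [("d", "http://www.develop.com/student")]
docs = split_documents(XML, "d:catalog/d:book", rows)
print(len(docs))  # 2, one per d:book element
```

If a prefix used in the expression is missing from the rows, `findall` raises an error, which mirrors the requirement that every prefix be defined in the Namespace Context.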

The output view is:

Troubleshooting


Error	Reason	Resolution
"Failed to convert xml to json"	"Unexpected character."	Ensure that the XML data is well-formed.
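
One way to confirm that the data is well-formed before it reaches the Snap is to parse it locally. The following is a minimal sketch using Python's standard library; `check_well_formed` is a hypothetical helper, and its messages come from Python's parser, not from the Snap.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_bytes):
    """Return None if the XML parses, otherwise a message locating the problem."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as error:
        line, column = error.position
        return f"not well-formed at line {line}, column {column}: {error}"
    return None


print(check_well_formed(b"<a><b></a>"))   # a message describing the mismatched tag
print(check_well_formed(b"<a><b/></a>"))  # None
```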