Analyze data types

This example pipeline demonstrates how to analyze the data types in an input data set using the Type Inspector Snap. It also shows how to list the full class name and the count of each data type.

The input data set contains fields of different data types and is passed to two Type Inspector Snaps. The first Type Inspector Snap uses the default configuration and reports the data types present in the data set. The second Type Inspector Snap lists the data types with their full class names and the aggregated count of each data type.

Download this pipeline.

Configure the JSON Generator Snap to pass your input data, which contains fields of various data types.

Note: In this example, we use the JSON Generator Snap. However, you can replace the JSON Generator Snap with any Snap of your choice, such as the Chunker, Constant, File Reader, or S3 File Reader Snaps.

Configure the Type Inspector Snap with the default configuration to analyze the data types in the data set.

On validation, the Snap displays a summary of the detected data types, such as numerical, categorical, textual, date/time, and boolean types. Use this information to understand the structure and format of the data before further processing and analysis.


Figure: Type Inspector Snap (without Full class name + Aggregate) configuration and output
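As a rough analogy for what type inspection does, the following Python sketch maps each field of a record to the short name of its value's type. It is not the Snap's implementation and does not reproduce its output format; the sample records, field names, and the inspect_types helper are all hypothetical.

```python
# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"id": 1, "name": "Ada", "active": True, "score": 9.5},
    {"id": 2, "name": "Lin", "active": False, "score": None},
]


def inspect_types(record):
    """Map each field name to the short type name of its value."""
    return {field: type(value).__name__ for field, value in record.items()}


# One type map per input record.
type_maps = [inspect_types(record) for record in records]

for type_map in type_maps:
    print(type_map)
```

In this sketch, a missing value shows up as its own type (NoneType), which is one way mixed or incomplete fields become visible during inspection.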

Configure the Type Inspector Snap with the full class name and aggregate options enabled to generate a detailed list of data types, including their full class names and the aggregated count of each data type.

On validation, the Snap displays the detected data types with their full class names and aggregated counts. This information can help you spot unexpected or inconsistent types before further processing and analysis.


Figure: Type Inspector Snap (with Full class name + Aggregate) configuration and output
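The idea of combining full class names with an aggregated count can be sketched in Python as follows. This is an illustration under stated assumptions, not the Snap's behavior: the sample records and the full_class_name and aggregate_types helpers are hypothetical, and Python class names (for example builtins.int) differ from the class names the Snap reports.

```python
import datetime
from collections import Counter

# Hypothetical sample records; field names and values are illustrative only.
sample_records = [
    {"id": 1, "name": "Ada", "joined": datetime.date(2024, 1, 15)},
    {"id": 2, "name": "Lin", "active": True},
]


def full_class_name(value):
    """Return the module-qualified class name of a value."""
    cls = type(value)
    return f"{cls.__module__}.{cls.__qualname__}"


def aggregate_types(records):
    """Count how many field values of each fully qualified type occur."""
    counts = Counter()
    for record in records:
        for value in record.values():
            counts[full_class_name(value)] += 1
    return dict(counts)


type_counts = aggregate_types(sample_records)
print(type_counts)
```

Aggregating across all records gives one count per type rather than one type map per record, which is usually the more convenient view for checking a large data set.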

Note: After the data is generated, you can use Snaps such as the Filter and Aggregate Snaps for advanced processing. You can also use AgentCreator to integrate machine learning models.

To successfully reuse pipelines:

Download and import the pipeline into the SnapLogic Platform.
Configure Snap accounts, as applicable.
Provide pipeline parameters, as applicable.