This example pipeline demonstrates how to use the Type Inspector Snap to analyze the
different data types in the input. It also shows how to get the full class name and the
count of each data type.
You have an input data set with different data types that is passed to two Type Inspector
Snaps. The first Type Inspector Snap uses the default configuration and analyzes the data
types that are present in the data set. The second Type Inspector Snap lists the data types
with the full class name and the aggregated count of each data type.
- Configure the JSON Generator Snap to pass your input data, which contains fields of various data types.
Note: In this example, we use the JSON Generator Snap. However, you can
replace the JSON Generator Snap with any Snap of your choice, such as the
Chunker, Constant, File Reader, or S3 File Reader Snaps.
- Configure the Type Inspector Snap with the default configuration to analyze the data types in the data set.
On validation, the Snap displays a comprehensive summary of the detected data types
such as numerical, categorical, textual, date/time, boolean types, and more. This
information aids in understanding the structure and format of the data, facilitating
further data processing and analysis tasks.
Screenshots: Type Inspector Snap (without Full class name + Aggregate) configuration and output
- Configure the Type Inspector Snap with the full class name and aggregate options enabled to generate a detailed list of data
types, including their full class names and the aggregated count of each data type.
On validation, the Snap displays the detected data types with their full class names
and aggregated counts. This information helps you check data quality and plan further
data processing and analysis.
Screenshots: Type Inspector Snap (with Full class name + Aggregate) configuration and output
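The two configurations described in the steps above can be approximated outside the platform with a short Python sketch. It is an illustrative analogy only, not the Snap's implementation or its output format. The sample records and the helper names `simple_type_name`, `full_class_name`, `inspect_types`, and `aggregate_types` are hypothetical, and Python type names stand in for whatever class names the Snap reports.

```python
from collections import Counter
from datetime import date

# Hypothetical input records with fields of various data types
# (stand-ins for what the JSON Generator Snap would pass downstream).
RECORDS = [
    {"id": 1, "name": "Ana", "score": 9.5, "active": True, "joined": date(2024, 1, 15)},
    {"id": 2, "name": "Raj", "score": 7, "active": False, "joined": None},
]


def simple_type_name(value):
    """Short type name, loosely analogous to the default configuration."""
    return type(value).__name__


def full_class_name(value):
    """Module-qualified type name, loosely analogous to the full class name option."""
    cls = type(value)
    return f"{cls.__module__}.{cls.__qualname__}"


def inspect_types(record, namer=simple_type_name):
    """Map each field of one record to the name of its value's type."""
    return {field: namer(value) for field, value in record.items()}


def aggregate_types(records, namer=full_class_name):
    """Count how often each type name occurs across all fields of all records."""
    counts = Counter()
    for record in records:
        counts.update(namer(value) for value in record.values())
    return dict(counts)


per_record = [inspect_types(r) for r in RECORDS]
type_counts = aggregate_types(RECORDS)
```

In this sketch, `per_record` holds one type name per field, while `type_counts` groups all values by qualified type name. The `score` field, which is a float in one record and an integer in the other, contributes to two separate entries in the counts.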
Note: After the data is generated, you can use Snaps such as the
Filter and Aggregate Snaps for advanced processing. You can
also use AgentCreator to integrate machine learning models.
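As a rough illustration of the kind of follow-up processing the note mentions, the sketch below aggregates and then filters type-inspection results in plain Python. It does not use the Filter or Aggregate Snaps, and the row layout (the `field` and `type` keys) is an assumption made for this example, not the Type Inspector Snap's documented output schema.

```python
from collections import defaultdict

# Hypothetical type-inspection rows: one entry per inspected field value.
ROWS = [
    {"field": "id", "type": "int"},
    {"field": "id", "type": "int"},
    {"field": "score", "type": "float"},
    {"field": "score", "type": "int"},
    {"field": "joined", "type": "date"},
    {"field": "joined", "type": "NoneType"},
]

# Aggregate step: collect the distinct types seen for each field.
types_by_field = defaultdict(set)
for row in ROWS:
    types_by_field[row["field"]].add(row["type"])

# Filter step: keep only the fields that carry more than one type.
mixed_fields = {
    field: sorted(types)
    for field, types in types_by_field.items()
    if len(types) > 1
}
```

Fields that end up in `mixed_fields` are candidates for a data quality review, because their values do not share a single type.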
To successfully reuse pipelines:
- Download and import the pipeline into the SnapLogic Platform.
- Configure Snap accounts, as applicable.
- Provide pipeline parameters, as applicable.