This pipeline demonstrates training a model to predict whether a weighing scale is balanced. The classification algorithm is selected based on the algorithm evaluation in the Cross Validator (Classification) Snap's example. The input dataset depicts the weight on each side of the scale and the side's distance from the floor.
- Configure the CSV Generator Snap to generate the input dataset.
Note: The input document contains one classification field and four numeric fields:
- Balance Class: The classification field to denote the status of the weighing scale. B for Balanced, L for Left-inclined, and R for Right-inclined.
- Left Weight
- Left Distance
- Right Weight
- Right Distance
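To make the layout of the input document concrete, the following Python sketch parses a few rows shaped like the fields listed above. The sample rows are invented for illustration and are not taken from the actual dataset, and the exact header spelling used in the pipeline is an assumption.

```python
import csv
import io

# Hypothetical sample rows; the header names mirror the fields listed above,
# but their exact spelling in the pipeline is an assumption.
SAMPLE_CSV = """Balance Class,Left Weight,Left Distance,Right Weight,Right Distance
B,1,1,1,1
L,3,2,1,1
R,1,2,4,3
"""

VALID_CLASSES = {"B", "L", "R"}


def read_rows(text):
    """Return the CSV rows as dictionaries, rejecting unknown class labels."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        if row["Balance Class"] not in VALID_CLASSES:
            raise ValueError(f"unexpected class label: {row['Balance Class']!r}")
    return rows


rows = read_rows(SAMPLE_CSV)
print(len(rows), "rows; fields:", list(rows[0].keys()))
```

Every value read this way is still a string, which is why a type conversion step follows in the pipeline.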
- Pass the input document through the Type Converter Snap to automatically detect and convert data types.
Note: Although analyzing the input document using the Profile Snap and the Type Inspector Snap is recommended to ensure data integrity (e.g., no null values or inaccurate data types), this step is skipped here for simplicity.
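As a rough illustration of what automatic type detection means, the Python sketch below tries to read each string value as an integer, then as a floating-point number, and otherwise leaves it unchanged. This is not the Type Converter Snap's actual logic, which is not described here; the function names and the sample document are hypothetical.

```python
def convert_value(text):
    """Convert a string to int or float when it parses cleanly, else keep it."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def convert_row(row):
    """Apply convert_value to every field of one document."""
    return {key: convert_value(value) for key, value in row.items()}


# Hypothetical document as it might arrive from a CSV source: all strings.
raw = {
    "Balance Class": "L",
    "Left Weight": "3",
    "Left Distance": "2",
    "Right Weight": "1",
    "Right Distance": "1.5",
}
typed = convert_row(raw)
print(typed)
```

The sketch decides value by value; a real converter may instead infer one type per field across the whole dataset.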
- Configure the Trainer (Classification) Snap to train the model for the dataset.
The classification algorithm was evaluated in the Cross Validator (Classification) Snap. The Trainer (Classification) Snap is configured with the same settings to train the model.
[Figures: Trainer (Classification) Snap Configuration | Trainer (Classification) Snap Output]
Note: The model generated by the Trainer (Classification) Snap is written into a file using the File Writer Snap, which is configured as shown below. This model can then be used to predict the Balance Class for an unlabeled dataset.
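The train, write, and reuse flow can be sketched outside the platform in plain Python. The example below is only a stand-in: it uses a nearest-centroid classifier because the algorithm chosen in the Cross Validator example is not named here, it stores the model as JSON because the Snap's file format is not specified, and the training rows, file name, and function names are all invented for illustration.

```python
import json
import math
import os
import tempfile

FEATURES = ["Left Weight", "Left Distance", "Right Weight", "Right Distance"]
LABEL = "Balance Class"


def train(rows):
    """Compute one mean feature vector (centroid) per class label."""
    sums, counts = {}, {}
    for row in rows:
        label = row[LABEL]
        total = sums.setdefault(label, [0.0] * len(FEATURES))
        for i, name in enumerate(FEATURES):
            total[i] += float(row[name])
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def predict(model, row):
    """Return the label whose centroid is closest to the row."""
    vector = [float(row[name]) for name in FEATURES]
    return min(model, key=lambda label: math.dist(model[label], vector))


# Hypothetical labeled rows, invented for this sketch.
training_rows = [
    {LABEL: "L", "Left Weight": 5, "Left Distance": 4, "Right Weight": 1, "Right Distance": 1},
    {LABEL: "L", "Left Weight": 4, "Left Distance": 5, "Right Weight": 2, "Right Distance": 1},
    {LABEL: "R", "Left Weight": 1, "Left Distance": 1, "Right Weight": 5, "Right Distance": 4},
    {LABEL: "R", "Left Weight": 2, "Left Distance": 1, "Right Weight": 4, "Right Distance": 5},
    {LABEL: "B", "Left Weight": 3, "Left Distance": 3, "Right Weight": 3, "Right Distance": 3},
]

model = train(training_rows)

# Write the model to a file, then read it back as a later pipeline would.
model_path = os.path.join(tempfile.mkdtemp(), "balance_model.json")
with open(model_path, "w", encoding="utf-8") as handle:
    json.dump(model, handle)
with open(model_path, encoding="utf-8") as handle:
    loaded = json.load(handle)

# An unlabeled row: no Balance Class field.
unlabeled = {"Left Weight": 5, "Left Distance": 5, "Right Weight": 1, "Right Distance": 2}
print(predict(loaded, unlabeled))  # prints "L"
```

Keeping training and prediction separated by a file mirrors the pipeline design: the model is produced once and can then be loaded wherever predictions are needed.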
To successfully reuse pipelines:
- Download and import the pipeline into the SnapLogic Platform.
- Configure Snap accounts, as applicable.
- Provide pipeline parameters, as applicable.