This pipeline demonstrates a typical cross validation exercise performed on a dataset before a model is trained to predict the target field. The dataset records various attributes of a building, and the building's required heating load depends on each of these attributes. Cross validation checks how well the model can predict this heating load.
- Configure the CSV Generator Snap to pass the input data.
Note: In this example, we use the CSV Generator Snap to create a dataset containing the following fields:
- Relative Compactness
- Surface Area
- Wall Area
- Roof Area
- Overall Height
- Orientation
- Glazing Area
- Glazing Area Distribution
- Heating Load
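For readers who want to inspect a dataset of this shape outside the pipeline, the following is a minimal Python sketch of parsing CSV text with these nine columns into feature rows and a target column. The column names come from the field list above; the two data rows, the `SAMPLE_CSV` constant, and the `load_rows` helper are hypothetical placeholders for illustration, not the values or logic used by the CSV Generator Snap.

```python
import csv
import io

# Column names taken from the field list above.
FIELDS = [
    "Relative Compactness",
    "Surface Area",
    "Wall Area",
    "Roof Area",
    "Overall Height",
    "Orientation",
    "Glazing Area",
    "Glazing Area Distribution",
    "Heating Load",
]

# Placeholder rows for illustration only (assumed values, NOT the pipeline's data).
SAMPLE_CSV = ",".join(FIELDS) + "\n" + "\n".join([
    "0.9,500.0,300.0,100.0,7.0,2,0.0,0,15.0",
    "0.7,650.0,350.0,150.0,3.5,3,0.1,1,11.0",
])


def load_rows(text):
    """Parse CSV text into a list of dicts, converting every field to float."""
    reader = csv.DictReader(io.StringIO(text))
    return [{name: float(row[name]) for name in FIELDS} for row in reader]


rows = load_rows(SAMPLE_CSV)

# The first eight fields are the inputs; Heating Load is the target to predict.
features = [[r[name] for name in FIELDS[:-1]] for r in rows]
targets = [r["Heating Load"] for r in rows]
```

Separating the eight input fields from Heating Load mirrors the role each plays in the cross validation step: the inputs describe the building, and Heating Load is the value the model is asked to predict.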
- Configure the Cross Validator (Regression) Snap to perform K-fold Cross Validation.
The Cross Validator (Regression) Snap splits the dataset into training and test sets, which are used to evaluate the selected ML algorithm.
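The general idea of K-fold cross validation can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the Snap's implementation: the helper names (`k_fold_indices`, `fit_line`, `cross_validate`), the single-feature least-squares model standing in for the selected ML algorithm, the RMSE metric, and the synthetic data are all assumptions made for the example.

```python
import math
import random


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and split them into k folds of near-equal size."""
    if k < 2 or k > n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Taking every k-th index gives k disjoint folds that cover all rows.
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # Constant feature: fall back to predicting the mean target.
        return 0.0, mean_y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def cross_validate(xs, ys, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, and return one RMSE per fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return scores


# Synthetic data (assumed for illustration): the target is exactly 2x + 1.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]

scores = cross_validate(xs, ys, k=5)
mean_rmse = sum(scores) / len(scores)
```

Every row is used for testing exactly once and for training in the other folds, so the averaged error is less sensitive to one lucky or unlucky split than a single train/test partition would be.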
Figures (not shown): Cross Validator (Regression) Snap Configuration; Cross Validator (Regression) Snap Output.
Note: You can optionally write the output to a file using the downstream File Writer Snap for storage or further analysis.
To successfully reuse pipelines:
- Download and import the pipeline into the SnapLogic Platform.
- Configure Snap accounts, as applicable.
- Provide pipeline parameters, as applicable.