Using AutoPrep
Use the AutoPrep Snap to prepare data for analysis, reporting, and machine learning without writing expressions, SQL scripts, or Python code.
- In Designer, create a Pipeline to handle the data you want to transform. The Snap that precedes AutoPrep must have a document output, not a binary output.
- Add the AutoPrep Snap.
- Click the AutoPrep Snap to open it.
The AutoPrep interface
The left pane has three tabs with controls for preparing data, and the right pane displays a preview of the data in table format.

The AutoPrep interface provides the following elements and controls:
- AutoPrep: Click the name to rename the AutoPrep Snap.
- Manage fields tab:
  - Flatten Structure tree: Search for fields, and select leaf nodes to flatten them to the root level.
  - Select fields control: Remove fields and change data types.
- Handle nulls tab: Displays the default rules for handling null or empty values. You can view and modify these rules.
- Review summary tab: Lists the applied changes. Undo or modify changes.
- Preview data pane: Displays a preview of the data set in a table. Use the toggle to expand the pane or to return to the original size. As you apply changes, the pane updates, but the transformations are not saved until you click Done.
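To make the effect of the Flatten Structure tree more concrete, the following sketch shows in plain Python what moving leaf nodes to the root level can look like for a nested document. It is an illustration only: AutoPrep performs this step without code, and the function name `flatten_leaves`, the underscore-joined field names, and the sample record are assumptions, not AutoPrep's actual implementation or naming scheme.

```python
from typing import Any, Dict


def flatten_leaves(doc: Dict[str, Any], sep: str = "_", prefix: str = "") -> Dict[str, Any]:
    """Return a new dict in which every nested leaf value sits at the root level.

    Field names are built by joining the path to the leaf with `sep`
    (a hypothetical naming choice made for this sketch).
    """
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested objects and lift their leaves.
            flat.update(flatten_leaves(value, sep, name))
        else:
            # Leaf value: keep it under its path-based name.
            flat[name] = value
    return flat


# Hypothetical input document with two levels of nesting.
record = {"id": 7, "customer": {"name": "Ada", "address": {"city": "Oslo"}}}
flat_record = flatten_leaves(record)
```

In this sketch, `flat_record` contains the fields `id`, `customer_name`, and `customer_address_city` at the root level, which is the shape a table preview needs.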
Buttons
The buttons provide the following functionality:
- Update saves your changes, updates the Preview data pane, and validates the Pipeline.
- Done saves your changes and exits AutoPrep. AutoPrep automatically generates the expressions necessary to accomplish the transformations at runtime.
- Cancel exits AutoPrep without saving the current changes.
Preview data pane
The Preview data pane has a control that expands the pane to full size or returns it to the default size. In the table, column headings include:
- The data type that AutoPrep calculated from the majority of values for that field.
- A menu with options. To open it, hover over the right side of the column header. The options are based on the field type and can include:
  - Changing the data type
  - Splitting the field into multiple fields, based on a delimiter
  - Choosing the format for dates, currency, phone numbers, and country codes
  - Enabling data masking
  - Renaming the field (column names are case-sensitive)
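The data type shown in each column heading is described above as being calculated from the majority of values for the field. The sketch below illustrates that idea as a simple majority vote in Python. It is not AutoPrep's algorithm: the function name `majority_type`, the use of Python type names, the choice to ignore null values, and the sample column are all assumptions made for illustration.

```python
from collections import Counter
from typing import Any, Iterable


def majority_type(values: Iterable[Any]) -> str:
    """Return the name of the most common Python type among non-null values.

    Returns "unknown" when the column has no non-null values.
    """
    counts = Counter(type(v).__name__ for v in values if v is not None)
    if not counts:
        return "unknown"
    # most_common(1) yields the single (type_name, count) pair with the highest count.
    return counts.most_common(1)[0][0]


# Hypothetical column: mostly integers, one stray string, one null.
column = [10, 12, "n/a", 15, None]
inferred = majority_type(column)
```

Here the three integers outvote the single string, so `inferred` is `"int"`. A column like this is a case where changing the data type from the column menu, or adjusting the null handling rules, may be worth reviewing.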