Masking sensitive fields in a dataset using the Mask Snap

This example demonstrates how to use the Mask Snap to hide sensitive fields in a dataset before passing it to an external party.

The pipeline uses the Mask Snap to anonymize or remove sensitive data from a demographic dataset of Oscar award winners. Though the dataset is public, this example simulates scenarios where organizations need to mask confidential data before external sharing.

Download this pipeline.

Read and parse the input CSV dataset.

The input dataset contains demographic data on Best Director Oscar award winners from 1927–1976. The File Reader Snap reads the CSV file, and the CSV Parser Snap parses it.

Configure the Mask Snap to apply masking rules.

The Snap is configured with three policies to mask the following fields:
- $date_of_birth: Replaced with the first day of the year using the Start of Year mask method.
- $bio_url: Deleted using a Recursive search and a regex match for HTTP/HTTPS URLs.
- $person: Replaced with static text "Winner name is masked".
The Mask Snap uses different search modes for different use cases:
- Exact Path for $date_of_birth and $person, where the field paths are known and the fields are not nested.
- Recursive for $bio_url, where the field's location is uncertain or the field may be nested.
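
The three policies above can be approximated outside the platform to see what each one does to a record. The following is a minimal Python sketch, not the Mask Snap's implementation: the function names, the nested "details" object, and the sample values are hypothetical, and the date is assumed to be an ISO string (YYYY-MM-DD), which may differ from the actual dataset.

```python
import re
from datetime import date

# Assumed pattern for the "regex match for HTTP/HTTPS URLs" policy.
URL_PATTERN = re.compile(r"^https?://", re.IGNORECASE)


def start_of_year(value):
    """Replace an ISO date string (YYYY-MM-DD) with January 1 of the same year."""
    parsed = date.fromisoformat(value)
    return parsed.replace(month=1, day=1).isoformat()


def delete_recursive(node, key, pattern):
    """Return a copy of node without any `key` entry, at any depth,
    whose string value matches `pattern` (analogous to a Recursive search)."""
    if isinstance(node, dict):
        return {
            k: delete_recursive(v, key, pattern)
            for k, v in node.items()
            if not (k == key and isinstance(v, str) and pattern.search(v))
        }
    if isinstance(node, list):
        return [delete_recursive(item, key, pattern) for item in node]
    return node


def mask_record(record):
    """Apply the three example policies to one record."""
    # Recursive search: bio_url is removed wherever it appears.
    masked = delete_recursive(record, "bio_url", URL_PATTERN)
    # Exact path: top-level fields are addressed directly.
    if "date_of_birth" in masked:
        masked["date_of_birth"] = start_of_year(masked["date_of_birth"])
    if "person" in masked:
        masked["person"] = "Winner name is masked"
    return masked


# Hypothetical sample record; the nested "details" object is for illustration only.
sample = {
    "person": "Example Person",
    "date_of_birth": "1895-09-30",
    "details": {"bio_url": "http://example.com/bio"},
}
masked = mask_record(sample)
print(masked)
```

The exact-path rules touch only the top-level keys they name, while the recursive rule walks the whole record, which is why it still finds bio_url inside the nested object.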

Review the masked dataset in the output preview.

The Snap applies all configured masking rules. The result is:
- $date_of_birth is replaced with the first day of the birth year (e.g., 1895-01-01).
- $bio_url is removed from the output entirely.
- $person is replaced with "Winner name is masked".

Convert the masked data back to CSV format and write to file.

The output from the Mask Snap is passed to the CSV Formatter Snap for formatting and then written using the File Writer Snap.
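
For readers who want to reason about the parse, mask, and format round trip outside the platform, here is a rough Python sketch using the standard csv module. It is an illustration under stated assumptions, not how the CSV Parser or CSV Formatter Snaps work internally: in-memory strings stand in for the File Reader and File Writer Snaps, and mask_csv, demo_mask, and the sample row are hypothetical.

```python
import csv
import io


def mask_csv(csv_text, mask_row):
    """Parse CSV text, apply mask_row to each row, and format the result as CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [mask_row(dict(row)) for row in reader]
    if not rows:
        return ""
    # Take the header from the first masked row so deleted fields drop out.
    fieldnames = list(rows[0].keys())
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def demo_mask(row):
    """Hypothetical stand-in for the masking step: drop bio_url, mask person."""
    row.pop("bio_url", None)
    row["person"] = "Winner name is masked"
    return row


# Hypothetical one-row input; not taken from the actual dataset.
source = "person,bio_url\nExample Person,http://example.com/bio\n"
result = mask_csv(source, demo_mask)
print(result)
```

Deriving the header from the masked rows means a deleted field such as bio_url disappears from the output columns as well as from the data.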

To successfully reuse pipelines:

Download and import the pipeline into the SnapLogic Platform.
Configure Snap accounts, as applicable.
Provide pipeline parameters, as applicable.