This example pipeline demonstrates how to use the HDFS ZipFile Writer Snap to zip a file and write it to HDFS, and then use the HDFS ZipFile Reader Snap to unzip the newly created ZIP file and check its contents.
Download this pipeline.
- Use a Hadoop Directory Browser Snap to check the contents of the target directory before writing.
  - Directory: Enter the HDFS directory path where you want to write the ZIP file.
  - Filter: Use * to list all files in the directory.
  This Snap outputs the initial list of files in the directory.
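As a rough illustration of how a `*` filter selects entries from a listing, here is a minimal Python sketch. It assumes the Filter field follows common glob conventions, which is an assumption rather than something the Snap's behavior is confirmed to match. The `filter_listing` helper and the sample file names are hypothetical.

```python
from fnmatch import fnmatchcase


def filter_listing(names, pattern="*"):
    """Return the names that match a glob-style pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical directory listing, for illustration only.
listing = ["report.csv", "archive.zip", "notes.txt"]

all_files = filter_listing(listing, "*")      # "*" keeps every entry
zip_files = filter_listing(listing, "*.zip")  # a narrower pattern keeps only ZIP files
```

A narrower pattern such as `*.zip` would restrict the listing to ZIP files, if the Snap supports that form of glob.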
- Generate a file for upload using a JSON Generator or File Reader Snap.
  Create or read the file content that you want to zip and write to HDFS.
- Configure the HDFS ZipFile Writer Snap to zip and write the file to HDFS.
  - Directory: Enter the HDFS directory path where the ZIP file should be written.
  - Filename: Specify the name for the ZIP file (for example, test.zip).
  - Compression Level: Select the desired compression level.
  The Snap zips the input file and writes it to the specified HDFS directory.
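To show what zipping content at a chosen compression level amounts to, the following sketch builds a ZIP archive in memory with Python's standard `zipfile` module. It is a local analogy only, not the Snap's implementation. The `zip_bytes` helper, the entry name `data.json`, and the sample payload are hypothetical.

```python
import io
import zipfile


def zip_bytes(entry_name, data, level=6):
    """Zip one entry into an in-memory archive and return the archive bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(
        buffer, "w", compression=zipfile.ZIP_DEFLATED, compresslevel=level
    ) as archive:
        archive.writestr(entry_name, data)
    return buffer.getvalue()


# Hypothetical, highly repetitive payload so that compression has a visible effect.
payload = b'{"id": 1, "name": "example"}\n' * 100
archive_bytes = zip_bytes("data.json", payload, level=9)
```

With DEFLATE, higher levels (up to 9) generally trade more CPU time for a smaller archive. How the Snap's Compression Level options map to these values is not stated here.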
- Use a second Hadoop Directory Browser Snap to verify the ZIP file was created.
  Configure it with the same directory path to confirm the new ZIP file appears in the listing.
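The before-and-after check can be thought of as a set difference between the two listings. The sketch below assumes each listing has been reduced to a plain list of file names. The `new_entries` helper and the sample names are hypothetical.

```python
def new_entries(before, after):
    """Return names present in the second listing but not the first, sorted."""
    return sorted(set(after) - set(before))


# Hypothetical listings from the first and second directory browse.
before = ["report.csv"]
after = ["report.csv", "test.zip"]

created = new_entries(before, after)
```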
- Configure the HDFS ZipFile Reader Snap to read and unzip the file.
  - Directory: Enter the HDFS directory path containing the ZIP file.
  - Filename: Specify the ZIP file to read (for example, test.zip).
  - Filter: Optionally specify a filter to extract only certain files from the ZIP archive.
  The Snap reads the ZIP file, extracts its contents, and outputs the unzipped file data.
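Reading an archive and extracting only the entries that match a filter can be sketched as follows, again with Python's standard `zipfile` module as a local analogy rather than the Snap's actual mechanism. It assumes a glob-style filter. The `extract_matching` helper and the entry names are hypothetical.

```python
import io
import zipfile
from fnmatch import fnmatchcase


def extract_matching(archive_bytes, pattern="*"):
    """Return {entry name: content} for archive entries matching the pattern."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return {
            name: archive.read(name)
            for name in archive.namelist()
            if fnmatchcase(name, pattern)
        }


# Build a small two-entry archive in memory to read back (illustrative data).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("data.json", b'{"id": 1}')
    archive.writestr("readme.txt", b"hello")

extracted = extract_matching(buffer.getvalue(), "*.json")
```

Calling `extract_matching` with the default pattern `*` would return every entry, which corresponds to leaving the filter unset in this sketch.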
On successful execution:
- The HDFS ZipFile Writer creates a compressed ZIP file in the target HDFS directory.
- The second Hadoop Directory Browser Snap confirms that the ZIP file exists.
- The HDFS ZipFile Reader successfully extracts and outputs the file contents from the ZIP archive.
To successfully reuse pipelines:
- Download and import the pipeline into SnapLogic.
- Configure Snap accounts as applicable.
- Provide pipeline parameters as applicable.