Data Catalog Partition Overwrite Pipeline

The following Pipeline demonstrates how to overwrite an existing partition by using the three Data Catalog Service Snaps: Catalog Query, Catalog Delete, and Catalog Insert. Because you cannot edit Table Assets in Manager, you can use this Pipeline to overwrite partitions in the Data Catalog.



Download this Pipeline
  1. The Catalog Query Snap queries for a partition, age, in the Asset named table1.


  2. The Catalog Delete Snap deletes the partition, PK_1, indicated by the Key Name and Key Value.


  3. To replace this metadata, the JSON Generator Snap generates a document containing the data for name, age, and height.


  4. The JSON Generator Snap passes the document data to the Parquet Writer, which writes the document to an S3 bucket.


  5. The Catalog Insert Snap populates table1 with the new data.


  6. You can navigate to Manager > Projects > Assets to view the metadata inserted in table1.


  7. In Manager, click table1, and click Show Table Schema.
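
The query, delete, and insert sequence above can be sketched in plain Python as a rough illustration of the overwrite logic. This is not SnapLogic code and does not call any SnapLogic API: the in-memory `catalog` dictionary, the function names, the S3 paths, and the sample field values are all hypothetical. The sketch also assumes, purely for illustration, that `age` acts as the Key Name and `PK_1` as the Key Value.

```python
# Hypothetical in-memory stand-in for a Data Catalog Table Asset.
# Each partition is keyed by a (key name, key value) pair.
catalog = {
    "table1": {
        ("age", "PK_1"): {"location": "s3://example-bucket/table1/old.parquet"},
    }
}


def query_partition(table, key_name, key_value):
    """Return the partition metadata, or None if the partition does not exist."""
    return catalog.get(table, {}).get((key_name, key_value))


def delete_partition(table, key_name, key_value):
    """Remove the partition; return True if something was deleted."""
    return catalog.get(table, {}).pop((key_name, key_value), None) is not None


def insert_partition(table, key_name, key_value, metadata):
    """Register new metadata for the partition."""
    catalog.setdefault(table, {})[(key_name, key_value)] = metadata


def overwrite_partition(table, key_name, key_value, metadata):
    """Query, delete if present, then insert: the same order as the Pipeline."""
    if query_partition(table, key_name, key_value) is not None:
        delete_partition(table, key_name, key_value)
    insert_partition(table, key_name, key_value, metadata)
    return query_partition(table, key_name, key_value)


# Placeholder document with the three fields named in step 3.
new_document = {"name": "example", "age": 30, "height": 170}

result = overwrite_partition(
    "table1",
    "age",
    "PK_1",
    {
        "location": "s3://example-bucket/table1/new.parquet",  # hypothetical path
        "schema": sorted(new_document),  # field names only
    },
)
print(result)
```

The delete step runs only when the query finds the partition, so the same function also works for a partition that does not exist yet.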


To reuse this Pipeline successfully:
  1. Download and import the Pipeline into SnapLogic.
  2. Configure Snap accounts as applicable.
  3. Provide Pipeline parameters as applicable.