Sort input data and perform aggregate functions

This example pipeline demonstrates how to presort records with the Sort Snap and then perform aggregate functions on each group with the Aggregate Snap.



Download this pipeline.
  1. Configure the Sequence Snap and Mapper Snap to produce a large set of unsorted input documents for the Aggregate Snap to process.
  2. Configure the Sort Snap to sort the group values in ascending order.


  3. Configure the Aggregate Snap as shown below.
    Description of the Aggregate Snap configuration:


    • The input documents for the Aggregate Snap contain two fields: value and group.
    • The Snap is configured to perform the AVG, COUNT, MIN, and MAX functions on the values presorted in ascending order.
    • The GROUP-BY fields property contains the $group field, which means the Snap groups input documents by their $group value.
    • The Sorted stream field is set to Ascending because the input documents are presorted in ascending order.
    Note: If the input documents are not in ascending order, the Snap displays an exception.
  4. On executing the pipeline, the Sort Snap completes the sorting and the Aggregate Snap performs the aggregate functions. You can view the execution details in the Pipeline Execution Statistics.


    Important:

    If the input documents are unsorted and GROUP-BY fields are used, you must use the Sort Snap upstream of the Aggregate Snap to presort the input document stream, and set the Sorted stream field to Ascending or Descending, to prevent an out-of-memory error. For more information, refer to the attached example pipeline.
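
    To illustrate why presorting helps with memory, the following Python sketch mimics the idea outside the SnapLogic Platform. It is not the Aggregate Snap's implementation. The function name aggregate_sorted and the sample documents are hypothetical; only the value and group fields and the AVG, COUNT, MIN, and MAX functions come from the example above. Because documents with the same group arrive together, the sketch holds running totals for one group at a time, and it raises an error when a group arrives out of ascending order, loosely mirroring the exception described in the note.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(docs):
    """Stream over documents presorted by 'group' in ascending order.

    Hypothetical helper: keeps running totals for one group at a time
    and raises ValueError if a group key arrives out of order.
    """
    previous_key = None
    # groupby only merges *adjacent* documents with the same key,
    # which is why the input must already be sorted by 'group'.
    for key, members in groupby(docs, key=itemgetter("group")):
        if previous_key is not None and key < previous_key:
            raise ValueError(
                f"input not sorted ascending: {key!r} after {previous_key!r}"
            )
        previous_key = key

        count, total, lowest, highest = 0, 0, None, None
        for doc in members:
            v = doc["value"]
            count += 1
            total += v
            lowest = v if lowest is None else min(lowest, v)
            highest = v if highest is None else max(highest, v)

        yield {
            "group": key,
            "avg": total / count,
            "count": count,
            "min": lowest,
            "max": highest,
        }


# Made-up unsorted input with the two fields used in the example pipeline.
docs = [
    {"value": 4, "group": "b"},
    {"value": 1, "group": "a"},
    {"value": 6, "group": "b"},
    {"value": 3, "group": "a"},
]

# Presort by group (the role of the Sort Snap), then aggregate.
presorted = sorted(docs, key=itemgetter("group"))
results = list(aggregate_sorted(presorted))
for row in results:
    print(row)
```

    Passing the unsorted docs list directly to aggregate_sorted raises ValueError in this sketch. An aggregator that accepts unsorted input would instead need to keep state for every group until the end of the stream, which is where memory pressure comes from on large inputs.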

To successfully reuse pipelines:
  1. Download and import the pipeline into the SnapLogic Platform.
  2. Configure Snap accounts, as applicable.
  3. Provide pipeline parameters, as applicable.