Group By Fields

Overview

This Snap groups input documents by the values of the specified fields and writes each group as a batch. Each batch is one output document that holds a list of input Map data as the value at the location specified by the Target field property. Input documents with the same group-by field values are grouped into the same output document.

Note: The Snap expects input documents with the same group-by field values to be contiguous. Whenever the group-by field values change, the Snap produces a new output document. Therefore, to group all input documents with the same group-by field values into one output document, place a Sort Snap before the Group By Fields Snap so that the input document stream is sorted by the group-by field values.
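
The contiguity behavior described in the note can be illustrated outside SnapLogic. The following is a minimal Python sketch, not the Snap's implementation: the function name `group_contiguous`, the sample field `dept`, and the use of `itertools.groupby` are assumptions chosen for illustration. It shows that unsorted input yields one output document per run of equal values, while sorted input yields one output document per distinct value.

```python
from itertools import groupby
from operator import itemgetter


def group_contiguous(docs, fields, target="group"):
    """Emit one output document per run of consecutive documents
    that share the same values for the group-by fields."""
    key = itemgetter(*fields)  # single value for one field, tuple for several
    return [{target: list(run)} for _, run in groupby(docs, key=key)]


# Hypothetical input stream: the two "a" documents are not contiguous.
docs = [
    {"dept": "a", "n": 1},
    {"dept": "b", "n": 2},
    {"dept": "a", "n": 3},
]

unsorted_out = group_contiguous(docs, ["dept"])
sorted_out = group_contiguous(sorted(docs, key=itemgetter("dept")), ["dept"])

print(len(unsorted_out))  # 3 output documents: a, b, a
print(len(sorted_out))    # 2 output documents: a, b
```

Sorting first plays the role of the Sort Snap: it makes equal group-by values contiguous before grouping.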


Prerequisites

All input documents should be of Map data type and contain values for the fields specified by the Fields property.

Snap views

View Description Examples of upstream and downstream Snaps
Input This Snap has exactly one document input view.

A document containing Map data.

Output This Snap has one document output view by default. The Snap can be configured with a second output view to get statistics of the input data.

A document with a list of input Map data as a value at the location specified by the Target field.

Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:

  • Stop Pipeline Execution: Stops the current pipeline execution when an error occurs.
  • Discard Error Data and Continue: Ignores the error, discards that record, and continues with the remaining records.
  • Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.

Snap settings

Legend:
  • Expression icon: Allows using pipeline parameters to set field values dynamically (if enabled). SnapLogic Expressions are not supported. If disabled, you can provide a static value.
  • SnapGPT: Generates SnapLogic Expressions based on natural language using SnapGPT. Learn more.
  • Suggestion icon: Populates a list of values dynamically based on your Snap configuration. You can select only one attribute at a time using the icon. Type into the field if it supports a comma-separated list of values.
  • Upload: Uploads files. Learn more.
Learn more about the icons in the Snap settings dialog.
Field / Field set Type Description
Label String

Required. Specify a unique name for the Snap. Modify this to be more descriptive, especially if the pipeline contains more than one instance of the same Snap.

Default value: Group By Fields

Example: Group By Fields
Fields

Required. The fields to group by.

Memory Sensitivity Dropdown list

Indicates the Snap's behavior towards memory changes. Choose one of the available options:

  • None: If selected, the Snap groups input documents by the field values into batches of output documents.
  • Dynamic: If selected, groups may be split into multiple parts, depending on memory availability. The group size that each group is scaled against is determined statistically from the groups already processed (mean group size plus one standard deviation).

Default value: None

Example: Dynamic
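
The statistic named for the Dynamic option (mean group size plus one standard deviation) can be computed as in the following Python sketch. It is an illustration only: the function name `estimated_group_size` is hypothetical, and the choice of the population standard deviation (`statistics.pstdev`) is an assumption, since this page does not say which variant the Snap uses.

```python
from statistics import mean, pstdev


def estimated_group_size(processed_sizes):
    """Return mean group size plus one standard deviation,
    computed over the sizes of the groups seen so far."""
    if not processed_sizes:
        return 0.0  # nothing processed yet, so no estimate is available
    # Assumption: population standard deviation; the Snap may differ.
    return mean(processed_sizes) + pstdev(processed_sizes)


# Hypothetical sizes of groups already processed.
sizes = [2, 4, 4, 4, 5, 5, 7, 9]
estimate = estimated_group_size(sizes)
print(estimate)  # mean 5 + standard deviation 2 = 7.0
```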

Min.Part Size Integer/Expression

Activated when Memory Sensitivity is set to Dynamic.

Enter the minimum size of the parts into which the Snap splits larger groups.
Note: This limit does not apply to the last part of a multi-part group, or to a group that forms a single part smaller than the size specified here.

Default value: 10

Example: 100
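
The effect of the minimum part size can be pictured with the following Python sketch. It rests on stated assumptions: the function `split_into_parts` and its `requested_part_size` parameter are hypothetical stand-ins for whatever memory-driven size the Snap chooses internally, and the minimum is treated as a simple floor on that size. It shows the two exceptions from the note: the last part of a multi-part group, and a single small group, may both be smaller than the minimum.

```python
def split_into_parts(group, requested_part_size, min_part_size=10):
    """Split one group into consecutive parts (illustrative only)."""
    # Assumption: the minimum part size acts as a floor on the requested size.
    size = max(requested_part_size, min_part_size)
    if size <= 0:
        raise ValueError("part size must be positive")
    return [group[i:i + size] for i in range(0, len(group), size)]


# A 25-document group with a hypothetical requested size of 4 and minimum 10.
large = split_into_parts(list(range(25)), requested_part_size=4)
# A 3-document group stays in a single part smaller than the minimum.
small = split_into_parts(list(range(3)), requested_part_size=4)

print([len(p) for p in large])  # [10, 10, 5]
print([len(p) for p in small])  # [3]
```
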
Target field String/Suggestion

Required. Specify the target field name to use as a key in the output document, or a JSON path at which the list of input Map data is placed.

Default value: group

Example: batch
Minimum memory (MB) Integer/Expression

If the available memory falls below this property value while processing input documents, the Snap stops fetching the next input document until more memory is available. This feature is disabled if this property value is 0.

Default value: 750

Example: 500
Out-of-memory timeout (minutes) Integer/Expression

If the Snap pauses for longer than this property value while waiting for more memory to become available, it throws an exception to prevent the system from running out of memory.

Default value: 20

Example: 30
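
The interaction between Minimum memory (MB) and Out-of-memory timeout (minutes) can be sketched as a wait loop. The Python below is a minimal illustration under stated assumptions, not the Snap's code: the function `wait_for_memory`, the `available_mb` callback, the polling interval, and the use of `TimeoutError` are all hypothetical. The clock and sleep functions are injectable so the loop can be exercised without real waiting.

```python
import time


def wait_for_memory(available_mb, minimum_mb=750, timeout_minutes=20,
                    poll_seconds=1.0, clock=time.monotonic, sleep=time.sleep):
    """Pause until available memory reaches the minimum, or raise on timeout."""
    if minimum_mb == 0:
        return  # a value of 0 disables the check
    deadline = clock() + timeout_minutes * 60
    while available_mb() < minimum_mb:
        if clock() >= deadline:
            # Assumption: the exception type is chosen for illustration.
            raise TimeoutError("timed out waiting for more memory")
        sleep(poll_seconds)


# Simulated readings: memory recovers on the third check.
readings = iter([100, 200, 800])
wait_for_memory(lambda: next(readings), sleep=lambda seconds: None)

# Simulated clock: memory never recovers, so the wait times out.
now = [0.0]
try:
    wait_for_memory(lambda: 100, timeout_minutes=1, poll_seconds=30,
                    clock=lambda: now[0],
                    sleep=lambda s: now.__setitem__(0, now[0] + s))
    timed_out = False
except TimeoutError:
    timed_out = True
print(timed_out)  # True
```
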
Snap execution Dropdown list
Choose one of the three modes in which the Snap executes. Available options are:
  • Validate & Execute: Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during pipeline runtime.
  • Execute only: Performs full execution of the Snap during pipeline execution without generating preview data.
  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Default value: Execute only

Example: Validate & Execute

Examples