Common Words

Overview

Use this Snap to identify the most common words in the input dataset and compute how often each one occurs. You can also specify the maximum number of most common words to include in the output. The input of this Snap must be an array of tokens, which the Tokenizer Snap can generate.

Important: The output of this Snap must be connected to the second input view of the Bag of Words Snap.
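
As a rough illustration of what this Snap computes, the following Python sketch counts tokens across a set of input documents and keeps the most frequent ones. It is not the Snap's implementation. The function name common_words, the text field name, and the assumption that frequency means the total number of occurrences across all input documents are illustrative choices only.

```python
from collections import Counter


def common_words(documents, token_field="text", top_words_limit=100):
    """Count tokens across all documents and keep the most frequent ones.

    Assumption: each document is a dict whose token_field holds a list of
    string tokens, similar to what a tokenizer step would emit.
    """
    counts = Counter()
    for doc in documents:
        counts.update(doc[token_field])
    # most_common returns at most top_words_limit (word, frequency) pairs,
    # sorted from most to least frequent.
    return counts.most_common(top_words_limit)


docs = [
    {"text": ["the", "cat", "sat"]},
    {"text": ["the", "dog", "sat"]},
    {"text": ["the", "bird"]},
]

top = common_words(docs, top_words_limit=2)
print(top)  # [('the', 3), ('sat', 2)]
```

Because the limit is a maximum, a dataset with fewer distinct words than the limit simply yields every word it contains.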


Prerequisites

None.

Limitations and known issues

None.

Snap views

View Description Examples of upstream and downstream Snaps
Input This Snap has at most one document input view. It requires a document containing the array of tokens.
Output This Snap has at most one document output view. It produces a document containing the most common words in the input dataset, along with their frequencies.
Error

Error handling lets you deal with errors without losing data or failing the Snap execution. To handle errors that the Snap might encounter when the pipeline runs, choose one of the following options from the When errors occur list under the Views tab:

  • Stop Pipeline Execution: Stops the current pipeline execution when the Snap encounters an error.
  • Discard Error Data and Continue: Ignores the error, discards that record, and continues with the remaining records.
  • Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.
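
The three options can be pictured with a small Python sketch. It only models the behavior described above. The function run_with_error_policy, the policy names, and the shape of the error records are hypothetical and are not part of the SnapLogic platform.

```python
def run_with_error_policy(records, process, policy="stop"):
    """Apply process to each record under one of three error policies.

    Hypothetical policy names:
      "stop"    - re-raise the first error, ending the run
      "discard" - drop the failing record and continue
      "route"   - collect the failing record in a separate error list
    """
    if policy not in ("stop", "discard", "route"):
        raise ValueError(f"unknown policy: {policy}")

    outputs, errors = [], []
    for record in records:
        try:
            outputs.append(process(record))
        except Exception as exc:
            if policy == "stop":
                raise
            if policy == "route":
                errors.append({"record": record, "error": str(exc)})
            # With "discard", the failing record is dropped.
    return outputs, errors


def count_tokens(record):
    # Fails with TypeError when the token field is not a sequence.
    return len(record["text"])


records = [{"text": ["a", "b"]}, {"text": None}, {"text": ["c"]}]

routed_out, routed_err = run_with_error_policy(records, count_tokens, "route")
print(routed_out)       # [2, 1]
print(len(routed_err))  # 1
```

With the "discard" policy the same input yields the same outputs and an empty error list, and with "stop" the TypeError raised by the second record ends the run.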

Snap settings

Note:
  • Suggestion icon: Indicates a list that is dynamically populated based on the configuration.
  • Expression icon: Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.
  • Add icon (Plus Icon): Indicates that you can add fields in the field set.
  • Remove icon (Minus Icon): Indicates that you can remove fields from the field set.
Field / Field set Type Description
Label String

Required. Specify a unique name for the Snap. Change the default to something more descriptive, especially if the pipeline contains more than one Snap of the same type.

Default value: Common Words

Example: Create customer support chatbots
Token array String/Suggestion

Required. Specify the field that contains the array of tokens. Alternatively, click the Suggestion icon to view a list of values and select one.

Default value: N/A

Example: $text

Top words limit Integer/Expression

Required. Specify the maximum number of most common words to be included in the output.

Default value: 100

Example: 200

Snap execution Dropdown list
Select one of the three modes in which the Snap executes. Available options are:
  • Validate & Execute. Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during pipeline runtime.
  • Execute only. Performs full execution of the Snap during pipeline execution without generating preview data.
  • Disabled. Disables the Snap and all Snaps that are downstream from it.

Default value: Validate & Execute

Example: Execute only

Examples