Bag of Words
Overview
You can use this Snap to vectorize sentences into a set of numeric fields. It takes two inputs:
- An array of tokenized words extracted from a set of input sentences. If your sentences are not already tokenized, use the Tokenizer Snap.
- A document containing the most common words in the training set. You can generate this document using the Common Words Snap.
The Snap then processes the received data and outputs a frequency count of the most common words in each sentence. For example, suppose the first input consists of token arrays representing the sentences in a dataset, and the second input contains the frequency of the top 100 common words. The Snap then lists the number of times each of those common words appears in each input sentence.
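The counting described above can be sketched in Python. This is a minimal illustration of the bag-of-words idea, not the Snap's actual implementation: the `bag_of_words` function, the sample sentences, and the three-word vocabulary are all hypothetical, and the real Snap's output field names may differ.

```python
from collections import Counter


def bag_of_words(tokens, common_words):
    """Count how often each common word occurs in one tokenized sentence."""
    counts = Counter(tokens)
    # Iterate over the vocabulary (not the tokens) so that every sentence
    # yields the same fields in the same order; absent words count as 0.
    return {word: counts[word] for word in common_words}


# Hypothetical first input: one token array per sentence.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "chased", "the", "cat"],
]

# Hypothetical second input: the most common words in the training set.
common_words = ["the", "cat", "dog"]

vectors = [bag_of_words(tokens, common_words) for tokens in sentences]
for vector in vectors:
    print(vector)
```

Words that are not in the common-words list (such as "sat" or "mat" here) are ignored, which keeps the number of numeric fields fixed regardless of sentence length.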
- Transform-type Snap
- Works in Ultra Tasks
Prerequisites
None.
Limitations and known issues
None.
Snap views
View | Description | Examples of upstream and downstream Snaps |
---|---|---|
Input #1 | This Snap has at most two document input views. The first requires an array of tokenized words drawn from a set of input sentences. If your sentences are not already tokenized, use the Tokenizer Snap. | |
Input #2 | A document containing the most common words in the training set. You can generate this document with the Common Words Snap. | |
Output | This Snap has at most one document output view. It produces a document containing the frequency of the most common words in the text field. | |
Error | Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the options in the When errors occur list under the Views tab. Learn more about Error handling in Pipelines. | |
Snap settings
- Suggestion icon: Indicates a list that is dynamically populated based on the configuration.
- Expression icon: Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.
- Add icon: Indicates that you can add fields in the field set.
- Remove icon: Indicates that you can remove fields from the field set.
Field / Field set | Type | Description |
---|---|---|
Label | String | Required. Specify a unique name for the Snap. Modify this to be more descriptive, especially if the pipeline contains more than one of the same Snap. Default value: Bag of words. Example: Customer review |
Token array | String/Suggestion | Required. Specify the array of tokens. Alternatively, click the Suggestion icon to view a list of values and select one. Default value: N/A. Example: $text |
Snap execution | Dropdown list | Select one of the three modes in which the Snap executes. Default value: Validate & Execute. Example: Execute only |
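
To illustrate the Token array setting, the following Python sketch shows how a value such as `$text` points at the field of an incoming document that holds the token array. It is only an approximation for illustration: the `resolve_field` helper and the sample document are hypothetical, the helper handles top-level fields only, and it is not SnapLogic's expression engine.

```python
def resolve_field(document, path):
    """Resolve a simple top-level path such as '$text' against a document."""
    if not path.startswith("$"):
        raise ValueError("expected a path starting with '$'")
    key = path[1:]  # '$text' refers to the 'text' field of the document
    value = document[key]
    if not isinstance(value, list):
        raise TypeError(f"field {key!r} must hold an array of tokens")
    return value


# Hypothetical input document, as a Tokenizer-style step might produce it.
document = {"id": 1, "text": ["great", "product", "great", "price"]}

tokens = resolve_field(document, "$text")
print(tokens)
```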