OpenSearch Query

Overview

You can use this Snap to execute a query on the specified OpenSearch index.



Prerequisites

  • A valid account with the required permissions.

Limitations and Known Issues

None.

Snap Views

View Description Examples of Upstream and Downstream Snaps
Input This Snap has at most one document input view. The input requires the OpenSearch index and vector name. Mapper
Output This Snap has at most one document output view. The Snap retrieves the top matching vectors closest to a specific vector in the OpenSearch index and outputs the corresponding mapping for those matching vectors (if required). Mapper
Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:

  • Stop Pipeline Execution Stops the current pipeline execution when the Snap encounters an error.
  • Discard Error Data and Continue Ignores the error, discards that record, and continues with the remaining records.
  • Route Error Data to Error View Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.

Snap Settings

Note:
  • Suggestion icon: Indicates a list that is dynamically populated based on the configuration.
  • Expression icon: Indicates whether the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.
  • Add icon: Indicates that you can add fields in the field set.
  • Remove icon: Indicates that you can remove fields from the field set.
Field / Field set Type Description
Label String

Required. Specify a unique name for the Snap. Change the default to something more descriptive, especially if the pipeline contains more than one instance of this Snap.

Default value: OpenSearch Query

Example: TextSimilarityDocument
Index name String/Expression/Suggestion

Required. Specify the name of the OpenSearch index from which you want to query records.

Default value: N/A

Example: document_embeddings
Vector field name String/Expression/Suggestion Specify the name of the vector field.
Note: Index name must be defined to populate the suggestions in this field.

Default value: N/A

Example: embedding_vector
Top K String/Expression

Required. Specify the maximum number of matches to retrieve per query. The Snap returns at most this many of the vectors nearest to the query vector (the k in k-NN).

Default value: 4

Example: 5
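
To illustrate what Top K controls, here is a minimal Python sketch of an exact top-k nearest-neighbor lookup over an in-memory dictionary. It is not the Snap's implementation: the function name top_k_nearest, the sample index, and the choice of L2 distance are all assumptions made for illustration.

```python
import heapq
import math


def top_k_nearest(query, vectors, k=4):
    """Return up to k (doc_id, distance) pairs closest to `query` by L2 distance.

    `vectors` is a hypothetical in-memory stand-in for an index:
    a dict mapping document IDs to equal-length lists of floats.
    """
    def l2(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    scored = ((doc_id, l2(query, vec)) for doc_id, vec in vectors.items())
    # nsmallest keeps only the k closest candidates, sorted nearest first.
    # If the index holds fewer than k vectors, all of them are returned.
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Hypothetical sample data, not taken from any real index.
index = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
matches = top_k_nearest([0.9, 1.1], index, k=2)
print(matches)
```

With k set to 2, only the two closest documents are returned, even though the sample index holds three.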
Include vector Checkbox/Expression
Select this checkbox to include the vectors in the response.
Note: You can use the expression enabler to fetch values from pipeline parameters that evaluate to either true or false.

Default status: Deselected

Include ID Checkbox/Expression
Select this checkbox to include the ID in the response.
Note: You can use the expression enabler to fetch values from pipeline parameters that evaluate to either true or false.

Default status: Selected

Include score Checkbox/Expression
Select this checkbox to include the score in the response.
Note: You can use the expression enabler to fetch values from pipeline parameters that evaluate to either true or false.

Default status: Selected

Search method and space type Choose your k-NN Search method and space type configuration.
Search method Dropdown list Required. Choose the method for obtaining the k-nearest neighbors from an index of vectors. Available options include:
  • Approximate k-NN: Uses approximate nearest neighbor algorithms to find the k-nearest neighbors to a query vector, trading exact accuracy for lower latency and better scalability. Learn more about Approximate k-NN search.
  • Script score k-NN: Extends OpenSearch's script scoring to run an exact k-NN search on specific fields. Suitable for smaller datasets or when pre-filtering is required. Learn more about Exact k-NN with scoring script.
  • Painless extension k-NN: Exposes distance functions as Painless extensions that you can combine in more complex exact k-NN searches, allowing custom scoring and pre-filtering. Learn more about Painless scripting functions.

Default value: Approximate k-NN

Example: Painless extension k-NN
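
For orientation, the three search methods correspond to three different request shapes in the OpenSearch query DSL. The Python sketch below builds request bodies in the form described by the OpenSearch k-NN plugin documentation. The exact request that the Snap sends is not documented here, so treat these helper functions and their names as illustrative assumptions rather than the Snap's behavior.

```python
def approximate_knn_query(field, vector, k):
    """Approximate k-NN: the `knn` query clause on a knn_vector field."""
    return {"size": k, "query": {"knn": {field: {"vector": vector, "k": k}}}}


def script_score_knn_query(field, vector, k, space_type="l1"):
    """Exact k-NN: `script_score` with the k-NN plugin's `knn_score` script."""
    return {
        "size": k,
        "query": {
            "script_score": {
                # match_all scores every document; swap in a filter to pre-filter.
                "query": {"match_all": {}},
                "script": {
                    "source": "knn_score",
                    "lang": "knn",
                    "params": {
                        "field": field,
                        "query_value": vector,
                        "space_type": space_type,
                    },
                },
            }
        },
    }


def painless_knn_query(field, vector, k):
    """Exact k-NN: a Painless script that calls a k-NN distance function."""
    return {
        "size": k,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # Adding 1.0 keeps the score non-negative.
                    "source": "1.0 + cosineSimilarity(params.query_value, doc[params.field])",
                    "params": {"field": field, "query_value": vector},
                },
            }
        },
    }


# Field name taken from the Vector field name example; the vector is made up.
body = approximate_knn_query("embedding_vector", [0.1, 0.2, 0.3], 4)
print(body)
```

The two exact methods score every document matched by the inner query, which is why they suit smaller datasets or pre-filtered subsets.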

Painless extension k-NN function type Dropdown list

Appears when you select Painless extension k-NN for the Search method.

Required. Choose the Painless function type based on your data characteristics and search requirements. Available options include:
  • l1Norm: Calculates the L1 (Manhattan) distance between a query vector and document vectors, inverted so that closer vectors rank higher.
  • l2Squared: Calculates the square of the L2 (Euclidean) distance between a query vector and document vectors, inverted so that closer vectors rank higher.
  • cosineSimilarity: Measures the cosine similarity between a query vector and document vectors. The raw value lies in [-1, 1] and is shifted so that scores are non-negative.

Learn more about Function types.

Default value: l2Squared

Example: cosineSimilarity
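
The following Python sketch shows how the three function types could turn a distance or similarity into a relevance score. The transforms used here, 1 / (1 + distance) for the two distance functions and 1 + similarity for cosine, follow the patterns shown in the OpenSearch Painless examples. Whether the Snap applies exactly these transforms is an assumption, and the Python function names are hypothetical.

```python
import math


def l1_norm(q, d):
    """Sum of absolute component differences (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(q, d))


def l2_squared(q, d):
    """Squared Euclidean distance (no square root taken)."""
    return sum((x - y) ** 2 for x, y in zip(q, d))


def cosine_similarity(q, d):
    """Cosine of the angle between q and d; lies in [-1, 1]."""
    dot = sum(x * y for x, y in zip(q, d))
    norm_q = math.sqrt(sum(x * x for x in q))
    norm_d = math.sqrt(sum(y * y for y in d))
    return dot / (norm_q * norm_d)


def score(function_type, q, d):
    """Map a function type to a score where higher means more relevant."""
    if function_type == "l1Norm":
        return 1.0 / (1.0 + l1_norm(q, d))      # smaller distance, higher score
    if function_type == "l2Squared":
        return 1.0 / (1.0 + l2_squared(q, d))   # smaller distance, higher score
    if function_type == "cosineSimilarity":
        return 1.0 + cosine_similarity(q, d)    # shift [-1, 1] to [0, 2]
    raise ValueError(f"unknown function type: {function_type}")


query, doc = [1.0, 0.0], [0.0, 1.0]
for name in ("l1Norm", "l2Squared", "cosineSimilarity"):
    print(name, score(name, query, doc))
```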

Script score k-NN space type Dropdown list

Appears when you select Script score k-NN for the Search method.

Required. Choose the space type based on your data characteristics and search requirements. Available options include:
  • l1: Calculates the L1 (Manhattan) distance: the sum of the absolute differences between vector components, giving equal weight to all components.
  • l2: Calculates the L2 (Euclidean) distance: the straight-line distance between two points, derived from the sum of the squared component differences.
  • linf: Calculates the L-infinity (Chebyshev) distance: the largest absolute difference between corresponding vector components.
  • cosinesimil: Measures cosine similarity between vectors.
  • innerproduct: Computes the inner product between vectors. With the Lucene engine, this option is supported in OpenSearch version 2.13 and later.
  • hammingbit: Calculates the Hamming distance between binary vectors.

Learn more about Space type.

Default value: l1

Example: innerproduct
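
As a quick reference for how the space types differ, the Python sketch below computes each underlying measure on small sample vectors. It shows only the raw distance or similarity, not the score that OpenSearch derives from it. Representing binary vectors as Python integers for hammingbit is an assumption made to keep the example short.

```python
import math


def l1(a, b):
    """Manhattan distance: sum of absolute component differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2(a, b):
    """Euclidean distance: square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def linf(a, b):
    """Chebyshev distance: largest absolute component difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def cosinesimil(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def innerproduct(a, b):
    """Inner (dot) product; unlike cosine, it is sensitive to vector length."""
    return sum(x * y for x, y in zip(a, b))


def hammingbit(a, b):
    """Hamming distance between two bit patterns held as integers (assumption)."""
    return bin(a ^ b).count("1")


u, v = [1.0, 2.0, 3.0], [4.0, 0.0, 3.0]
print("l1", l1(u, v))
print("l2", l2(u, v))
print("linf", linf(u, v))
print("innerproduct", innerproduct(u, v))
print("cosinesimil", cosinesimil(u, v))
print("hammingbit", hammingbit(0b1011, 0b0010))
```

For the sample vectors, l1 adds up every component difference, while linf reports only the single largest one, so the two can rank the same documents differently.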

Snap execution Dropdown list
Select one of the three modes in which the Snap executes. Available options are:
  • Validate & Execute. Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during pipeline runtime.
  • Execute only. Performs full execution of the Snap during pipeline execution without generating preview data.
  • Disabled. Disables the Snap and all Snaps that are downstream from it.

Default value: Validate & Execute

Example: Execute only