Athena Select

Overview

You can use this Snap to retrieve data from Amazon Athena by executing SQL queries on the specified data stored in Amazon S3. The Snap supports executing SELECT statements and fetching results based on provided filters and conditions.


Athena Select Overview

Prerequisites

None.

Known issues

  • ARRAY, STRUCT, and MAP data types are not supported in SELECT queries. If used, it may result in incorrect values in the output documents.
  • This Snap can fail with a connection error if the account is configured with an IAM role, and, Role ARN.
  • This Snap uses the Athena JDBC driver to retrieve data. When using a Cross-account IAM role, the JDBC connection expires based on the role's duration. Large datasets requiring more time than this duration are not supported.
  • CSE_KMS encryption is not supported because of an issue in the Athena JDBC driver.

    Workaround: Use the Athena Query Snap with a SELECT query.

  • The second output view of the Snap cannot connect directly to the second input view of Athena Bulk Load Snap because Athena does not provide complete table metadata for all data types.
  • This Snap can fail on a table created with ORC files in LZO compression because of an Athena service defect.

Snap views

View Description Examples of upstream and downstream Snaps
Input This Snap has one document input view.
Output This Snap has two document output view.
Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab. The available options are:

  • Stop Pipeline Execution Stops the current pipeline execution when an error occurs.
  • Discard Error Data and Continue Ignores the error, discards that record, and continues with the remaining records.
  • Route Error Data to Error View Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.

Snap settings

Legend:
  • Expression icon (): JavaScript syntax to access SnapLogic Expressions to set field values dynamically (if enabled). If disabled, you can provide a static value. Learn more.
  • SnapGPT (): Generates SnapLogic Expressions based on natural language using SnapGPT. Learn more.
  • Suggestion icon (): Populates a list of values dynamically based on your Account configuration.
  • Upload : Uploads files. Learn more.
Learn more about the icons in the Snap settings dialog.
Field / field set Type Description
Label String

Required. Specify a unique name for the Snap. Modify this to be more appropriate, especially if more than one of the same Snaps is in the pipeline.

Default value: Athena Select

Example: Athena Select data
Database String/Expression/Suggestion

Required. Specify the name of Athena database, that is a logical grouping of Athena tables.

Default value: N/A

Example: mydatabase
Table String/Expression/Suggestion

Required. Specify the name of the table in the Athena database for the SELECT query to retrieve data from.

Default value: N/A

Example: Sales_db
Output fields Use this field set to define the column names in the SELECT query.
Tip: You can leave this field set empty to select all columns in the table.
Column String/Expression/Suggestion

Specify the name of the column to include in the query result.

Default value: N/A

Example: employeeid
Where clause String/Expression/Suggestion

Specify the SQL WHERE clause to filter the records returned by the query.

Default value: N/A

Example: salary > 50000
Group by

Use this field set to define the GROUP BY clause.

Column String/Expression/Suggestion Specify the column name for the GROUP BY clause.

Default value: N/A

Example: Column1
Order by Use this field set to define the ORDER BY clause.
Column String/Expression/Suggestion Specify the list of columns for ORDER BY clause.

Default value: N/A

Example: Column2
Limit offset Integer/Expression

Specify the number of rows to skip before starting to return rows in the query result.

Default value: N/A

Example: 1000
Limit rows Integer/Expression

Specify the maximum number of rows to return from the query result.

Default value: N/A

Example: 100

Additional connection parameters Use this field set to define the connection parameters in addition to the ones provided in Amazon Athena Account.
Parameter name String/Expression Specify the name of the parameter.

Default value: N/A

Example: Region

Value String/Expression Specify the value of the parameter.

Default value: N/A

Example: us-west-2

Client settings Use this field set to define the client settings.
Maximum retries Integer/Expression Required. Specify the maximum number of retry attempts.

Default value: 3

Example: 100

Fetch size Integer/Expression Required. Specify the number of rows to fetch per batch when retrieving the query result.
Tip: Higher value can improve performance but may use more memory.

Default value: 10000

Example: 50000

Snap execution Dropdown list
Select one of the three modes in which the Snap executes. Available options are:
  • Validate & Execute: Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during pipeline runtime.
  • Execute only: Performs full execution of the Snap during pipeline execution without generating preview data.
  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Default value: Validate & Execute

Default value: Execute