Create and query an external table in Athena

This example pipeline demonstrates how to create an external table in Athena for structured data stored in Amazon S3 and then query it.


Athena Example Pipeline

Download this pipeline.
  1. Configure the Athena Query Snap to create an external table in Athena that references structured JSON data stored in Amazon S3.
    CREATE EXTERNAL TABLE IF NOT EXISTS employees_info (
      EmployeeID INT,
      Name STRING,
      Age INT,
      Salary DOUBLE,
      IsManager BOOLEAN,
      JoinDate DATE,
      LastLogin TIMESTAMP,
      Skills ARRAY<STRING>,
      Address STRUCT<
        Street: STRING,
        City: STRING,
        ZipCode: INT
      >,
      PerformanceRatings ARRAY<INT>
    )
    ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
    LOCATION 's3://athena-bucket/employees/JSON/'
    TBLPROPERTIES ('has_encrypted_data'='false');

    On validation, the Snap successfully executes the query and displays a success message.

    Athena Query Snap configuration Athena Query Snap output

    Athena Query Snap configuration


    Athena Query Snap output

  2. Configure the Athena Select Snap to retrieve records from the newly created employees_info table.

    On validation, the Snap displays the successfully retrieved matching records.

    Athena Select Snap configuration Athena Select Snap output

    Athena Select Snap configuration


    Athena Select Snap output

  3. You can configure the Mapper Snap to extract only relevant data for further use.
    On validation, the Snap displays the output based on the specified mappings.
    Mapper Snap

To successfully reuse pipelines:
  1. Download and import the pipeline in to the SnapLogic Platform.
  2. Configure Snap accounts, as applicable.
  3. Provide pipeline parameters, as applicable.