Query Redshift Tables with WHERE Clause

Overview

This example demonstrates how to use the Redshift Select Snap to query data from Redshift tables with filtering conditions using the WHERE clause. The Snap supports both static conditions and dynamic parameter substitution.

Prerequisites

  • Amazon Redshift cluster with appropriate access
  • Valid Redshift account configured in SnapLogic
  • Target table with data to query

Pipeline Overview

The pipeline uses the Redshift Select Snap to fetch data from a database table by specifying the table name, WHERE clause conditions, and ordering preferences. The Snap produces records from the database which can then be processed by downstream Snaps.

Basic Configuration

Redshift Select Snap - Settings:

  • Schema name: public
  • Table name: account
  • Where clause: Filter condition (e.g., status = 'active')
  • Order by: Column names for sorting results
  • Limit rows: Maximum number of rows to return

Configuration Approaches

Without Expressions:

Directly specify filter conditions in the WHERE clause property. This approach is suitable for static queries where filter values don't change.

WHERE clause: name = 'Tom'
Order by: inst_dt

This configuration is equivalent to:

SELECT * FROM "public"."demo_guest" WHERE "name" = 'Tom' ORDER BY "inst_dt";

With Expressions - Upstream Parameters

Use a Mapper Snap to provide dynamic filter values from upstream data.

Configure the Select Snap to use the upstream values.

With Expressions - Pipeline Parameters

Use pipeline parameters for reusable queries across different executions.

Query Features

The Redshift Select Snap supports various SQL features:

  • WHERE clause: Filter records with conditions (supports JSON path substitution)
  • ORDER BY: Sort results by one or more columns
  • LIMIT: Restrict the number of rows returned
  • OFFSET: Skip a specified number of rows before returning results
  • Output fields: Select specific columns instead of all columns

JSON Path Substitution

The Snap supports JSON path substitution in WHERE clauses:

  • Use $fieldname to reference values from input documents
  • Values are automatically substituted when the query executes
  • Documents missing required values are routed to the error view

Example: WHERE email = $email

Note:

JSON path substitution is not supported after IS or is keywords. Use explicit values like IS NULL or IS NOT NULL instead.

Use Cases

This pattern is useful for:

  • Extracting filtered datasets from Redshift tables
  • Implementing parameterized queries with dynamic filters
  • Retrieving sorted and paginated results
  • Building data extraction pipelines with configurable criteria
  • Creating lookup operations with specific conditions

Best Practices

  • Use indexes on columns referenced in WHERE clauses for better performance
  • Specify output fields explicitly when you don't need all columns
  • Use LIMIT and OFFSET for pagination to control memory usage
  • Enable the error view to handle documents with missing filter values
  • Test WHERE clause expressions before deploying to production
  • Use pipeline parameters for reusable query templates