Query Redshift Tables with WHERE Clause

Overview

This example demonstrates how to use the Redshift Select Snap to query data from Redshift tables with filtering conditions using the WHERE clause. The Snap supports both static conditions and dynamic parameter substitution.

Prerequisites

Amazon Redshift cluster with appropriate access
Valid Redshift account configured in SnapLogic
Target table with data to query

Pipeline Overview

The pipeline uses the Redshift Select Snap to fetch data from a database table by specifying the table name, WHERE clause conditions, and ordering preferences. The Snap produces records from the database which can then be processed by downstream Snaps.

Basic Configuration

Redshift Select Snap - Settings:

Schema name: public
Table name: account
Where clause: Filter condition (e.g., status = 'active')
Order by: Column names for sorting results
Limit rows: Maximum number of rows to return

Configuration Approaches

Without Expressions:

Directly specify filter conditions in the WHERE clause property. This approach is suitable for static queries where filter values don't change.

WHERE clause: name = 'Tom'
Order by: inst_dt

This configuration is equivalent to:

SELECT * FROM "public"."demo_guest" WHERE "name" = 'Tom' ORDER BY "inst_dt";

With Expressions - Upstream Parameters

Use a Mapper Snap to provide dynamic filter values from upstream data.

Configure the Select Snap to use the upstream values.

With Expressions - Pipeline Parameters

Use pipeline parameters for reusable queries across different executions.

Query Features

The Redshift Select Snap supports various SQL features:

WHERE clause: Filter records with conditions (supports JSON path substitution)
ORDER BY: Sort results by one or more columns
LIMIT: Restrict the number of rows returned
OFFSET: Skip a specified number of rows before returning results
Output fields: Select specific columns instead of all columns

JSON Path Substitution

The Snap supports JSON path substitution in WHERE clauses:

Use $fieldname to reference values from input documents
Values are automatically substituted when the query executes
Documents missing required values are routed to the error view

Example: WHERE email = $email

Note:

JSON path substitution is not supported after IS or is keywords. Use explicit values like IS NULL or IS NOT NULL instead.

Use Cases

This pattern is useful for:

Extracting filtered datasets from Redshift tables
Implementing parameterized queries with dynamic filters
Retrieving sorted and paginated results
Building data extraction pipelines with configurable criteria
Creating lookup operations with specific conditions

Best Practices

Use indexes on columns referenced in WHERE clauses for better performance
Specify output fields explicitly when you don't need all columns
Use LIMIT and OFFSET for pagination to control memory usage
Enable the error view to handle documents with missing filter values
Test WHERE clause expressions before deploying to production
Use pipeline parameters for reusable query templates