Redshift - Update
Overview
This Snap executes a SQL update with the given properties. Keys from the document will be used as the columns to update and their values will be the updated value in the column.
ETL Transformations & Data Flow
This snap is a data source. It works by performing a standard JDBC DatabaseMetaData#getTables() query for the tables within the database. The Schema name value is used to populate the schemaPattern parameter in that query.
This snap does not require any temporary files or other external resources.
- This is a Write-type Snap.
Works in Ultra Tasks
Prerequisites
A valid Redshift Account with the required permissions.
Limitations
If you use the PostgreSQL driver (org.postgresql.Driver) with the Redshift Snap Pack, it could result in errors if the data type provided to the Snap does not match the data type in the Redshift table schema. Either use the Redshift driver (com.amazon.redshift.jdbc42.Driver) or use the correct data type in the input document to resolve these errors.
Snap views
| Type | Description | Examples of upstream and downstream Snaps |
|---|---|---|
| Input |
This Snap has at most one document input view. The input document contains map data with each document mapped to a single record by matching document field names with table column names. |
|
| Output |
This Snap has at most one document output view. |
|
| Learn more about Error handling. | ||
Examples
- Update Records with Conditional Expressions: Update records with conditional expressions and logic
Snap settings
| Field/Field set | Description | ||
|---|---|---|---|
|
Label
|
Required. Required. Specify a unique name for the Snap. Modify this to be more appropriate, especially if more than one of the same Snaps is in the pipeline. Default value: Redshift - Update Example: Redshift - Update |
||
|
Schema name
|
Specify the database schema name. Leave this field blank to use the default schema. Default value: N/A Example: public Warning: The values can be passed using the pipeline parameters but not the upstream parameter.
|
||
|
Table name
|
Required. Specify the name of the table to update. Default value: N/A Example: employees Warning: The values can be passed using the pipeline parameters but not the upstream parameter.
|
||
|
Update condition
|
Specify the SQL WHERE clause of the update statement. You can define specific values or columns to update (Set condition) in the upstream Snap, such as Mapper Snap, and then use the WHERE clause to apply these conditions on the columns sourced from the upstream Snap. For instance, here is a sample of an Update SQL query:
If the Update Condition field is left blank, the condition is applied on all the records of the target table. In certain scenarios where you want to use specific data from the upstream Snaps, and do not want to change that data in the Update Snap, then you need to place the data in the original structure of the input document. Hence, we recommend that you use the format for Update condition and input data format as shown below:
Important:
Refer to the example to understand how to use the Update Condition. Default value: N/A Example: Without using expressions
Using expressions
CAUTION: Using expressions that join strings together to create SQL queries or conditions has a potential SQL injection risk and is hence unsafe. Ensure that you understand all implications and risks involved before using concatenation of strings with '=' Expression enabled.
|
||
|
Number of retries
|
Specify the maximum number of retry attempts the Snap must make in case of network failure. Default value: 0 Example: 3 |
||
|
Retry interval (seconds)
|
Specify the time interval between two successive retry requests. A retry happens only when the previous attempt resulted in an exception. Default value: 1 Example: 10 |
||
|
Snap execution
|
Select one of the three modes in which the Snap executes. Available options are:
Default value: Execute only Example: Validate & Execute |
||
In a scenario where the Auto commit on the account is set to true, and the downstream Snap does depends on the data processed on an Upstream Database Bulk Load Snap, use the Script Snap to add delay for the data to be available.
For example, when performing a create, insert and a delete function sequentially on a pipeline, using a Script Snap helps in creating a delay between the insert and delete function or otherwise it may turn out that the delete function is triggered even before inserting the records on the table.
Troubleshooting
| Error | Reason | Resolution |
|---|---|---|
| type "e" does not exist | This issue occurs due to incompatibilities with the recent upgrade in the Postgres JDBC drivers. | Download the latest 4.1 Amazon Redshift driver here and use this driver in your Redshift Account configuration and retry running the Pipeline. |