S3 File Reader
Overview
The S3 File Reader Snap reads data from an S3 bucket. When you provide a Version ID, the Snap reads a specific version of an S3 file object.
We plan to introduce additional S3 features exclusively in the Amazon S3 Snaps; Binary Snaps with S3 support will not receive these updates. We therefore recommend that you use the Amazon S3 Snap Pack for all S3 operations in your pipelines. Binary Snaps are retained as is to maintain backward compatibility, but we will no longer provide S3 support for them. Learn more: Migrate from Binary to S3 Snaps.

Read-type Snap
Works in Ultra Tasks
Prerequisites
- IAM Roles for Amazon EC2.
- The IAM_CREDENTIAL_FOR_S3 feature lets you access S3 files from Groundplex nodes hosted in the EC2 environment. No Access-key ID or Secret key from the AWS S3 account is needed.
- The IAM credential stored in the EC2 metadata provides access rights to the S3 buckets.
- The IAM role is supported only on Groundplex nodes hosted in the EC2 environment.
- The IAM Role stored on the EC2 instance requires List, Read, and Write permissions.
- S3 account validation is not supported when you enable the IAM role property.
Learn more about IAM Roles for Amazon EC2.
- Open Manager.
- Open the Snaplexes tab of the project that contains the EC2-based Groundplex.
- Click the Groundplex to open its Properties.
- Open the Node Properties tab.
- Click + to add a new row in the Global properties section.
- Enter jvm_options in Key and -DIAM_CREDENTIAL_FOR_S3=TRUE in Value.
- Restart the JCC (node).
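The prerequisites above state that the IAM role needs List, Read, and Write permissions. As a rough pre-check, the sketch below scans an IAM policy document for Allow statements covering those three kinds of access. The mapping to `s3:ListBucket`, `s3:GetObject`, and `s3:PutObject` is an assumption made for illustration, not something this page specifies, and `missing_actions` is a hypothetical helper. It ignores Deny statements, resources, and conditions, so it is not a substitute for real policy evaluation.

```python
import fnmatch

# Assumed mapping of "List, Read, Write" to S3 actions (not taken from the Snap documentation).
REQUIRED_ACTIONS = ("s3:ListBucket", "s3:GetObject", "s3:PutObject")


def missing_actions(policy, required=REQUIRED_ACTIONS):
    """Return the required actions that no Allow statement in the policy covers."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as one object
        statements = [statements]
    allowed = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        allowed.extend(actions)
    # IAM action names are case-insensitive and may contain wildcards such as "s3:*".
    return [
        action
        for action in required
        if not any(fnmatch.fnmatchcase(action.lower(), p.lower()) for p in allowed)
    ]


example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": "arn:aws:s3:::mybucket"},
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::mybucket/*"},
    ],
}
print(missing_actions(example_policy))  # ['s3:PutObject']
```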
Limitations
The current Snap functionality supports AWS S3 Cloud Service and applies to the AWSGovCloud setup.
Snap views
| View | Description | Examples of upstream and downstream Snaps |
|---|---|---|
| Input | An upstream Snap is optional. Any Snap with a document output view can be connected upstream (such as Mapper or File Writer). Input documents supply the key-value pairs used to evaluate expression properties in the S3 File Reader Snap. Each input document, if any, triggers one read operation. | |
| Output | Any Snap with a binary input view can be connected downstream, such as CSV Parser, JSON Parser, XML Parser, and so on. Binary data read from AWS S3 specified in the File property with header information about the binary stream. The binary data and header information can be previewed at the output of the Snap. Example of binary data and header information: {"content-length": "96258", "last-modified": {"_snaptype_datetime": "2014-06-26T23:27:01.000 UTC"}, "content-disposition": "attachment; filename=\"leads.csv\"", "content-location": "s3:///mr_test/leads.csv", "content-type": "text/csv", "etag": "730145bec198288e9f428193fde851b7"} | |
| Error | Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the options in the When errors occur list under the Views tab. Learn more about Error handling in Pipelines. | |
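A downstream script often needs the file name and bucket from the header shown in the Output view. The following is a minimal sketch that uses only the Python standard library and the sample header values from the table above. The helper names `attachment_filename` and `split_s3_location` are hypothetical, and the location parser assumes the simple `s3:///bucket/key` form (no region or authority in the URL).

```python
from email.message import Message

# Sample header values copied from the Output view example.
header = {
    "content-length": "96258",
    "content-disposition": 'attachment; filename="leads.csv"',
    "content-location": "s3:///mr_test/leads.csv",
    "content-type": "text/csv",
    "etag": "730145bec198288e9f428193fde851b7",
}


def attachment_filename(content_disposition):
    """Extract the filename parameter from a Content-Disposition value."""
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    return msg.get_filename()


def split_s3_location(location):
    """Split an 's3:///bucket/key' URL into (bucket, key)."""
    prefix = "s3:///"
    if not location.startswith(prefix):
        raise ValueError("expected a URL starting with 's3:///'")
    bucket, _, key = location[len(prefix):].partition("/")
    return bucket, key


print(attachment_filename(header["content-disposition"]))  # leads.csv
print(split_s3_location(header["content-location"]))       # ('mr_test', 'leads.csv')
print(int(header["content-length"]))                       # 96258
```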
Snap settings
- Expression icon: Allows using pipeline parameters to set field values dynamically (if enabled). SnapLogic Expressions are not supported. If disabled, you can provide a static value.
- SnapGPT: Generates SnapLogic Expressions based on natural language using SnapGPT.
- Suggestion icon: Populates a list of values dynamically based on your Snap configuration. You can select only one attribute at a time using the icon. Type into the field if it supports a comma-separated list of values.
- Upload: Uploads files.
| Field / Field set | Type | Description |
|---|---|---|
| Label | String | Required. Specify a unique name for the Snap. Modify this to be more appropriate, especially if more than one of the same Snaps is in the pipeline. Default value: S3 File Reader Example: S3 File Reader |
| File | String/Expression/Suggestion | Required. Specify the URL of the S3 file from which the binary data is to be read. The URL must start with "s3:///". The suggest feature lists buckets, subdirectories, and files. Bucket names are suggested if the property is empty or "s3:///". Once a bucket is selected, the feature lists the subdirectories and files immediately below the bucket. Names of subdirectories end with a forward slash ("/"). The suggest feature is not supported if the properties in the S3 Dynamic account are parameters. This Snap supports S3 Virtual Private Cloud (VPC) endpoints. Warning (Prerequisite): The provided account must have 'read' access to the specified S3 bucket to read the file successfully. Note (Using Expressions): Click the Expression Enabler to enable expressions. For example, if the File property is "s3:///mybucket/out_" + Date.now() + ".csv", the evaluated filename is s3:///mybucket/out_2013-11-13T00:22:31.880Z.csv. Warning (Region Name): The region name is optional only for certain regions. For region names and their details, see AWS Regions and Endpoints. Warning (Lint Warning): The Snap displays a Lint Warning in your Pipeline for certain file paths. We therefore recommend that you conform to one of the acceptable relative paths (refer to Acceptable File Paths). Otherwise, use an absolute path, that is, provide a file path that belongs to the same Org where you want to write the file, or click the File Upload icon. Default value: s3:/// |
| Suggest fully-qualified file names | Checkbox | When selected, includes the region and authority of the S3 bucket in the file paths shown in the suggestion list. Default status: Deselected |
| Version ID | String/Expression/Suggestion | Enter or select the S3 file version ID. If the property is empty, the Snap reads the latest version. The suggest feature lists the version IDs for the S3 file in the File property. The suggest feature is not supported if the properties in the S3 Dynamic account are parameterized. Each line in the suggested list also includes the last modified date and the file size to help you select a version. When you enter the property value manually, only the version ID is required. The Snap ignores the last modified date and size information of a version when it reads the file. If versioning is not enabled for the S3 bucket, no version ID is suggested. Versions whose files cannot be downloaded are omitted from the suggested list. Default value: None. Example: xvcnB8gPi37l3hbOzlsRFxjVwQ.numQz |
| Version ID suggestion interval | | Use this field set to read a specific version of an S3 file object. Enter the time interval for the Version ID suggestion. Enter two rows to provide a start date and an end date. If only one row is provided, the interval runs from that date until now. If left empty, all version IDs are suggested. This property may be useful when a specific S3 file has many versions. This property is used for the Version ID suggestion only, and not during Snap preview or execution. |
| Year | Integer | Enter the year as a 4-digit integer. Default value: None. Example: 2017 |
| Month | Integer | Enter the month as an integer. Default value: None. Example: 9, 09, 12 |
| Date | Integer | Enter the day of the month. Default value: None. Example: 28, 09, 12 |
| Zone | String/Suggestion | Enter or select a time zone ID from the suggested list. May be empty for UTC. Only zone IDs in the suggested list are supported. Default value: None. Example: US/Pacific |
| Enable staging | Checkbox | If selected, the Snap downloads the source file into a local temporary file. When the download is completed, it streams the data from the temporary file to the output view. This property prevents the Snap from being blocked by a slow downstream pipeline. The local disk must have free space at least as large as the expected file size. Warning: Some Snaps may take a long time to process large amounts of data. This, in turn, could lead to connection timeouts, causing the pipeline to fail. Selecting this property saves the data on your local disk, enabling you to avoid such timeouts. Default status: Deselected |
| Number of retries | Integer/Expression | Specify the maximum number of retry attempts that the Snap must make in case there is a network failure and the Snap is unable to read the target file. If the value is larger than 0, the Snap overrides the Enable staging value to true and downloads the S3 file to a temporary local file. If any error occurs during the download, the Snap waits for the time specified in the Retry interval and attempts to download the file again from the beginning. When the download is successful, the Snap starts to stream the data from the temporary file to the downstream Pipeline. All temporary local files are deleted when they are no longer needed. Note: Ensure that the local drive has sufficient free disk space to store the temporary local file. Minimum value: 1 Default value: 0 Example: 3 |
| Get Object Tags | Checkbox | Select this checkbox to include object tags in the header of the output binary data. See Object Tagging for more information on object tags. You must have the required permission to read object tags. Default status: Deselected |
| Snap execution | Dropdown list | Choose one of the three modes in which the Snap executes. Default value: Execute only Example: Validate & Execute |
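The interaction between Enable staging and Number of retries described in the table can be sketched as follows. This is an illustration of the described behavior under stated assumptions, not the Snap's implementation: `read_with_retries` and `open_source` are hypothetical names, network failures are modeled as `OSError`, and the source is any callable that returns a readable binary stream.

```python
import io
import shutil
import tempfile
import time


def read_with_retries(open_source, sink, retries=0, retry_interval=1.0, enable_staging=False):
    """Copy a binary source to sink and return the number of attempts used."""
    if retries <= 0 and not enable_staging:
        # No staging: stream directly, so a slow sink also slows the download.
        with open_source() as source:
            shutil.copyfileobj(source, sink)
        return 1

    # A retry count above 0 implies staging, as the table describes.
    attempts = max(retries, 0) + 1
    for attempt in range(1, attempts + 1):
        # A fresh temporary file per attempt: each retry restarts from the beginning.
        with tempfile.TemporaryFile() as staged:
            try:
                with open_source() as source:
                    shutil.copyfileobj(source, staged)
            except OSError:
                if attempt == attempts:
                    raise  # retries exhausted
                time.sleep(retry_interval)
                continue
            # Download finished: stream the staged copy downstream.
            staged.seek(0)
            shutil.copyfileobj(staged, sink)
            return attempt


# Simulated source that fails twice before succeeding.
calls = {"count": 0}


def flaky_source():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return io.BytesIO(b"id,name\n1,Ada\n")


out = io.BytesIO()
attempts_used = read_with_retries(flaky_source, out, retries=3, retry_interval=0.01)
print(attempts_used, out.getvalue())  # 3 b'id,name\n1,Ada\n'
```

Because the temporary file is opened in a `with` block, it is removed as soon as the attempt ends, which mirrors the statement that temporary local files are deleted when no longer needed.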
Acceptable File Paths
- Relative paths
  - filename.json: Saves the file in the project.
  - ../shared/filename.json: Saves the file in the Project Shared Space.
  - ../../shared/filename.json: Saves the file in the Org Shared project.
- Absolute path
  - /<org>/<projectSpace>/<project>/filename.json
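To see how the relative forms above map onto the absolute form, the sketch below resolves each one against a project path using POSIX path rules. The org, project space, and project names are placeholders, `resolve_path` is a hypothetical helper, and resolving relative to the current project is an assumption inferred from the descriptions in the list.

```python
import posixpath


def resolve_path(project_path, file_path):
    """Resolve a file path against /<org>/<projectSpace>/<project>."""
    if file_path.startswith("/"):
        return posixpath.normpath(file_path)  # already absolute
    return posixpath.normpath(posixpath.join(project_path, file_path))


project = "/myorg/myspace/myproject"  # placeholder names
print(resolve_path(project, "filename.json"))               # /myorg/myspace/myproject/filename.json
print(resolve_path(project, "../shared/filename.json"))     # /myorg/myspace/shared/filename.json
print(resolve_path(project, "../../shared/filename.json"))  # /myorg/shared/filename.json
```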
Optional Configuration
This Snap uses account references created on the Accounts page of SnapLogic Manager to handle access to this endpoint. See Configuring Binary accounts for information on setting up accounts that work with this Snap.
- AWS S3 - Access-key ID, Secret key, Security token.
- S3 Dynamic - Access-key ID, Secret key, Security token, Server-side encryption.