Load from files

SnapLogic® AutoSync can load CSV and JSON files:

  • The CSV file endpoint loads files from a network file system.
  • The Amazon S3 endpoint loads CSV or JSON files from an Amazon S3 bucket.

The two endpoints have different features and constraints:

  • The CSV file endpoint allows you to select up to 100 files, each of which must be 100 MB in size or less. The data pipeline runs when you save it. You can append data by changing the load type and uploading a file with the same name.
  • The Amazon S3 endpoint allows you to select either CSV or JSON files. Each data pipeline can load multiple files of the same format. AutoSync does not load the files until you run the integration. When the data pipeline runs, AutoSync performs a full load. You can schedule synchronization, but be aware that a second load of a file with the same name overwrites the data in that table.

This page describes how AutoSync loads data from files. Refer to How AutoSync loads data for information on other sources.

CSV file endpoint

On the first run, AutoSync creates a table for each CSV file. Columns are set to the destination's default string type. The default load type is Full load

Change the load type to Append to support these use cases:

  • A file exceeds the 100 MB size limit and you want all data in the same destination table. Split the file into multiple files with the same name and upload them one at a time.
  • To upload data from multiple files into the same destination table, use the same file name and upload them one at a time.

S3 endpoint

For each source file, AutoSync:

  • Creates a table in the destination with the same name as the file.
  • Sets the destination column types:
    • For JSON files, by inferring a basic set of data types from source values.
    • For CSV files, by using the destination's default string type for all columns.
  • Performs a Full load. If a table with the same name as the file exists in the destination, AutoSync removes table data and loads values from the file referenced by the data pipeline. AutoSync only supports Full load for files loaded from S3.

Full load for file sources

Full load supports two use cases. One is to upload new data without overwriting and the other is to overwrite stale data:

  • To upload new data without overwriting existing data:

    Use a unique file name each time. For example, a tool might generate a weekly report named weekly-report.csv. To avoid having a data pipeline overwrite last week's report, append a date to the name before uploading the file. Each report loads in a new table. In contrast, if you use the Append load type and do not change the file name, subsequent runs load to the original table.

  • To overwrite existing data periodically:

    Use the same file name as the previously uploaded file. The new file must have the same structure. For example, a CSV file should have the same header and a JSON file should have the same fields and value types.