CSV file as a source
You can use comma-delimited CSV files as a source for a data pipeline. AutoSync uploads the files to temporary storage, loads them to the destination, and deletes them from temporary storage. Each source file loads into a target table of the same name.
STRING
NUMBER
BOOLEAN
TIMESTAMP
AutoSync supports the Full load and Append load types for CSV files. The first run of the data pipeline uses Full load. You can change the load type to Append by editing the data pipeline and selecting the Auto synchronize tab. Learn more about how AutoSync loads files and load types.
You cannot schedule synchronization for a data pipeline with CSV files as a source. To use the data pipeline again, select more files. The Upload CSV file action is available from the card, the three-dot menu, the list view, and the details panel.
Requirements
File names cannot include single or double quotes or a special character, such as #, $, or %. Files must not be empty. To load correctly, each file must:
- Have a header row.
- Be uncompressed.
- Be less than 100 MB in size.
- Use single commas as delimiters.
- Use quotes to enclose meaningful spaces and commas that are part of a field value. AutoSync trims leading and trailing spaces.
- Use quotes to enclose fields that include commas in the value.
For example, in a source that lists vehicles where the header specifies the
year
,make
,model
, anddescription
columns, the quotes preserve the comma in the description column:1997, Ford, E350, "Extended cab, luxury truck"
Transformations
A data pipeline using the CSV file endpoint doesn't save the file names after the data loads to the target. However, AutoSync does save any transformation configuration. You can use the same data pipeline to upload one or more files with the same name, and AutoSync applies any previously configured transformations. This is true whether you use the Upload CSV file link on the data pipeline card, or edit it — when you select files with the same name.
For example, you might create a data pipeline named CSV-employees
to
upload data from an employees.csv file to Snowflake. This file lists phone numbers that shouldn't be
exposed in the Snowflake table. When creating the data pipeline, remove the Phone
column in the Filter columns dialog to prevent AutoSync from loading it. Later, the HR
department hires more people and adds them to the employees.csv file.
- From the
CSV-employees
data pipeline card:- Click Upload CSV file.
- Select the updated employees.csv file.
- Click Save to upload the data without the phone numbers.
- Double-click the
CSV-employees
data pipeline to edit it:- In the Endpoints tab, select the updated employees.csv file.
- Click the Transformations tab and AutoSync prompts you to refresh the preview.
- Click the Refresh link in the message.
- The Removed columns list still contains the
Phone
column. - Click Save to load the updated file without the phone numbers.