Extract
Overview
You can use this Snap to extract text, tables, and figures from a binary PDF file.

Parse-type Snap
Works in Ultra Tasks
Prerequisites
Valid Adobe Account.
Limitations and known issues
None.
Snap views
View | Description | Examples of upstream and downstream Snaps |
---|---|---|
Input | This Snap has at most one binary input view. It requires a PDF file in binary format. | File Reader |
Output | This Snap has at most one document output view. It returns the content (structure) as a document and the extracted files (CSV, PNG) as base64-encoded strings. | |
Error | Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the options from the When errors occur list under the Views tab. Learn more about Error handling in Pipelines. | |
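Because the output view returns the extracted files as base64-encoded strings, a downstream consumer has to decode them before use. The following is a minimal Python sketch of that step. The document layout is an assumption made for illustration only: the `files` key and the `tables/table_0.csv` name are hypothetical and are not confirmed by this page.

```python
import base64


def decode_extracted_files(document: dict) -> dict:
    """Decode each base64 string under the (hypothetical) 'files' key into raw bytes."""
    files = document.get("files", {})
    return {name: base64.b64decode(payload) for name, payload in files.items()}


# Hypothetical output document: the key names are assumptions, not the Snap's real schema.
sample = {
    "content": {"elements": []},
    "files": {
        "tables/table_0.csv": base64.b64encode(b"a,b\n1,2\n").decode("ascii"),
    },
}

decoded = decode_extracted_files(sample)
csv_text = decoded["tables/table_0.csv"].decode("utf-8")
```

The decoded bytes can then be written to disk or parsed directly, for example with the `csv` module for extracted tables.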
Snap settings
- Expression icon: JavaScript syntax to access SnapLogic Expressions to set field values dynamically (if enabled). If disabled, you can provide a static value. Learn more.
- SnapGPT: Generates SnapLogic Expressions based on natural language using SnapGPT. Learn more.
- Suggestion icon: Populates a list of values dynamically based on your Account configuration.
- Upload: Uploads files. Learn more.
Field / field set | Type | Description |
---|---|---|
Label | String | Required. Specify a unique name for the Snap. Modify this to be more appropriate, especially if the pipeline contains more than one instance of the same Snap. Default value: Extract Example: Extract |
Extract Options | Extract options enable you to configure the elements to extract from the PDF file. | |
Text | Checkbox/Expression | Select this checkbox to extract the text element. Default status: Selected |
Full table | Checkbox/Expression | Select this checkbox to extract the table elements. Default status: Deselected |
Figure | Checkbox/Expression | Select this checkbox to extract PNG files from the PDF. Default status: Deselected |
Advanced Extract Options | Configure the additional elements to extract from the PDF file. | |
Add character info | Checkbox/Expression | Select this checkbox to add character-level bounding boxes to the output. Default status: Deselected |
Get styling info | Checkbox/Expression | Select this checkbox to add styling information to the output. Default status: Deselected |
Snap execution | Dropdown list | Select one of the three modes in which the Snap executes. Default value: Validate & Execute Example: Execute only |
Troubleshooting
Error | Reason | Resolution |
---|---|---|
Failed to process the JSON output. | Invalid configuration. | Verify the settings and try again. |
Failed to process request. | The usage limit has been reached. | Address the usage limit and try again. |
An error occurred while attempting to connect to Adobe Services. | Either the Client ID or Client secret is incorrect. | Verify the account settings are valid and try again. |