BigQuery Table Create

Overview

You can use this Snap to create Google BigQuery tables that support clustering and partitioning.



Prerequisites

A valid Google BigQuery account with the required permissions.

Limitations and Known Issues

  • The rounding_mode option for BIGNUMERIC and NUMERIC data type fields is not supported. You can configure rounding for these fields on the Google Cloud console.
  • This Snap considers only the following JSON keys in the schema during table creation (any other keys given are ignored):
    • name
    • type
    • mode
    • description
    • fields

In the following example, the Snap creates the table with the name, type, and mode keys but ignores defaultValueExpression. The Snap does not display an error for unsupported keys.

[{
  "name": "sales",
  "type": "FLOAT",
  "mode": "NULLABLE",
  "defaultValueExpression": "2.55"
}]
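The key filtering described above can be pictured with a short Python sketch. This is an illustration only, not the Snap's actual implementation: the helper name keep_supported_keys is hypothetical, and the sketch assumes that nested fields entries are filtered by the same rule.

```python
# Keys the Snap considers during table creation (from the list above).
SUPPORTED_KEYS = {"name", "type", "mode", "description", "fields"}


def keep_supported_keys(schema):
    """Return a copy of the schema with unsupported keys dropped.

    Hypothetical helper: nested "fields" lists are assumed to be
    filtered recursively with the same rule.
    """
    cleaned = []
    for field in schema:
        kept = {k: v for k, v in field.items() if k in SUPPORTED_KEYS}
        if isinstance(kept.get("fields"), list):
            kept["fields"] = keep_supported_keys(kept["fields"])
        cleaned.append(kept)
    return cleaned


schema = [{
    "name": "sales",
    "type": "FLOAT",
    "mode": "NULLABLE",
    "defaultValueExpression": "2.55",
}]

effective = keep_supported_keys(schema)
print(effective)  # [{'name': 'sales', 'type': 'FLOAT', 'mode': 'NULLABLE'}]
```

Running a schema through a check like this before pasting it into the Snap can help you spot keys that would be silently dropped.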

Supported Accounts

This Snap works with the following account types. For more information, see Configuring Google BigQuery Accounts.

Snap Views

Input view
  • Format: Document
  • Number of views: Min: 0, Max: 1
  • Examples of upstream Snaps: Mapper, JSON Generator
  • Description: The Project ID and the Document ID.

Output view
  • Format: Document
  • Number of views: Min: 1, Max: 1
  • Examples of downstream Snaps: JSON Parser, File Writer
  • Description: The list of Table IDs along with their Project IDs, Dataset IDs, and Table type.

Learn more about Error handling.

Snap Settings

Note: Learn about the common controls in the Snap settings dialog.
Field/Field set Description
Label

String

Specify a unique name for the Snap. Modify this to be more specific, especially if you have more than one Snap of the same type in your pipeline.

Default value: BigQuery Table Create

Example: Collegenames Table Create

Project ID

String/Expression

Specify the project ID in which the dataset resides.

Default value: N/A

Example: test-project-12345

Dataset ID

String/Expression

Specify the dataset ID of the destination.

Default value: N/A

Example: dataset-12345

Table ID

String/Expression

Specify the table ID of the table you are creating. This is a unique ID that you must provide (it is not automatically created or assigned).

Learn more about creating valid BigQuery table names.

Default value: N/A

Example: table-12345

Table Schema (JSON)

String

Enter the JSON schema for the table.

Detailed Information

Checkbox

Select this checkbox to have the Snap retrieve additional fields and display them in the output.

Default value: Deselected

Default table expiration (in days)

String/Expression

Specify the number of days after which new tables created in this dataset are automatically deleted.

Appears when you select the Detailed Information checkbox.

Partitioning

Use this field set to define partitioning requirements.

Enable partitioning

Checkbox

Select to configure partitioning.

Default value: Deselected

Appears when you select the Partitioning dropdown.

Require partitioning filter

Checkbox

Select to require users to include a WHERE clause that specifies the partitions to query each time they query the table.

Select the partitioning type (time or range) and the partitioning time (time interval after which a new partition is created).

Default value: Deselected

Appears when you select the Enable partitioning checkbox.

Clustering

Use this field set to define clustering requirements.

Enable clustering

Checkbox

Select to configure clustering.

Default value: Deselected

Appears when you select the Clustering dropdown.
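To show how partitioning and clustering settings relate to a table definition, the following Python sketch builds a request body in the shape of the BigQuery REST API table resource (tableReference, schema, timePartitioning, requirePartitionFilter, clustering). This document does not state how the Snap maps its settings to the API, so treat the mapping as an assumption. The helper build_table_body and the column names order_date and region are hypothetical.

```python
import json


def build_table_body(project_id, dataset_id, table_id, schema_fields,
                     partition_field=None, partition_type="DAY",
                     require_partition_filter=False, clustering_fields=None):
    """Build a BigQuery table resource as a plain dict (hypothetical helper)."""
    body = {
        "tableReference": {
            "projectId": project_id,
            "datasetId": dataset_id,
            "tableId": table_id,
        },
        "schema": {"fields": schema_fields},
    }
    if partition_field is not None:
        # Time-based partitioning on a DATE or TIMESTAMP column.
        body["timePartitioning"] = {"type": partition_type, "field": partition_field}
        body["requirePartitionFilter"] = require_partition_filter
    if clustering_fields:
        body["clustering"] = {"fields": list(clustering_fields)}
    return body


# Column names below are illustrative assumptions, not taken from the Snap.
fields = [
    {"name": "order_date", "type": "DATE", "mode": "REQUIRED"},
    {"name": "region", "type": "STRING", "mode": "NULLABLE"},
    {"name": "sales", "type": "FLOAT", "mode": "NULLABLE"},
]

body = build_table_body(
    "test-project-12345", "dataset-12345", "table-12345", fields,
    partition_field="order_date",
    require_partition_filter=True,
    clustering_fields=["region"],
)
print(json.dumps(body, indent=2))
```

With requirePartitionFilter set to true, queries against the table must include a WHERE clause on the partitioning column, which matches the Require partitioning filter setting described above.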

Snap execution

Dropdown list
Choose one of the three modes in which the Snap executes. Available options are:
  • Validate & Execute. Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during pipeline runtime.
  • Execute only. Performs full execution of the Snap during pipeline execution without generating preview data.
  • Disabled. Disables the Snap and all Snaps that are downstream from it.

Default value: Validate & Execute

Example: Disabled

Implicit retries in BigQuery Snaps

The BigQuery Snaps retry all retriable BigQuery errors (BigQuery exception, IO exception, and Runtime exception) internally. The retry behavior depends on the error:

  • 429 (Too Many Requests):
    • Retry attempts: Maximum of 5 retries.
    • Delay Between Retries: Backoff strategy with jitter (random variation) is applied to prevent synchronized retries and reduce load.
  • 401 (Unauthorized):
    • Retry attempts: Maximum of 3 retries.
    • Delay Between Retries: Backoff strategy is applied.
    • Additional Actions: Reloads the BigQuery account on the retry event.
  • IOException and 500, 502, 503, 504 (Server Errors):
    • Retry attempts: Maximum of 3 retries.
    • Delay Between Retries: Backoff strategy is applied.
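The retry rules above can be sketched in Python as follows. This is a minimal illustration, not the Snaps' internal code: the exact backoff formula is not documented here, so the exponential delay, the base and cap values, and the names HttpError, backoff_delay, and call_with_retries are all assumptions.

```python
import random
import time

# Retry limits taken from the list above; jitter applies only to 429.
SERVER_POLICY = {"max_retries": 3, "jitter": False}
RETRY_POLICY = {
    429: {"max_retries": 5, "jitter": True},
    401: {"max_retries": 3, "jitter": False},
    500: SERVER_POLICY, 502: SERVER_POLICY,
    503: SERVER_POLICY, 504: SERVER_POLICY,
}


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def backoff_delay(attempt, base=1.0, cap=32.0, jitter=False):
    """Assumed exponential backoff; with jitter, pick a random delay up to it."""
    delay = min(cap, base * 2 ** attempt)
    return random.uniform(0, delay) if jitter else delay


def call_with_retries(operation, reload_account=lambda: None, sleep=time.sleep):
    """Run operation, retrying according to RETRY_POLICY."""
    retries = 0
    while True:
        try:
            return operation()
        except (HttpError, OSError) as err:
            status = getattr(err, "status", None)
            # I/O errors (no status) follow the server-error policy.
            policy = SERVER_POLICY if status is None else RETRY_POLICY.get(status)
            if policy is None or retries >= policy["max_retries"]:
                raise
            if status == 401:
                reload_account()  # refresh credentials before retrying
            sleep(backoff_delay(retries, jitter=policy["jitter"]))
            retries += 1


# Usage: an operation that is throttled twice, then succeeds.
attempts = {"count": 0}


def throttled_twice():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise HttpError(429)
    return "created"


recorded_delays = []
result = call_with_retries(throttled_twice, sleep=recorded_delays.append)
print(result, attempts["count"], len(recorded_delays))  # created 3 2
```

The sleep function is passed in as a parameter so the usage example records the delays instead of actually waiting.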