Databricks Snap Pack

Overview

Databricks is a data analytics platform optimized for the Microsoft Azure cloud platform. It can also run on Amazon Web Services (AWS) and Google Cloud Platform. Databricks offers three environments for developing data-intensive applications: Databricks SQL, Databricks Data Science & Engineering, and Databricks Machine Learning. Learn more: Databricks Documentation.

This Snap Pack focuses on the Databricks Data Science & Engineering environment, which is also referred to as the Databricks Lakehouse Platform (DLP) or simply Databricks. The Databricks Snap Pack contains the following Snaps:

Databricks - Select : Retrieves information from the target Databricks table.
Databricks - Insert : Inserts new rows of data into the target Databricks table.
Databricks - Delete : Deletes data from a target Databricks table.
Databricks - Bulk Load : Loads millions of rows of data into the target Databricks table through a single load operation.
Databricks - Unload : Unloads data from a target Databricks table through a single unload operation.
Databricks - Merge Into : Updates millions of existing rows and inserts new rows in a target Databricks table through a single operation.
Databricks - Run Job : Automates the execution of a set of tasks or processes within a Databricks workspace.
Databricks - Execute : Runs multiple SQL statements on the target Databricks instance.

Limitations

With the basic authentication type for Databricks Lakehouse Platform (DLP) reaching its end of life on July 10, 2024, SnapLogic Databricks pipelines that use this authentication type to connect to DLP instances stop working after that date. We recommend that you reconfigure the corresponding Snap accounts to use the personal access token (PAT) authentication type.
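
For orientation, the following minimal sketch shows what PAT authentication looks like at the JDBC URL level. It assumes the connection properties described in the Databricks JDBC driver documentation (AuthMech=3 with the literal user name "token"). The function name, host name, HTTP path, and token value are hypothetical placeholders, and in practice these values are typically supplied through the Snap account settings rather than assembled by hand.

```python
def build_pat_jdbc_url(host: str, http_path: str, token: str, port: int = 443) -> str:
    """Return a Databricks JDBC URL that authenticates with a personal access token.

    Illustrative helper only; it is not part of SnapLogic or the Databricks driver.
    """
    properties = {
        "httpPath": http_path,
        "AuthMech": "3",   # user name / password mechanism
        "UID": "token",    # the literal string "token", not an account name
        "PWD": token,      # the personal access token itself
    }
    # JDBC connection properties are appended as semicolon-separated key=value pairs.
    prop_string = ";".join(f"{key}={value}" for key, value in properties.items())
    return f"jdbc:databricks://{host}:{port};{prop_string}"


# Placeholder values; substitute the details of your own workspace.
url = build_pat_jdbc_url(
    host="example.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/abc123",
    token="dapiEXAMPLE",
)
```

Treat the token like a password: keep it out of pipeline parameters and logs, and store it only in the account configuration.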

Prerequisites

Before using these Snaps, create and configure an application corresponding to your Databricks instance in AWS, the Azure portal, or Google Cloud Platform. The accounts for all Snaps in this Snap Pack require information about this application for authentication.

Supported versions

This Snap Pack is tested against the databricks-jdbc-2.6.40 driver version.
JDBC driver versions 2.6.36 and later support Machine-to-Machine (M2M) authentication for a Databricks service principal.

Note: We recommend that you use the default JAR file version (databricks-jdbc-2.6.40) in your pipelines. However, you may choose to use a custom JAR file version.
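
If you do use a custom JAR file, it can help to confirm that its version meets the 2.6.36 minimum before relying on M2M authentication. The sketch below is one way to do that, under two assumptions: the JAR file name follows the databricks-jdbc-X.Y.Z pattern shown above, and the M2M URL uses the OAuth properties described in the Databricks JDBC driver documentation (AuthMech=11, Auth_Flow=1, OAuth2ClientId, OAuth2Secret). All function names and sample values are hypothetical.

```python
import re

# Minimum driver version with M2M support, as stated in "Supported versions".
M2M_MIN_VERSION = (2, 6, 36)


def driver_version(jar_name: str) -> tuple:
    """Extract (major, minor, patch) from a name such as databricks-jdbc-2.6.40.jar."""
    match = re.search(r"databricks-jdbc-(\d+)\.(\d+)\.(\d+)", jar_name)
    if match is None:
        raise ValueError(f"Cannot read a driver version from {jar_name!r}")
    return tuple(int(part) for part in match.groups())


def supports_m2m(jar_name: str) -> bool:
    """Compare versions numerically, so that 2.6.4 correctly sorts before 2.6.36."""
    return driver_version(jar_name) >= M2M_MIN_VERSION


def build_m2m_jdbc_url(host: str, http_path: str, client_id: str,
                       client_secret: str, port: int = 443) -> str:
    """Return a JDBC URL for OAuth M2M authentication of a service principal."""
    properties = {
        "httpPath": http_path,
        "AuthMech": "11",   # OAuth 2.0
        "Auth_Flow": "1",   # client credentials (machine-to-machine) flow
        "OAuth2ClientId": client_id,
        "OAuth2Secret": client_secret,
    }
    prop_string = ";".join(f"{key}={value}" for key, value in properties.items())
    return f"jdbc:databricks://{host}:{port};{prop_string}"


default_ok = supports_m2m("databricks-jdbc-2.6.40.jar")  # default JAR named above
older_ok = supports_m2m("databricks-jdbc-2.6.34.jar")    # hypothetical older custom JAR
```

Comparing version tuples rather than strings matters here: as plain text, "2.6.4" would sort after "2.6.36" and wrongly appear to qualify.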