Use Databricks for synthetic data

To use Databricks as a data source or destination for your synthetic data, you need to create a Databricks connector.

Prerequisites

To create a Databricks connector, you need your SQL Warehouse connection details, a Databricks catalog name, and a Databricks personal access token. The sections below provide step-by-step guidance for each prerequisite.

Get connection details for your Databricks SQL Warehouse

  1. In Databricks, open the workspace that contains the SQL Warehouse you want to use.
  2. Open the sidebar menu and select SQL Warehouses.
  3. From the list, open the SQL warehouse you want to use for synthetic data.
  4. Select the Connection details tab.
  5. Copy the necessary connection details (hostname, port, protocol, and HTTP path) for the MOSTLY AI Databricks connector.
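
Copied values often carry a pasted `https://` prefix or a missing leading slash. The following is a minimal sketch of how the details from the Connection details tab could be tidied before you enter them in the connector. The names `WarehouseDetails` and `normalize_details` are illustrative, and the checks (port 443, `https`, an HTTP path starting with `/sql/`) are assumptions based on typical SQL Warehouse values, not rules enforced by MOSTLY AI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarehouseDetails:
    """Connection details copied from the Connection details tab."""
    host: str
    http_path: str
    port: int = 443          # assumption: typical value shown by Databricks
    protocol: str = "https"  # assumption: typical value shown by Databricks


def normalize_details(host: str, http_path: str) -> WarehouseDetails:
    """Strip common copy-paste noise and reject obviously wrong values."""
    host = host.strip()
    # The Host field expects a bare hostname, so drop a pasted scheme.
    for scheme in ("https://", "http://"):
        if host.lower().startswith(scheme):
            host = host[len(scheme):]
    host = host.rstrip("/")
    if not host or "/" in host or " " in host:
        raise ValueError(f"unexpected hostname: {host!r}")

    http_path = http_path.strip()
    if not http_path.startswith("/"):
        http_path = "/" + http_path
    # Assumption: SQL Warehouse HTTP paths start with /sql/.
    if not http_path.startswith("/sql/"):
        raise ValueError(f"unexpected HTTP path: {http_path!r}")

    return WarehouseDetails(host=host, http_path=http_path)
```

For example, `normalize_details("https://<your-hostname>/", "sql/1.0/warehouses/<id>")` returns the bare hostname and the path with a leading slash, which are the two values the connector asks for.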

Get Databricks catalog name

  1. From the Databricks sidebar menu, select Data.
  2. Copy the name of the catalog you want to use in MOSTLY AI.

Create a Databricks personal access token

  1. In Databricks, open your account menu and select User Settings.

  2. Under Settings, select Developer.

  3. Click Manage for Access Tokens.

  4. Click Generate new token.

  5. In the Generate new token window, enter a name that identifies where you intend to use the token.

    💡
    Adjust the expiration of the token in the Lifetime (days) box.
  6. Click Generate.

  7. Copy the access token and save it in a secure location.

    ⚠️
    Databricks shows the token only once. Before you close the window, make sure you have saved the token in a location you can access later.
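
With the hostname, HTTP path, and token in hand, a quick connectivity check can catch typos before you fill in the connector. The sketch below assumes the open-source `databricks-sql-connector` package (`pip install databricks-sql-connector`), which this guide does not otherwise require. `list_catalogs` and its `connect` parameter are illustrative names, and the check itself is optional.

```python
from typing import Callable, List, Optional


def list_catalogs(
    server_hostname: str,
    http_path: str,
    access_token: str,
    connect: Optional[Callable] = None,
) -> List[str]:
    """Return the catalog names visible to the given access token."""
    if connect is None:
        # Imported lazily so this module loads even without the package.
        from databricks import sql  # provided by databricks-sql-connector
        connect = sql.connect

    with connect(
        server_hostname=server_hostname,
        http_path=http_path,
        access_token=access_token,
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SHOW CATALOGS")
            # Each result row holds one column: the catalog name.
            return [row[0] for row in cursor.fetchall()]
```

If you keep the token in an environment variable (the name `DATABRICKS_TOKEN` is an assumption), you can pass `os.environ["DATABRICKS_TOKEN"]` as `access_token`. The returned list should include the catalog name you plan to enter in the connector. The `connect` parameter exists only so the function can be exercised with a stand-in connection during testing.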

Create a Databricks connector

Create a new Databricks connector from the Connectors page.

Steps

  1. From the Connectors page, select Databricks under the Connect your data header.
  2. In the New connector modal, configure the connector.
     Field        | Description
     Name         | Enter a name that you can distinguish from other connectors.
     Access type  | Select whether you want to use the connector as a source or destination.
     Host         | Your SQL warehouse server hostname.
     HTTP path    | Your SQL warehouse HTTP path.
     Access token | Your Databricks personal access token.
     Catalog      | The name of your Databricks catalog.
  3. Click Save to save your new Databricks connector.
    • MOSTLY AI tests the connection. If you see an error, check the connection details, update them, and click Save again.

    • You can click Save anyway to save the connector and disregard any errors.

Authenticate with a Service principal

To access original data stored in Databricks with a Service principal account, create a Databricks connector that authenticates with that account.

The Databricks connector configuration includes fields that support authentication with a Service principal account.

Steps

  1. To use a Service principal for authentication in your Databricks connector, select the Authenticate with Service Principal checkbox.
  2. Configure the Databricks connector.
     Field         | Description
     Name          | Enter a name that you can distinguish from other connectors.
     Access type   | Select whether you want to use the connector as a source or destination.
     Host          | Your SQL warehouse server hostname.
     HTTP path     | Your SQL warehouse HTTP path.
     Catalog       | The name of your Databricks catalog.
     Tenant ID     | Your tenant ID.
     Client ID     | Your client ID.
     Client secret | Your client secret.
  3. Click Save to save your new Databricks connector.
    • MOSTLY AI tests the connection. If you see an error, check the connection details, update them, and click Save again.

    • You can click Save anyway to save the connector and disregard any errors.
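
The two authentication modes differ only in which credential fields they need. As a rough sketch, the check below lists the fields that are still empty for either mode. The snake_case keys are hypothetical names that mirror the labels in the connector form. They are not MOSTLY AI API field names.

```python
from typing import Dict, List

# Hypothetical keys mirroring the labels in the connector form.
COMMON_FIELDS = ["name", "access_type", "host", "http_path", "catalog"]
TOKEN_FIELDS = ["access_token"]
SERVICE_PRINCIPAL_FIELDS = ["tenant_id", "client_id", "client_secret"]


def missing_fields(
    values: Dict[str, str],
    use_service_principal: bool = False,
) -> List[str]:
    """Return the required fields that are absent or blank."""
    credentials = SERVICE_PRINCIPAL_FIELDS if use_service_principal else TOKEN_FIELDS
    required = COMMON_FIELDS + credentials
    return [field for field in required if not values.get(field, "").strip()]
```

Calling `missing_fields(values, use_service_principal=True)` corresponds to selecting the Authenticate with Service Principal checkbox: the access token is no longer required, and the tenant ID, client ID, and client secret are.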

What’s next

Depending on whether you created a source or a destination connector, you can use the connector as: