Configure data access for ingestion

This article describes how admin users can configure access to data in an Azure Data Lake Storage Gen2 (ADLS Gen2) container so that Azure Databricks users can load that data into a table in Azure Databricks.

This article describes the following ways to configure secure access to source data:

  • (Recommended) Create a Unity Catalog volume.

  • Create a Unity Catalog external location with a storage credential.

  • Launch a compute resource that uses a service principal.

  • Generate temporary credentials (a Blob SAS token).
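As an illustration of the recommended first option, an admin can create an external volume over the container, after which users with the READ VOLUME privilege can browse and load its files. The catalog, schema, volume names, and storage path below are placeholders; replace them with your own:

```sql
-- Hypothetical names: replace main, default, my_volume, and the
-- abfss:// path with values from your own environment.
CREATE EXTERNAL VOLUME main.default.my_volume
LOCATION 'abfss://my-container@mystorageaccount.dfs.core.windows.net/source-data';

-- Grant read access so users can load data from the volume.
GRANT READ VOLUME ON VOLUME main.default.my_volume TO `data_readers`;
```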

Before you begin

Before you configure access to data in ADLS Gen2, make sure you have the following:

  • Data in a container in your Azure storage account. To create a container, see Create a container in the Azure storage documentation.

  • To access data using a Unity Catalog volume (recommended), the READ VOLUME privilege on the volume. For more information, see What are Unity Catalog volumes? and Unity Catalog privileges and securable objects.

  • To access data using a Unity Catalog external location, the READ FILES privilege on the external location. For more information, see Create an external location to connect cloud storage to Azure Databricks.

  • To access data using a compute resource with a service principal, Azure Databricks workspace admin permissions.

  • To access data using temporary credentials:

    • Azure Databricks workspace admin permissions.
    • Permissions in your Azure account to create Blob SAS tokens, which serve as the temporary credentials.

  • A Databricks SQL warehouse. To create a SQL warehouse, see Create a SQL warehouse.

  • Familiarity with the Databricks SQL user interface.

Configure access to cloud storage

Use one of the methods listed at the beginning of this article to configure access to ADLS Gen2: a Unity Catalog volume, a Unity Catalog external location with a storage credential, a compute resource that uses a service principal, or temporary credentials (a Blob SAS token).
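For the external location method, the setup can be sketched as follows. This assumes a storage credential already exists in Unity Catalog; the location, credential, container, account, and group names are all placeholders:

```sql
-- Hypothetical names: my_storage_credential must already exist,
-- and the abfss:// URL must point at your own container.
CREATE EXTERNAL LOCATION my_external_location
URL 'abfss://my-container@mystorageaccount.dfs.core.windows.net/'
WITH (STORAGE CREDENTIAL my_storage_credential);

-- Grant the privilege required to read source files from this location.
GRANT READ FILES ON EXTERNAL LOCATION my_external_location TO `data_engineers`;
```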

Clean up

You can clean up the associated resources in your cloud account and Azure Databricks if you no longer want to keep them.

Delete the ADLS Gen2 storage account

  1. Open the Azure portal for your Azure account, typically at https://portal.azure.cn.
  2. Browse to and open your storage account.
  3. Click Delete.
  4. Enter the name of the storage account, and then click Delete.

Stop the SQL warehouse

If you are not using the SQL warehouse for any other tasks, stop it to avoid additional costs.

  1. In the SQL persona, on the sidebar, click SQL Warehouses.
  2. Next to the name of the SQL warehouse, click Stop.
  3. When prompted, click Stop again.

Next steps

After you complete the steps in this article, users can run the COPY INTO command to load data from the ADLS Gen2 container into a table in your Azure Databricks workspace.
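For example, a COPY INTO statement that loads CSV files from a Unity Catalog volume path might look like the following. The table name, volume path, and format options are illustrative; adjust them for your data:

```sql
-- Hypothetical table and volume path; choose the FILEFORMAT and
-- options that match your source files.
COPY INTO main.default.my_table
FROM '/Volumes/main/default/my_volume/source-data/'
FILEFORMAT = CSV
FORMAT_OPTIONS ('header' = 'true', 'inferSchema' = 'true')
COPY_OPTIONS ('mergeSchema' = 'true');
```

COPY INTO is idempotent: files that were already loaded are skipped on subsequent runs, which makes it safe to re-run as new files arrive in the container.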