Manage Azure Databricks Git folders using Terraform

You can manage Azure Databricks Git folders in a fully automated environment using Terraform and the databricks_repo Terraform resource.

This topic covers two authentication approaches:

  • Personal Access Token (PAT) authentication: Uses Git personal access tokens for repository access
  • Service principal with federated credentials: Uses Azure service principals with OpenID Connect (OIDC) tokens for secure, token-less authentication to Azure DevOps repositories

Authentication with personal access tokens

This approach uses Git personal access tokens for repository authentication.

In your Terraform configuration (.tf) file, set the url argument of the databricks_repo resource to the URL of the Git repository that you'll use for your Git folders:

resource "databricks_repo" "this" {
  url = "https://github.com/user/demo.git"
}
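
As a minimal sketch, the same resource can also place the Git folder at a specific workspace path and check out a named branch. The resource name, path, and branch below are illustrative assumptions, not values taken from your workspace:

```hcl
# Hypothetical example: the resource name, path, and branch are placeholders.
resource "databricks_repo" "pinned" {
  url    = "https://github.com/user/demo.git"
  path   = "/Repos/<your_user_or_folder>/demo" # workspace location of the Git folder
  branch = "main"                              # branch to check out
}
```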

To use an Azure Databricks service principal with personal access token-based Git credentials, follow these steps:

Step 1: Configure the Azure Databricks provider

Set the host argument of the databricks provider to the URL of your Azure Databricks workspace. The aliased provider below authenticates with the token from databricks_obo_token, which you define in a later step.

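# Default provider: used to create the service principal and its token
provider "databricks" {
  # Configuration options
}

# Example 'databricks' provider configuration
# Aliased provider: authenticates as the service principal
provider "databricks" {
  alias = "sp"
  host  = "https://....cloud.databricks.com"
  token = databricks_obo_token.this.token_value
}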

Step 2: Create the service principal

Define the resources for the Azure Databricks service principal. You can find the service principal name in the Azure Databricks account console under User management > Service principals.

resource "databricks_service_principal" "sp" {
  display_name = "<service_principal_name_here>"
}
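
If the service principal already exists in the workspace, one option is to look it up rather than create it. The sketch below assumes the databricks_service_principal data source and uses a placeholder application ID. With this approach, later references would use data.databricks_service_principal.existing in place of databricks_service_principal.sp:

```hcl
# Hypothetical alternative: read an existing service principal instead of creating one.
data "databricks_service_principal" "existing" {
  application_id = "<application_id_of_existing_service_principal>" # placeholder
}
```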

Step 3: Create the authorization token

Create an authorization token for your Azure Databricks service principal using its application ID. You can find the service principal's application ID in the Azure Databricks account console under User management > Service principals. In the example below, lifetime_seconds = 3600 makes the token expire after one hour.

resource "databricks_obo_token" "this" {
  application_id   = databricks_service_principal.sp.application_id
  comment          = "PAT on behalf of ${databricks_service_principal.sp.display_name}"
  lifetime_seconds = 3600
}

Step 4: Configure Git credentials

Set the Git credentials that the service principal will use to access your Git repository.

resource "databricks_git_credential" "sp" {
  provider              = databricks.sp
  depends_on            = [databricks_obo_token.this]
  git_username          = "<the_git_user_account_used_by_the_service_principal>"
  git_provider          = "<your_git_provider_string_here>"
  personal_access_token = "<auth_token_string_for_git_user>"
}
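
To tie the steps together, the Git folder itself can be created through the aliased provider so that it is owned by the service principal and uses its Git credential. This is a sketch under assumptions: the resource name sp_repo and the variable git_personal_access_token are hypothetical, and the repository URL is reused from the earlier example.

```hcl
# Hypothetical variable: keeps the Git token out of the .tf file.
# It could be referenced as var.git_personal_access_token in
# databricks_git_credential.sp in place of the inline placeholder.
variable "git_personal_access_token" {
  type      = string
  sensitive = true
}

# Create the Git folder as the service principal, after its Git credential exists.
resource "databricks_repo" "sp_repo" {
  provider   = databricks.sp
  url        = "https://github.com/user/demo.git"
  depends_on = [databricks_git_credential.sp]
}
```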

Authentication with a service principal and federated credentials

For Azure DevOps repositories, you can use federated identity credentials to authenticate without storing long-lived secrets. This approach uses an Azure Databricks service principal with an OIDC token issued by Azure DevOps pipelines, which eliminates the need for personal access tokens.

Prerequisites

Before you can set up federated identity authentication for Azure Databricks Git folders, configure the following components:

Step 1: Configure variables

Specify values for the following variables in a terraform.tfvars file:

  • databricks_host: The URL of your Azure Databricks workspace, for example https://adb-123417477717.17.databricks.azure.cn
  • entra_client_id: The client ID of the Azure service principal
  • entra_client_secret: The client secret for the Azure service principal
  • entra_tenant_id: The ID of the Microsoft Entra tenant where the service principal is registered
  • ado_repo_url: The HTTPS URL of the Git repository in Azure DevOps
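
Terraform only accepts values from terraform.tfvars for variables that are declared in the configuration. A minimal sketch of matching declarations, assuming they live in a file such as variables.tf (the file name and descriptions are illustrative):

```hcl
variable "databricks_host" {
  type        = string
  description = "URL of the Azure Databricks workspace"
}

variable "entra_client_id" {
  type        = string
  description = "Client ID of the Azure service principal"
}

variable "entra_client_secret" {
  type        = string
  description = "Client secret for the Azure service principal"
  sensitive   = true # hide the value in plan and apply output
}

variable "entra_tenant_id" {
  type        = string
  description = "ID of the Microsoft Entra tenant"
}

variable "ado_repo_url" {
  type        = string
  description = "HTTPS URL of the Git repository in Azure DevOps"
}
```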

Step 2: Configure the Azure Databricks provider

In your Terraform configuration, use the official databricks provider. Authentication for the provider can use your organization's standard method, such as environment variables in continuous integration (CI) or a service principal when running Terraform from a secure workstation.

terraform {
  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = "1.78.0"
    }
  }
}

provider "databricks" {
  host                = var.databricks_host
  azure_client_id     = var.entra_client_id
  azure_client_secret = var.entra_client_secret
  azure_tenant_id     = var.entra_tenant_id
}
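
If you supply credentials through environment variables in CI instead, the provider block can stay empty. The sketch below is an alternative to the block above, not an addition to it, and it assumes the CI system exports the listed environment variables:

```hcl
# Alternative sketch: no credentials in the configuration.
# The provider is assumed to read these environment variables, set by the CI system:
#   DATABRICKS_HOST, ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_TENANT_ID
provider "databricks" {}
```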

Step 3: Create a federated Git credential for Azure DevOps

This credential tells Azure Databricks to use Microsoft Entra ID (formerly Azure Active Directory) federation for Azure DevOps, so no Git username or personal access token is stored.

resource "databricks_git_credential" "sp_ado" {
  git_provider            = "azureDevOpsServicesAad"
  is_default_for_provider = true
}

Step 4: Point a Git folder at your Azure DevOps repository

Create or update the Git folder to use your Azure DevOps repository URL.

resource "databricks_repo" "this" {
  url        = var.ado_repo_url
  depends_on = [databricks_git_credential.sp_ado]
}
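
To confirm where the Git folder was created, you could expose its workspace path as an output. This is a small sketch, and the output name git_folder_path is an assumption:

```hcl
# Hypothetical output: prints the workspace path of the Git folder after apply.
output "git_folder_path" {
  description = "Workspace path of the Git folder"
  value       = databricks_repo.this.path
}
```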