User-defined route settings for Azure Databricks
If your Azure Databricks workspace is deployed to your own virtual network (VNet), you can use custom routes, also known as user-defined routes (UDR), to ensure that network traffic is routed correctly for your workspace. For example, if you connect the virtual network to your on-premises network, traffic may be routed through the on-premises network and unable to reach the Azure Databricks control plane. User-defined routes can solve that problem.
You need a UDR for every type of outbound connection from the VNet. You can use both Azure service tags and IP addresses to define network access controls on your user-defined routes. Databricks recommends using Azure service tags to prevent service outages due to IP changes.
Configure user-defined routes with Azure service tags
Databricks recommends that you use Azure service tags, which represent a group of IP address prefixes from a given Azure service. Azure manages the address prefixes encompassed by the service tag and automatically updates the service tag as addresses change. This helps to prevent service outages due to IP changes and removes the need to periodically look up these IPs and update them in your route table. However, if your organization policies disallow service tags, you can optionally specify the routes as IP addresses.
When you use service tags, define your user-defined routes with the following rules and associate the route table with your virtual network's public and private subnets.
Source | Address prefix | Next hop type |
---|---|---|
Default | Azure Databricks service tag | Internet |
Default | Azure SQL service tag | Internet |
Default | Azure Storage service tag | Internet |
Default | Azure Event Hubs service tag | Internet |
Note
You can choose to add the Microsoft Entra ID service tag to facilitate Microsoft Entra ID authentication from Azure Databricks clusters to Azure resources.
If Azure Private Link is enabled on your workspace, the Azure Databricks service tag is not required.
The Azure Databricks service tag represents IP addresses for the required outbound connections to the Azure Databricks control plane, the secure cluster connectivity (SCC), and the Azure Databricks web application.
The Azure SQL service tag represents IP addresses for the required outbound connections to the Azure Databricks metastore, and the Azure Storage service tag represents IP addresses for artifact Blob storage and log Blob storage. The Azure Event Hubs service tag represents the required outbound connections for logging to Azure Event Hubs.
Some service tags allow more granular control by restricting IP ranges to a specified region. For example, a route table for an Azure Databricks workspace in the China East 2 region might look like:
Name | Address prefix | Next hop type |
---|---|---|
adb-servicetag | AzureDatabricks | Internet |
adb-metastore | Sql.ChinaEast2 | Internet |
adb-storage | Storage.ChinaEast2 | Internet |
adb-eventhub | EventHub.ChinaEast2 | Internet |
To get the service tags required for user-defined routes, see Virtual network service tags.
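As an illustration only, the route table above can be sketched as data. The function name `service_tag_routes` and its parameters are hypothetical, and the `addressPrefix` and `nextHopType` keys are assumed to follow the property names Azure uses for route resources. The sketch omits the Azure Databricks service tag when Azure Private Link is enabled, as described in the note above, and it assumes that the region suffix you pass is one that Azure actually publishes for these tags.

```python
from typing import Dict, List, Optional


def service_tag_routes(
    region: Optional[str] = None,
    private_link: bool = False,
) -> List[Dict[str, str]]:
    """Build route entries that mirror the service-tag route table.

    region: optional region suffix such as "ChinaEast2" (hypothetical
        parameter); when omitted, the global service tags are used.
    private_link: when True, the AzureDatabricks tag is left out,
        because it is not required with Azure Private Link.
    """
    suffix = f".{region}" if region else ""
    routes: List[Dict[str, str]] = []

    if not private_link:
        # The example table uses the AzureDatabricks tag without a region suffix.
        routes.append({"name": "adb-servicetag", "addressPrefix": "AzureDatabricks"})

    routes.extend(
        [
            {"name": "adb-metastore", "addressPrefix": f"Sql{suffix}"},
            {"name": "adb-storage", "addressPrefix": f"Storage{suffix}"},
            {"name": "adb-eventhub", "addressPrefix": f"EventHub{suffix}"},
        ]
    )

    # Every route in the table uses Internet as the next hop type.
    for route in routes:
        route["nextHopType"] = "Internet"
    return routes


for route in service_tag_routes("ChinaEast2"):
    print(route["name"], route["addressPrefix"], route["nextHopType"])
```

The resulting list could then be passed to whatever tooling you use to create the routes. The sketch only produces the definitions and does not call any Azure API.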
Configure user-defined routes with IP addresses
Databricks recommends that you use Azure service tags, but if your organization policies don't allow service tags, you can use IP addresses to define network access controls on your user-defined routes.
The details vary based on whether secure cluster connectivity (SCC) is enabled for the workspace:
- If secure cluster connectivity is enabled for the workspace, you need a UDR to allow the clusters to connect to the secure cluster connectivity relay in the control plane. Be sure to include the systems marked as SCC relay IP for your region.
- If secure cluster connectivity is disabled for the workspace, there is an inbound connection from the control plane NAT, but the low-level TCP SYN-ACK reply to that connection is technically outbound data that requires a UDR. Be sure to include the systems marked as Control Plane NAT IP for your region.
Define your user-defined routes with the following rules and associate the route table with your virtual network's public and private subnets.
Source | Address prefix | Next hop type |
---|---|---|
Default | Control plane NAT IP (if SCC is disabled) | Internet |
Default | SCC relay IP (if SCC is enabled) | Internet |
Default | Webapp IP | Internet |
Default | Metastore IP | Internet |
Default | Artifact Blob storage IP | Internet |
Default | Log Blob storage IP | Internet |
Default | Workspace storage IP - Blob Storage endpoint | Internet |
Default | Workspace storage IP - ADLS gen2 (dfs) endpoint | Internet |
Default | Event Hubs IP | Internet |
If Azure Private Link is enabled on your workspace, define your user-defined routes with the following rules and associate the route table with your virtual network's public and private subnets.
Source | Address prefix | Next hop type |
---|---|---|
Default | Metastore IP | Internet |
Default | Artifact Blob storage IP | Internet |
Default | Log Blob storage IP | Internet |
Default | Event Hubs IP | Internet |
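The choice between the two IP-based tables above can be summarized as a small selection function. This is a minimal sketch, not an official tool: the function name `required_ip_routes` and its parameters are hypothetical, and it returns only the row labels from the tables, not actual IP addresses, which must be looked up for your region.

```python
from typing import List


def required_ip_routes(scc_enabled: bool, private_link: bool = False) -> List[str]:
    """Return the address categories that need a UDR with next hop Internet.

    scc_enabled: whether secure cluster connectivity is enabled.
    private_link: whether Azure Private Link is enabled on the workspace.
    """
    # Rows that appear in both tables.
    common = ["Metastore IP", "Artifact Blob storage IP", "Log Blob storage IP"]

    if private_link:
        # The Private Link table lists only these four rows.
        return common + ["Event Hubs IP"]

    # Without Private Link, the first row depends on the SCC setting.
    control_plane = "SCC relay IP" if scc_enabled else "Control plane NAT IP"
    return (
        [control_plane, "Webapp IP"]
        + common
        + [
            "Workspace storage IP - Blob Storage endpoint",
            "Workspace storage IP - ADLS gen2 (dfs) endpoint",
            "Event Hubs IP",
        ]
    )


for category in required_ip_routes(scc_enabled=True):
    print(f"Default | {category} | Internet")
```

Keeping the selection logic in one place makes it harder to forget a row when a workspace setting changes, for example when secure cluster connectivity is enabled on an existing workspace.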
To get the IP addresses required for user-defined routes, use the tables and instructions in Azure Databricks regions, specifically: