Azure Databricks architecture overview
This article provides a high-level overview of the Azure Databricks architecture, including how its enterprise architecture works in combination with Azure.
High-level architecture
Azure Databricks operates out of a control plane and a compute plane.
The control plane includes the backend services that Azure Databricks manages in your Azure Databricks account. The web application is in the control plane.
The compute plane is where your data is processed.
- For classic Azure Databricks compute, the compute resources are in your Azure subscription in what is called the classic compute plane. The classic compute plane refers to the network in your Azure subscription and its resources.
Each Azure Databricks workspace has an associated storage account known as the workspace storage account. The workspace storage account is in your Azure subscription.
The following diagram describes the overall Azure Databricks architecture.
Classic compute plane
In the classic compute plane, Azure Databricks compute resources run in your Azure subscription. New compute resources are created within each workspace's virtual network in your Azure subscription.
A classic compute plane has natural isolation because it runs in each customer's own Azure subscription. To learn more about networking in the classic compute plane, see Classic compute plane networking.
For regional support, see Azure Databricks regions.
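The split between what Azure Databricks hosts and what lives in your Azure subscription can be sketched as a small data model. This is only an illustration of the description above, not an Azure Databricks API: the `Owner`, `Component`, and `components_in` names and the component labels are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Owner(Enum):
    """Where a component is hosted, following the article's description."""
    DATABRICKS_ACCOUNT = "Azure Databricks account"
    CUSTOMER_SUBSCRIPTION = "your Azure subscription"


@dataclass(frozen=True)
class Component:
    name: str       # illustrative label, not a real resource name
    location: str   # "control plane", "classic compute plane", or "storage"
    owner: Owner


# Hypothetical inventory for a single workspace.
WORKSPACE_COMPONENTS = [
    Component("web application", "control plane", Owner.DATABRICKS_ACCOUNT),
    Component("backend services", "control plane", Owner.DATABRICKS_ACCOUNT),
    Component("classic compute resources", "classic compute plane",
              Owner.CUSTOMER_SUBSCRIPTION),
    Component("workspace virtual network", "classic compute plane",
              Owner.CUSTOMER_SUBSCRIPTION),
    Component("workspace storage account", "storage",
              Owner.CUSTOMER_SUBSCRIPTION),
]


def components_in(owner: Owner) -> list[str]:
    """Return the names of components hosted by the given owner."""
    return [c.name for c in WORKSPACE_COMPONENTS if c.owner is owner]


if __name__ == "__main__":
    for owner in Owner:
        print(f"{owner.value}: {', '.join(components_in(owner))}")
```

Running the script lists the control plane components under the Azure Databricks account and everything else under your Azure subscription, which mirrors the isolation argument made for the classic compute plane.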
Workspace storage account
When you create a workspace, Azure Databricks creates a storage account in your Azure subscription to use as the workspace storage account.
The workspace storage account contains:
- Workspace system data: Workspace system data is generated as you use various Azure Databricks features, such as creating notebooks. This data includes notebook revisions, job run details, command results, and Spark logs.
- DBFS: DBFS (Databricks File System) is a distributed file system in Azure Databricks environments accessible under the dbfs:/ namespace. DBFS root and DBFS mounts are both in the dbfs:/ namespace. Storing and accessing data using DBFS root or DBFS mounts is a deprecated pattern and not recommended by Databricks. For more information, see What is DBFS?.
- Unity Catalog workspace catalog: If your workspace was enabled for Unity Catalog automatically, the workspace storage account contains the default workspace catalog. All users in your workspace can create assets in the default schema in this catalog. See Set up and manage Unity Catalog.
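Because storing and accessing data through DBFS root or DBFS mounts is deprecated, it can help to find where existing code still refers to the dbfs:/ namespace. The following is a minimal sketch, assuming paths are available as plain strings. The helper names and sample paths are hypothetical, and the check covers only the dbfs:/ prefix described above, not other ways a path might reach DBFS.

```python
DBFS_PREFIX = "dbfs:/"  # the namespace named in this article


def is_dbfs_path(path: str) -> bool:
    """Return True if the path is in the dbfs:/ namespace (root or mount)."""
    return path.strip().lower().startswith(DBFS_PREFIX)


def find_dbfs_paths(paths: list[str]) -> list[str]:
    """Return the paths that still rely on the deprecated DBFS pattern."""
    return [p for p in paths if is_dbfs_path(p)]


if __name__ == "__main__":
    # Illustrative paths only; none of these refer to real resources.
    candidates = [
        "dbfs:/mnt/raw/events",
        "dbfs:/tmp/output.parquet",
        "abfss://container@account.dfs.core.windows.net/data",
        "/Volumes/main/default/files/report.csv",
    ]
    for path in find_dbfs_paths(candidates):
        print(f"deprecated DBFS path: {path}")
```

A prefix check like this is deliberately conservative: it flags both DBFS root and DBFS mount paths without trying to tell them apart, since both share the same namespace.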