Configure your Databricks project using the Databricks extension for Visual Studio Code
The Databricks extension for Visual Studio Code provides a Configuration view within the extension panel where you can configure and update settings for your Databricks project. From this view you can select the target deployment workspace, configure authentication and compute, sync your workspace folder, and activate the Python virtual environment needed for debugging.
The Configuration view is available once you have created a Databricks project or migrated an existing project to a Databricks project. See Create a new Databricks project.
Note
Earlier versions of the Databricks extension for Visual Studio Code defined configuration settings in a project JSON file, and environment variables were set in the terminal. In the release version, project and environment configuration is found in the databricks.yml and databricks.env files.
If your project is a Databricks Asset Bundle, the Databricks extension UI also provides a Bundle Resource Explorer and a Bundle Variables View to manage your bundle resources and variables. See Databricks Asset Bundles extension features.
Change the target deployment workspace
To select or switch the deployment target for your Databricks project (for example, to switch from a dev target to a prod target):
In the Configuration view of the Databricks extension panel, click the gear icon (Select a Databricks Asset Bundle target) associated with Target.
In the Command Palette, select the desired deployment target.
Once a target is configured, the Host and deployment Mode are displayed. For information about Databricks Asset Bundles deployment modes, see Databricks Asset Bundle deployment modes.
To change the workspace host, modify the target's workspace setting in the databricks.yml configuration file associated with the project. See targets.
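As a rough illustration of the settings described above, a databricks.yml file with a dev target and a prod target might look like the following sketch. The bundle name and both workspace URLs are placeholders invented for this example, not values the extension generates, so replace them with your own.

```yaml
# databricks.yml (illustrative fragment; the bundle name and hosts are placeholders)
bundle:
  name: my_project

targets:
  dev:
    # Deployment mode shown as Mode in the Configuration view
    mode: development
    default: true
    workspace:
      # Workspace URL shown as Host in the Configuration view
      host: https://adb-1111111111111111.11.azuredatabricks.net
  prod:
    mode: production
    workspace:
      host: https://adb-2222222222222222.22.azuredatabricks.net
```

With a layout like this, the target selector would offer dev and prod, and editing the host value under a target changes the workspace that target deploys to.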
Note
The following Databricks extension for Visual Studio Code features are only available when the target deployment mode is development:
- Use the attached development cluster for bundle jobs
- Sync workspace folder files
- Select an interactive development cluster
Configure the Databricks profile for the project
When you create a Databricks project or migrate a project to be a Databricks project, you configure a profile that includes authentication settings used to connect to Databricks. If you want to change the authentication profile used, click the gear icon associated with AuthType in the Configuration view.
For more information about authentication in the Databricks extension for Visual Studio Code, see Authentication setup for the Databricks extension for Visual Studio Code.
Select a cluster for running code and jobs
Using the Databricks extension for Visual Studio Code, you can select an existing Azure Databricks cluster or create a new Azure Databricks cluster for running your code and jobs. Once you have connected to compute, the cluster's ID, Databricks Runtime version, creator, state, and access mode are displayed. You can also start and stop the cluster, and navigate directly to the cluster's details page.
Tip
If you don't want to wait for the jobs cluster to start up, select the Override Jobs cluster in bundle check box just below the cluster selection. Bundle jobs in development mode then run on the selected cluster.
Use an existing cluster
If you have an existing Azure Databricks cluster that you want to use:
In the Configuration view, next to Cluster, click Select a cluster or the gear (Configure cluster) icon.
In the Command Palette, select the cluster that you want to use.
Create a new cluster
If you do not have an existing Azure Databricks cluster, or you want to create a new one:
In the Configuration view, next to Cluster, click the gear (Configure cluster) icon.
In the Command Palette, click Create New Cluster.
When prompted to open the external website (your Azure Databricks workspace), click Open.
If prompted, sign in to your Azure Databricks workspace.
Follow the instructions to create a cluster.
Note
Databricks recommends that you create a Personal Compute cluster. This enables you to start running workloads immediately, minimizing compute management overhead.
After the cluster is created and is running, go back to Visual Studio Code.
In the Configuration view, next to Cluster, click the gear (Configure cluster) icon.
In the Command Palette, click the cluster that you want to use.
Sync your workspace folder with Databricks
You can sync the remote Databricks workspace folder associated with your Databricks project by clicking the sync icon (Start synchronization) associated with Workspace Folder in the Configuration view of the Databricks extension panel.
Note
The Databricks extension for Visual Studio Code works only with workspace directories that it creates. You cannot use an existing workspace directory in your project unless it was created by the extension.
To navigate to the workspace view in Databricks, click the external link icon (Open link externally) associated with Workspace Folder.
The extension determines the Azure Databricks workspace folder to use based on the file_path setting in the workspace mapping of the project's associated Databricks Asset Bundle configuration. See workspace.
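For orientation, the file_path setting mentioned above sits in the workspace mapping of the bundle configuration. The following is a minimal sketch only: the user name, bundle name, and path are hypothetical, and your project's actual value may be set elsewhere or left to the bundle default.

```yaml
# databricks.yml (illustrative fragment; the path below is a hypothetical example)
workspace:
  # Remote workspace folder that local project files are synced to
  file_path: /Workspace/Users/someone@example.com/.bundle/my_project/dev/files
```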
Note
The Databricks extension for Visual Studio Code only performs one-way, automatic synchronization of file changes from your local Visual Studio Code project to the related workspace folder in your remote Azure Databricks workspace. The files in this remote workspace directory are intended to be transient. Do not initiate changes to these files from within your remote workspace, as these changes will not be synchronized back to your local project.
For usage details on the workspace directory sync feature for earlier versions of the Databricks extension for Visual Studio Code, see Select a workspace directory for the Databricks extension for Visual Studio Code.
Set up your Python environment and Databricks Connect
The Python Environment section of the Configuration view enables easy setup of your Python virtual development environment and installation of Databricks Connect for running and debugging code and notebook cells. Python virtual environments make sure that your project is using compatible versions of Python and Python packages (in this case, the Databricks Connect package).
To configure the Python virtual environment for your project, in the Configuration view of the extension panel:
- Click the red Activate Virtual Environment item under Python Environment.
- In the Command Palette, select Venv or Conda.
- Select the dependencies you want to install, if any.
To change environments, click the gear icon (Change virtual environment) associated with Active Environment.
For information about installing Databricks Connect, which enables running and debugging code and notebooks within Visual Studio Code, see Debug code using Databricks Connect for the Databricks extension for Visual Studio Code.