Stack CLI (legacy)
Important
This documentation has been retired and might not be updated.
This information applies to legacy Databricks CLI versions 0.18 and below. Databricks recommends that you use newer Databricks CLI version 0.205 or above instead. See What is the Databricks CLI?. To find your version of the Databricks CLI, run databricks -v.
To migrate from Databricks CLI version 0.18 or below to Databricks CLI version 0.205 or above, see Databricks CLI migration.
Databricks CLI versions 0.205 and above do not support the stack CLI. Databricks recommends that you use the Databricks Terraform provider instead.
Note
The stack CLI requires Databricks CLI 0.8.3 or above.
The stack CLI provides a way to manage a stack of Azure Databricks resources, such as jobs, notebooks, and DBFS files. You can store notebooks and DBFS files locally and create a stack configuration JSON template that defines mappings from your local files to paths in your Azure Databricks workspace, along with configurations of jobs that run the notebooks.
Use the stack CLI with the stack configuration JSON template to deploy and manage your stack.
You run Databricks stack CLI subcommands by appending them to databricks stack.
databricks stack --help
Usage: databricks stack [OPTIONS] COMMAND [ARGS]...
[Beta] Utility to deploy and download Databricks resource stacks.
Options:
-v, --version [VERSION]
--debug Debug Mode. Shows full stack trace on error.
--profile TEXT CLI connection profile to use. The default profile is
"DEFAULT".
-h, --help Show this message and exit.
Commands:
deploy Deploy a stack of resources given a JSON configuration of the stack
Usage: databricks stack deploy [OPTIONS] CONFIG_PATH
Options:
-o, --overwrite Include to overwrite existing workspace notebooks and DBFS
files [default: False]
download Download workspace notebooks of a stack to the local filesystem
given a JSON stack configuration template.
Usage: databricks stack download [OPTIONS] CONFIG_PATH
Options:
-o, --overwrite Include to overwrite existing workspace notebooks in the
local filesystem [default: False]
Deploy a stack to a workspace
This subcommand deploys a stack. See Stack setup to learn how to set up a stack.
databricks stack deploy ./config.json
Stack configuration JSON template gives an example of config.json.
Download stack notebook changes
This subcommand downloads the notebooks of a stack.
databricks stack download ./config.json
Examples
Stack setup
File structure of an example stack
tree
.
├── notebooks
│   ├── common
│   │   └── notebook.scala
│   └── config
│       ├── environment.scala
│       └── setup.sql
├── lib
│   └── library.jar
└── config.json
This example stack contains a main notebook in notebooks/common/notebook.scala along with configuration notebooks in the notebooks/config folder. The stack depends on a JAR library in lib/library.jar. config.json is the stack configuration JSON template, which you pass to the stack CLI to deploy the stack.
Stack configuration JSON template
The stack configuration template describes the stack configuration.
cat config.json
{
  "name": "example-stack",
  "resources": [
    {
      "id": "example-workspace-notebook",
      "service": "workspace",
      "properties": {
        "source_path": "notebooks/common/notebook.scala",
        "path": "/Users/example@example.com/dev/notebook",
        "object_type": "NOTEBOOK"
      }
    },
    {
      "id": "example-workspace-config-dir",
      "service": "workspace",
      "properties": {
        "source_path": "notebooks/config",
        "path": "/Users/example@example.com/dev/config",
        "object_type": "DIRECTORY"
      }
    },
    {
      "id": "example-dbfs-library",
      "service": "dbfs",
      "properties": {
        "source_path": "lib/library.jar",
        "path": "dbfs:/tmp/lib/library.jar",
        "is_dir": false
      }
    },
    {
      "id": "example-job",
      "service": "jobs",
      "properties": {
        "name": "Example Stack CLI Job",
        "new_cluster": {
          "spark_version": "7.3.x-scala2.12",
          "node_type_id": "Standard_DS3_v2",
          "num_workers": 3
        },
        "timeout_seconds": 7200,
        "max_retries": 1,
        "notebook_task": {
          "notebook_path": "/Users/example@example.com/dev/notebook"
        },
        "libraries": [
          {
            "jar": "dbfs:/tmp/lib/library.jar"
          }
        ]
      }
    }
  ]
}
Each job, workspace notebook, workspace directory, DBFS file, or DBFS directory is defined as a ResourceConfig. Each ResourceConfig that represents a workspace or DBFS asset contains a mapping from the file or directory where it exists locally (source_path) to where it would exist in the workspace or DBFS (path).
Stack configuration template schema outlines the schema for the stack configuration template.
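The source_path to path mapping can be made concrete with a short script. The following is a minimal sketch, not part of the stack CLI: the function name list_mappings is hypothetical, and it assumes the template follows the example above. Relative source_path values are resolved against the directory that holds the template, as the schema describes.

```python
import json
from pathlib import Path


def list_mappings(config_path):
    """Return (id, local source, remote path) for every workspace or DBFS resource."""
    config_file = Path(config_path)
    config = json.loads(config_file.read_text())
    mappings = []
    for resource in config["resources"]:
        # Job resources have no source_path, so only workspace and DBFS assets are listed.
        if resource["service"] not in ("workspace", "dbfs"):
            continue
        props = resource["properties"]
        source = Path(props["source_path"])
        # A relative source_path is relative to the stack configuration template file.
        if not source.is_absolute():
            source = config_file.parent / source
        mappings.append((resource["id"], source, props["path"]))
    return mappings
```

Running list_mappings("./config.json") against the example template would return one entry each for the notebook, the config directory, and the JAR, and none for the job.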
Deploy a stack
You deploy a stack using the databricks stack deploy <configuration-file> command.
databricks stack deploy ./config.json
During stack deployment, the DBFS and workspace assets are uploaded to your Azure Databricks workspace and jobs are created.
At stack deploy time, a StackStatus JSON file for the deployment is saved in the same directory as the stack configuration template. Its name is the template's name with deployed added immediately before the .json extension (for example, ./config.deployed.json). The stack CLI uses this file to keep track of resources previously deployed to your workspace.
Stack status schema outlines the schema of a stack status file.
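The naming rule for the status file can be expressed in a few lines. This is a sketch of the rule as described here, not the CLI's own implementation, and status_file_path is a hypothetical helper name.

```python
from pathlib import Path


def status_file_path(config_path):
    """Insert 'deployed' before the extension: config.json -> config.deployed.json."""
    config_file = Path(config_path)
    # with_name keeps the parent directory, so the status file sits next to the template.
    return config_file.with_name(f"{config_file.stem}.deployed{config_file.suffix}")
```

A helper like this is useful only for locating the file, for example to delete it after a status-file error.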
Important
Do not attempt to edit or move the stack status file. If you get any errors regarding the stack status file, delete the file and try the deployment again.
./config.deployed.json
{
  "cli_version": "0.8.3",
  "deployed_output": [
    {
      "id": "example-workspace-notebook",
      "databricks_id": {
        "path": "/Users/example@example.com/dev/notebook"
      },
      "service": "workspace"
    },
    {
      "id": "example-workspace-config-dir",
      "databricks_id": {
        "path": "/Users/example@example.com/dev/config"
      },
      "service": "workspace"
    },
    {
      "id": "example-dbfs-library",
      "databricks_id": {
        "path": "dbfs:/tmp/lib/library.jar"
      },
      "service": "dbfs"
    },
    {
      "id": "example-job",
      "databricks_id": {
        "job_id": 123456
      },
      "service": "jobs"
    }
  ],
  "name": "example-stack"
}
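Reading the status file (without editing it) is a convenient way to look up what a deployment created, such as a job ID. The sketch below is an illustration only, and deployed_ids is a hypothetical name. Note that the example file above stores the list under deployed_output, while the StackStatus schema table names it deployed_resources; accepting either key is an assumption of this sketch.

```python
import json


def deployed_ids(status_text):
    """Map each resource id to its databricks_id object from a status file's text."""
    status = json.loads(status_text)
    # Assumption: the list may appear under either key (example file vs. schema table).
    entries = status.get("deployed_output") or status.get("deployed_resources") or []
    return {entry["id"]: entry["databricks_id"] for entry in entries}
```

For the example status file, deployed_ids(text)["example-job"] would give the object holding job_id.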
Data structures
Stack configuration template schema
StackConfig
These are the outer fields of a stack configuration template. All fields are required.
Field Name | Type | Description |
---|---|---|
name | STRING | The name of the stack. |
resources | List of ResourceConfig | An asset in Azure Databricks. Resources are related to three services (REST API namespaces): workspace, jobs, and dbfs. |
ResourceConfig
These are the fields of each ResourceConfig. All fields are required.
Field Name | Type | Description |
---|---|---|
id | STRING | A unique ID for the resource. Uniqueness of ResourceConfig IDs is enforced. |
service | ResourceService | The REST API service that the resource operates on. One of: jobs, workspace, or dbfs. |
properties | ResourceProperties | The fields in this object differ depending on the ResourceConfig service. |
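The StackConfig and ResourceConfig rules above (required fields, unique IDs, three allowed services) lend themselves to a quick pre-deployment check. The following is a minimal sketch under those stated rules; validate_stack_config and ALLOWED_SERVICES are hypothetical names, and the stack CLI may report problems differently.

```python
ALLOWED_SERVICES = {"workspace", "jobs", "dbfs"}


def validate_stack_config(config):
    """Return a list of problems found in a parsed stack configuration template."""
    problems = []
    for field in ("name", "resources"):
        if field not in config:
            problems.append(f"missing top-level field: {field}")
    seen_ids = set()
    for index, resource in enumerate(config.get("resources", [])):
        for field in ("id", "service", "properties"):
            if field not in resource:
                problems.append(f"resource {index}: missing field: {field}")
        resource_id = resource.get("id")
        # IDs must be unique across the whole stack.
        if resource_id in seen_ids:
            problems.append(f"resource {index}: duplicate id: {resource_id}")
        if resource_id is not None:
            seen_ids.add(resource_id)
        service = resource.get("service")
        if service is not None and service not in ALLOWED_SERVICES:
            problems.append(f"resource {index}: unknown service: {service}")
    return problems
```

An empty list means the template passed these structural checks; it does not guarantee that deployment will succeed.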
ResourceProperties
The properties of a resource by ResourceService. The fields are classified as those used or not used in an Azure Databricks REST API. All the fields listed are required.
service | Fields from the REST API used in the Stack CLI | Fields used only in the Stack CLI |
---|---|---|
workspace | path: STRING - Remote workspace path of the notebook or directory (for example, /Users/example@example.com/notebook). object_type: Workspace API - Notebook object type. Can only be NOTEBOOK or DIRECTORY. | source_path: STRING - Local source path of workspace notebooks or directories. A relative path to the stack configuration template file or an absolute path in your filesystem. |
jobs | Any field in the settings or new_settings structure. The only field not required in the settings or new_settings structure but required for the stack CLI is name: STRING - Name of the job to deploy. To avoid creating too many duplicate jobs, the Stack CLI enforces unique names for jobs deployed by a stack. | None. |
dbfs | path: STRING - Matching remote DBFS path. Must start with dbfs:/ (for example, dbfs:/this/is/a/sample/path). is_dir: BOOL - Whether a DBFS path is a directory or a file. | source_path: STRING - Local source path of DBFS files or directories. A relative path to the stack configuration template file or an absolute path in your filesystem. |
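The per-service property rules in the table can be checked the same way. This is a sketch based only on the constraints listed above (required fields, the two allowed object_type values, and the dbfs:/ prefix); validate_properties is a hypothetical name, and job settings other than name are not inspected.

```python
def validate_properties(service, properties):
    """Return a list of problems for one resource's properties, by service."""
    if service == "workspace":
        required = ("path", "object_type", "source_path")
    elif service == "dbfs":
        required = ("path", "is_dir", "source_path")
    elif service == "jobs":
        # Only name is checked here; other job settings are passed through unchecked.
        required = ("name",)
    else:
        return [f"unknown service: {service}"]
    problems = [f"missing property: {field}" for field in required if field not in properties]
    if service == "workspace":
        if properties.get("object_type") not in (None, "NOTEBOOK", "DIRECTORY"):
            problems.append("object_type must be NOTEBOOK or DIRECTORY")
    if service == "dbfs":
        path = properties.get("path")
        if isinstance(path, str) and not path.startswith("dbfs:/"):
            problems.append("path must start with dbfs:/")
        if "is_dir" in properties and not isinstance(properties["is_dir"], bool):
            problems.append("is_dir must be a boolean")
    return problems
```

For instance, a dbfs resource whose path is /tmp/lib/library.jar would be flagged because the path lacks the dbfs:/ prefix.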
ResourceService
Each resource belongs to a specific service that aligns with the Databricks REST API. These are the services that are supported by the Stack CLI.
Service | Description |
---|---|
workspace | A workspace notebook or directory. |
jobs | An Azure Databricks job. |
dbfs | A DBFS file or directory. |
Stack status schema
StackStatus
A stack status file is created after a stack is deployed using the CLI. The top-level fields are:
Field Name | Type | Description |
---|---|---|
name | STRING | The name of the stack. This field is the same field as in StackConfig. |
cli_version | STRING | The version of the Databricks CLI used to deploy the stack. |
deployed_resources | List of ResourceStatus | The status of each deployed resource. For each resource defined in StackConfig, a corresponding ResourceStatus is generated here. |
ResourceStatus
Field Name | Type | Description |
---|---|---|
id | STRING | A stack-unique ID for the resource. |
service | ResourceService | The REST API service that the resource operates on. One of: jobs, workspace, or dbfs. |
databricks_id | DatabricksId | The physical ID of the deployed resource. The actual schema depends on the type (service) of the resource. |
DatabricksId
A JSON object whose field depends on the service.
Service | Field in JSON | Type | Description |
---|---|---|---|
workspace | path | STRING | The absolute path of the notebook or directory in an Azure Databricks workspace. Naming is consistent with the Workspace API. |
jobs | job_id | STRING | The job ID as shown in an Azure Databricks workspace. This can be used to update jobs already deployed. |
dbfs | path | STRING | The absolute path of the file or directory in DBFS. Naming is consistent with the DBFS API. |