sync command group
Note
This information applies to Databricks CLI versions 0.205 and above. The Databricks CLI is in Public Preview.
Databricks CLI use is subject to the Databricks License and Databricks Privacy Notice, including any Usage Data provisions.
The sync command group within the Databricks CLI enables one-way synchronization of file changes from a local filesystem directory to a directory within a remote Azure Databricks workspace.
Note
sync commands cannot synchronize file changes from a directory within a remote Azure Databricks workspace back to a directory within a local filesystem.
sync commands can synchronize file changes from a local development machine only to workspace user (/Users) files in your Azure Databricks workspace. They cannot synchronize to DBFS (dbfs:/) files. To synchronize file changes from a local development machine to DBFS (dbfs:/) in your Azure Databricks workspace, use the dbx sync utility.
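The path restriction in the note above can be checked before a sync is attempted. The following is a minimal sketch, not part of the Databricks CLI: is_workspace_user_path is a hypothetical helper name, and it assumes that a supported target is any path beginning with /Users/.

```shell
# Hypothetical helper (not a Databricks CLI command): succeed only when the
# remote path is a workspace user path, that is, when it starts with /Users/.
is_workspace_user_path() {
  case "$1" in
    /Users/*) return 0 ;;
    *)        return 1 ;;
  esac
}

remote="/Users/someone@example.com/"
if is_workspace_user_path "$remote"; then
  echo "ok: $remote"
else
  echo "unsupported sync target: $remote" >&2
fi
```

A target such as dbfs:/tmp/my-folder fails this check, which matches the guidance to use the dbx sync utility for DBFS.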
You run sync commands by appending them to databricks sync. To display help for the sync command, run databricks sync -h.
Incrementally sync local file changes to a remote directory
To perform a single, incremental, one-way synchronization of file changes from a local filesystem directory to a directory within a remote Azure Databricks workspace, run the sync command as follows:
databricks sync <local-directory-path> <remote-directory-path>
For example, to do a one-time, one-way, incremental synchronization of all file changes in the folder named my-folder within the local current working directory to a specific path within the remote workspace, run the following command:
databricks sync ./my-folder/ /Users/someone@example.com/
In this example, only file changes since the last run of the sync command are synchronized to /Users/someone@example.com/. By default, the workspace URL within the caller's DEFAULT profile is used to determine the remote workspace to sync to.
Fully sync local file changes to a remote directory
To perform a single, full, one-way synchronization of file changes from a local filesystem directory to a directory within a remote Azure Databricks workspace, regardless of when the last sync command was run, use the --full option, for example:
databricks sync ./my-folder/ /Users/someone@example.com/ --full
Continuously sync local file changes to a remote directory
To turn on continuous, one-way synchronization of file changes from a local filesystem directory to a directory within a remote Azure Databricks workspace, use the --watch option, for example:
databricks sync ./my-folder/ /Users/someone@example.com/ --watch
One-way synchronization continues until the command is stopped from the terminal, typically by pressing Ctrl + c or Ctrl + z. In most Unix-like shells, Ctrl + c ends the command, whereas Ctrl + z only suspends it.
Polling for possible synchronization events happens once per second by default. To change this interval, use the --interval option with the number of seconds between polls followed by the character s. For example, to poll every five seconds:
databricks sync ./my-folder/ /Users/someone@example.com/ --watch --interval 5s
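Because the interval value is a number of seconds followed by s, a script that takes the interval as a plain number needs to append the suffix itself. The following is a minimal sketch under that assumption; watch_sync_command is a hypothetical helper name, and it only prints the command rather than running the CLI.

```shell
# Hypothetical helper (not a Databricks CLI command): print a continuous-sync
# command for an interval given as a whole number of seconds.
# $1 = local directory, $2 = remote directory, $3 = interval in seconds
watch_sync_command() {
  case "$3" in
    ''|*[!0-9]*)
      # Reject empty values and anything that is not digits only.
      echo "interval must be a whole number of seconds" >&2
      return 1
      ;;
  esac
  # "%ss" is the interval followed by the literal character s.
  printf 'databricks sync %s %s --watch --interval %ss\n' "$1" "$2" "$3"
}

watch_sync_command ./my-folder/ /Users/someone@example.com/ 5
```

Printing the command first makes it possible to review it before running it, for instance by copying it into the terminal.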
Change the sync progress output format
Sync progress information is output to the terminal in text format by default. To specify the sync progress output format, use the --output option, specifying either text (the default, if --output is not otherwise specified) or json, for example:
databricks sync ./my-folder/ /Users/someone@example.com/ --output json
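Since only text and json are described as valid values for --output, a wrapper script might validate the format before building the command. The following is a minimal sketch; sync_command_with_output is a hypothetical helper name, and it only prints the command rather than running the CLI.

```shell
# Hypothetical helper (not a Databricks CLI command): print a sync command
# with an explicit --output value, accepting only text or json.
# $1 = local directory, $2 = remote directory, $3 = format (defaults to text)
sync_command_with_output() {
  format="${3:-text}"
  case "$format" in
    text|json) ;;
    *)
      echo "unsupported output format: $format" >&2
      return 1
      ;;
  esac
  printf 'databricks sync %s %s --output %s\n' "$1" "$2" "$format"
}

sync_command_with_output ./my-folder/ /Users/someone@example.com/ json
```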