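Clusters CLI (legacy)
Important
This documentation has been retired and might not be updated.
This information applies to legacy Databricks CLI versions 0.18 and below. Databricks recommends that you use newer Databricks CLI version 0.205 or above instead. See What is the Databricks CLI?. To find your version of the Databricks CLI, run databricks -v.
To migrate from Databricks CLI version 0.18 or below to Databricks CLI version 0.205 or above, see Databricks CLI migration.
You run Databricks clusters CLI subcommands by appending them to databricks clusters. These subcommands call the Clusters API.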
databricks clusters -h
Usage: databricks clusters [OPTIONS] COMMAND [ARGS]...
Utility to interact with Databricks clusters.
Options:
-v, --version [VERSION]
-h, --help Show this message and exit.
Commands:
create Creates a Databricks cluster.
Options:
--json-file PATH File containing JSON request to POST to /api/2.0/clusters/create.
--json JSON JSON string to POST to /api/2.0/clusters/create.
delete Removes a Databricks cluster.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
edit Edits a Databricks cluster.
Options:
--json-file PATH File containing JSON request to POST to /api/2.0/clusters/edit.
--json JSON JSON string to POST to /api/2.0/clusters/edit.
events Gets events for a Spark cluster.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/#/setting/clusters/$CLUSTER_ID/configuration. [required]
--start-time TEXT The start time in epoch milliseconds. If
unprovided, returns events starting from the
beginning of time.
--end-time TEXT The end time in epoch milliseconds. If unprovided,
returns events up to the current time
--order TEXT The order to list events in; either ASC or DESC.
Defaults to DESC (most recent first).
--event-type TEXT An event types to filter on (specify multiple event
types by passing the --event-type option multiple
times). If empty, all event types are returned.
--offset TEXT The offset in the result set. Defaults to 0 (no
offset). When an offset is specified and the
results are requested in descending order, the
end_time field is required.
--limit TEXT The maximum number of events to include in a page
of events. Defaults to 50, and maximum allowed
value is 500.
--output FORMAT can be "JSON" or "TABLE". Set to TABLE by default.
get Retrieves metadata about a cluster.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
list Lists active and recently terminated clusters.
Options:
--output FORMAT JSON or TABLE. Set to TABLE by default.
list-node-types Lists node types for a cluster.
list-zones Lists zones where clusters can be created.
permanent-delete Permanently deletes a cluster.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
resize Resizes a Databricks cluster given its ID.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
--num-workers INTEGER Number of workers. [required]
restart Restarts a Databricks cluster.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
spark-versions Lists possible Databricks Runtime versions.
start Starts a terminated Databricks cluster.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
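The subcommand pattern above can be driven from a script. The following is a minimal sketch, assuming the legacy CLI is installed and on the PATH; the helper names build_clusters_argv and run_clusters_command are hypothetical and not part of the CLI. Only the argument list is built in the example, so nothing is sent to a workspace.

```python
import subprocess
from typing import Dict, List, Optional


def build_clusters_argv(subcommand: str, options: Optional[Dict[str, object]] = None) -> List[str]:
    """Build the argument list for `databricks clusters <subcommand>`.

    Option names are given without the leading dashes, for example
    {"cluster-id": "1234-567890-batch123"}.
    """
    argv = ["databricks", "clusters", subcommand]
    for flag, value in (options or {}).items():
        argv += [f"--{flag}", str(value)]
    return argv


def run_clusters_command(subcommand: str, options: Optional[Dict[str, object]] = None) -> str:
    """Run the command and return its standard output.

    Raises subprocess.CalledProcessError if the CLI exits with a non-zero status.
    """
    result = subprocess.run(
        build_clusters_argv(subcommand, options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Only build the arguments here; nothing is executed against a workspace.
print(build_clusters_argv("resize", {"cluster-id": "1234-567890-batch123", "num-workers": 10}))
```

Passing the arguments as a list avoids shell quoting issues when an option value contains spaces or special characters.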
Create a cluster
To display usage documentation, run databricks clusters create --help.
databricks clusters create --json-file create-cluster.json
create-cluster.json:
{
"cluster_name": "my-cluster",
"spark_version": "7.3.x-scala2.12",
"node_type_id": "Standard_D3_v2",
"spark_conf": {
"spark.speculation": true
},
"num_workers": 25
}
{
"cluster_id": "1234-567890-batch123"
}
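If the request file is generated rather than written by hand, a short script can produce it and read the cluster ID back from the response. This is a sketch under assumptions: the helper names write_create_request and parse_cluster_id are hypothetical, and the field values simply mirror the example above.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def write_create_request(path, cluster_name: str, spark_version: str, node_type_id: str,
                         num_workers: int, spark_conf: Optional[dict] = None) -> dict:
    """Write a JSON request file for `databricks clusters create --json-file`."""
    request = {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
    }
    if spark_conf:
        request["spark_conf"] = dict(spark_conf)
    Path(path).write_text(json.dumps(request, indent=2), encoding="utf-8")
    return request


def parse_cluster_id(create_output: str) -> str:
    """Extract cluster_id from the JSON printed by a successful create."""
    return json.loads(create_output)["cluster_id"]


# Write the example request to a temporary directory and show the result.
with tempfile.TemporaryDirectory() as tmp:
    request_path = Path(tmp) / "create-cluster.json"
    write_create_request(
        request_path, "my-cluster", "7.3.x-scala2.12", "Standard_D3_v2", 25,
        spark_conf={"spark.speculation": True},
    )
    print(request_path.read_text(encoding="utf-8"))

print(parse_cluster_id('{"cluster_id": "1234-567890-batch123"}'))
```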
Delete a cluster
To display usage documentation, run databricks clusters delete --help.
databricks clusters delete --cluster-id 1234-567890-batch123
If successful, no output is displayed.
Change a cluster's configuration
To display usage documentation, run databricks clusters edit --help.
databricks clusters edit --json-file edit-cluster.json
edit-cluster.json:
{
"cluster_id": "1234-567890-batch123",
"num_workers": 10,
"spark_version": "7.3.x-scala2.12",
"node_type_id": "Standard_D3_v2"
}
If successful, no output is displayed.
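One way to prepare an edit request is to start from the cluster's current description and override only what changes. The sketch below assumes the four keys shown in edit-cluster.json are the ones to carry over; whether the Clusters API needs additional fields for a given cluster is not covered here. The helper name build_edit_request is hypothetical.

```python
import json

# Keys copied from the current cluster description. This set mirrors the
# example edit-cluster.json and is an assumption, not a complete list.
CARRIED_FIELDS = ("cluster_id", "spark_version", "node_type_id", "num_workers")


def build_edit_request(cluster: dict, **changes) -> dict:
    """Return an edit request based on `cluster` with `changes` applied on top."""
    request = {key: cluster[key] for key in CARRIED_FIELDS if key in cluster}
    request.update(changes)
    return request


# A trimmed cluster description, as `databricks clusters get` might return it.
current = {
    "cluster_id": "1234-567890-batch123",
    "cluster_name": "my-cluster",
    "spark_version": "7.3.x-scala2.12",
    "node_type_id": "Standard_D3_v2",
    "num_workers": 25,
    "state": "RUNNING",
}

print(json.dumps(build_edit_request(current, num_workers=10), indent=2))
```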
List events for a cluster
To display usage documentation, run databricks clusters events --help.
databricks clusters events \
--cluster-id 1234-567890-batch123 \
--start-time 1617238800000 \
--end-time 1619485200000 \
--order DESC \
--limit 5 \
--event-type RUNNING \
--output JSON \
| jq .
{
"events": [
{
"cluster_id": "1234-567890-batch123",
"timestamp": 1619214150232,
"type": "RUNNING",
"details": {
"current_num_workers": 2,
"target_num_workers": 2
}
},
...
{
"cluster_id": "1234-567890-batch123",
"timestamp": 1617895221986,
"type": "RUNNING",
"details": {
"current_num_workers": 2,
"target_num_workers": 2
}
}
],
"next_page": {
"cluster_id": "1234-567890-batch123",
"start_time": 1617238800000,
"end_time": 1619485200000,
"order": "DESC",
"event_types": [
"RUNNING"
],
"offset": 5,
"limit": 5
},
"total_count": 11
}
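The --start-time and --end-time options take epoch milliseconds, and the next_page object in the response carries the values for the following page. The sketch below shows one way to compute the timestamps and to turn next_page into command-line options. The helper names to_epoch_ms and next_page_args are hypothetical; the option names come from the help text above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Scalar next_page fields and the option each one maps to.
FLAG_FOR_FIELD = [
    ("cluster_id", "--cluster-id"),
    ("start_time", "--start-time"),
    ("end_time", "--end-time"),
    ("order", "--order"),
    ("offset", "--offset"),
    ("limit", "--limit"),
]


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return (moment - EPOCH) // timedelta(milliseconds=1)


def next_page_args(next_page: dict) -> list:
    """Translate a next_page object into `databricks clusters events` options."""
    args = []
    for field, flag in FLAG_FOR_FIELD:
        if field in next_page:
            args += [flag, str(next_page[field])]
    # --event-type is repeated once per event type.
    for event_type in next_page.get("event_types", []):
        args += ["--event-type", event_type]
    return args


print(to_epoch_ms(datetime(2021, 4, 1, 1, 0, tzinfo=timezone.utc)))  # 1617238800000

page = {
    "cluster_id": "1234-567890-batch123",
    "start_time": 1617238800000,
    "end_time": 1619485200000,
    "order": "DESC",
    "event_types": ["RUNNING"],
    "offset": 5,
    "limit": 5,
}
print(next_page_args(page))
```

Because the help text states that end_time is required when an offset is combined with descending order, carrying end_time over from next_page keeps follow-up requests valid.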
Get information about a cluster
To display usage documentation, run databricks clusters get --help.
databricks clusters get --cluster-id 1234-567890-batch123
Or:
databricks clusters get --cluster-name my-cluster
{
"cluster_id": "1234-567890-batch123",
"spark_context_id": 3124308392469747564,
"cluster_name": "my-cluster",
"spark_version": "7.5.x-scala2.12",
"spark_conf": {
"spark.databricks.delta.preview.enabled": "true"
},
"node_type_id": "Standard_DS3_v2",
"driver_node_type_id": "Standard_DS3_v2",
"spark_env_vars": {
"PYSPARK_PYTHON": "/databricks/python3/bin/python3"
},
"autotermination_minutes": 0,
"enable_elastic_disk": true,
"disk_spec": {},
"cluster_source": "JOB",
"enable_local_disk_encryption": false,
"azure_attributes": {
"first_on_demand": 1,
"availability": "ON_DEMAND_AZURE",
"spot_bid_max_price": -1.0
},
"instance_source": {
"node_type_id": "Standard_DS3_v2"
},
"driver_instance_source": {
"node_type_id": "Standard_DS3_v2"
},
"state": "TERMINATED",
"state_message": "",
"start_time": 1619563745373,
"terminated_time": 1619563822867,
"last_state_loss_time": 0,
"num_workers": 8,
"default_tags": {
"Vendor": "Databricks",
"Creator": "someone@example.com",
"ClusterName": "my-cluster",
"ClusterId": "1234-567890-batch123",
"JobId": "1268284",
"RunName": "Normal job"
},
"creator_user_name": "someone@example.com",
"termination_reason": {
"code": "JOB_FINISHED",
"type": "SUCCESS"
},
"init_scripts_safe_mode": false
}
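The start_time and terminated_time values in this output are epoch milliseconds, so the run duration of a terminated cluster can be derived from them. A minimal sketch follows; the helper name run_duration_ms is hypothetical, and treating 0 or a missing value as "not available" is an assumption.

```python
import json
from typing import Optional


def run_duration_ms(cluster: dict) -> Optional[int]:
    """Return terminated_time minus start_time in milliseconds, or None if unknown."""
    start = cluster.get("start_time", 0)
    end = cluster.get("terminated_time", 0)
    if not start or not end or end < start:
        return None
    return end - start


# A trimmed copy of the `get` output shown above.
get_output = """
{
  "cluster_id": "1234-567890-batch123",
  "state": "TERMINATED",
  "start_time": 1619563745373,
  "terminated_time": 1619563822867
}
"""

cluster = json.loads(get_output)
print(cluster["state"], run_duration_ms(cluster))  # TERMINATED 77494
```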
List information about all available clusters
To display usage documentation, run databricks clusters list --help.
databricks clusters list --output JSON | jq .
{
"clusters": [
{
"cluster_id": "1234-567890-batch123",
"spark_context_id": 3124308392469747564,
"cluster_name": "my-cluster",
"spark_version": "7.5.x-scala2.12",
"spark_conf": {
"spark.databricks.delta.preview.enabled": "true"
},
"node_type_id": "Standard_DS3_v2",
"driver_node_type_id": "Standard_DS3_v2",
"spark_env_vars": {
"PYSPARK_PYTHON": "/databricks/python3/bin/python3"
},
"autotermination_minutes": 0,
"enable_elastic_disk": true,
"disk_spec": {},
"cluster_source": "JOB",
"enable_local_disk_encryption": false,
"azure_attributes": {
"first_on_demand": 1,
"availability": "ON_DEMAND_AZURE",
"spot_bid_max_price": -1.0
},
"instance_source": {
"node_type_id": "Standard_DS3_v2"
},
"driver_instance_source": {
"node_type_id": "Standard_DS3_v2"
},
"state": "TERMINATED",
"state_message": "",
"start_time": 1619563745373,
"terminated_time": 1619563822867,
"last_state_loss_time": 0,
"num_workers": 8,
"default_tags": {
"Vendor": "Databricks",
"Creator": "someone@example.com",
"ClusterName": "my-cluster",
"ClusterId": "1234-567890-batch123",
"JobId": "1268284",
"RunName": "Normal job"
},
"creator_user_name": "someone@example.com",
"termination_reason": {
"code": "JOB_FINISHED",
"type": "SUCCESS"
},
"init_scripts_safe_mode": false
},
...
]
}
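When the list is requested with --output JSON, it can be filtered in a script instead of with jq. The sketch below selects clusters by their state field; the helper name clusters_in_state is hypothetical and the second cluster in the sample data is made up for illustration.

```python
import json
from typing import List


def clusters_in_state(list_output: dict, state: str) -> List[dict]:
    """Return the clusters whose "state" equals `state`."""
    return [c for c in list_output.get("clusters", []) if c.get("state") == state]


# Trimmed sample in the shape of `databricks clusters list --output JSON`.
# The second entry is illustrative only.
sample = json.loads("""
{
  "clusters": [
    {"cluster_id": "1234-567890-batch123", "cluster_name": "my-cluster", "state": "TERMINATED"},
    {"cluster_id": "0000-000000-sample00", "cluster_name": "other-cluster", "state": "RUNNING"}
  ]
}
""")

for cluster in clusters_in_state(sample, "TERMINATED"):
    print(cluster["cluster_id"], cluster["cluster_name"])
```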
List available cluster node types
To display usage documentation, run databricks clusters list-node-types --help.
databricks clusters list-node-types
{
"node_types": [
{
"node_type_id": "Standard_L80s_v2",
"memory_mb": 655360,
"num_cores": 80.0,
"description": "Standard_L80s_v2",
"instance_type_id": "Standard_L80s_v2",
"is_deprecated": false,
"category": "Storage Optimized",
"support_ebs_volumes": true,
"support_cluster_tags": true,
"num_gpus": 0,
"node_instance_type": {
"instance_type_id": "Standard_L80s_v2",
"local_disks": 1,
"local_disk_size_gb": 800,
"instance_family": "Standard LSv2 Family vCPUs",
"local_nvme_disk_size_gb": 1788,
"local_nvme_disks": 10,
"swap_size": "10g"
},
"is_hidden": false,
"support_port_forwarding": true,
"display_order": 0,
"is_io_cache_enabled": true,
"node_info": {
"available_core_quota": 350.0,
"total_core_quota": 350.0
}
},
...
]
}
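The node type list can be long, so it is often useful to narrow it by size before choosing a node_type_id. The following sketch filters on the num_cores, memory_mb, and is_deprecated fields shown above; the helper name pick_node_types and the second sample entry are hypothetical.

```python
from typing import List


def pick_node_types(output: dict, min_cores: float, min_memory_mb: int) -> List[str]:
    """Return IDs of non-deprecated node types meeting the minimum size."""
    return [
        node["node_type_id"]
        for node in output.get("node_types", [])
        if not node.get("is_deprecated", False)
        and node.get("num_cores", 0) >= min_cores
        and node.get("memory_mb", 0) >= min_memory_mb
    ]


# Trimmed sample in the shape of `databricks clusters list-node-types`.
# The second entry uses made-up values for illustration.
sample_node_types = {
    "node_types": [
        {"node_type_id": "Standard_L80s_v2", "memory_mb": 655360, "num_cores": 80.0,
         "is_deprecated": False},
        {"node_type_id": "Example_Small", "memory_mb": 14336, "num_cores": 4.0,
         "is_deprecated": False},
    ]
}

print(pick_node_types(sample_node_types, min_cores=16, min_memory_mb=65536))
```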
List available zones for creating clusters
Note
This command does not apply to Azure Databricks.
To display usage documentation, run databricks clusters list-zones --help.
databricks clusters list-zones
Permanently delete a cluster
To display usage documentation, run databricks clusters permanent-delete --help.
databricks clusters permanent-delete --cluster-id 1234-567890-batch123
If successful, no output is displayed.
Resize a cluster
To display usage documentation, run databricks clusters resize --help.
databricks clusters resize --cluster-id 1234-567890-batch123 --num-workers 10
If successful, no output is displayed.
Restart a cluster
To display usage documentation, run databricks clusters restart --help.
databricks clusters restart --cluster-id 1234-567890-batch123
If successful, no output is displayed.
List available Spark runtime versions
To display usage documentation, run databricks clusters spark-versions --help.
databricks clusters spark-versions
{
"versions": [
{
"key": "8.2.x-scala2.12",
"name": "8.2 (includes Apache Spark 3.1.1, Scala 2.12)"
},
...
]
}
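The key values in this output are what the spark_version field of a create or edit request expects. As a rough sketch, the highest numbered runtime can be picked by parsing the leading major.minor part of each key. This assumes keys start with a pattern like 8.2.x; keys in other formats are skipped. The helper names parse_runtime_key and latest_runtime_key are hypothetical.

```python
import re
from typing import Optional, Tuple

VERSION_PREFIX = re.compile(r"(\d+)\.(\d+)\.x")


def parse_runtime_key(key: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) for keys such as "8.2.x-scala2.12", else None."""
    match = VERSION_PREFIX.match(key)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def latest_runtime_key(output: dict) -> Optional[str]:
    """Return the key with the highest (major, minor), or None if none parse."""
    parsed = [
        (parse_runtime_key(v["key"]), v["key"])
        for v in output.get("versions", [])
    ]
    candidates = [(version, key) for version, key in parsed if version is not None]
    if not candidates:
        return None
    return max(candidates)[1]


# Sample in the shape of `databricks clusters spark-versions`, limited to
# keys that appear elsewhere in this article.
sample_versions = {
    "versions": [
        {"key": "7.3.x-scala2.12", "name": "7.3"},
        {"key": "8.2.x-scala2.12", "name": "8.2 (includes Apache Spark 3.1.1, Scala 2.12)"},
    ]
}

print(latest_runtime_key(sample_versions))  # 8.2.x-scala2.12
```

Comparing parsed integers rather than raw strings avoids ordering mistakes such as "10.4" sorting before "8.2".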
Start a cluster
To display usage documentation, run databricks clusters start --help.
databricks clusters start --cluster-id 1234-567890-batch123
If successful, no output is displayed.