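CLI (v2) command job YAML schema

APPLIES TO: Azure CLI ml extension v2 (current)

The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/commandJob.schema.json.

Note

The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed to work only with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.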

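YAML syntax

Key	Type	Description	Allowed values	Default value
`$schema`	string	The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including `$schema` at the top of the file lets you invoke schema and resource completions.
`type`	const	The type of job.	`command`	`command`
`name`	string	Name of the job. Must be unique across all jobs in the workspace. If omitted, Azure Machine Learning autogenerates a GUID for the name.
`display_name`	string	Display name of the job in the studio UI. Doesn't need to be unique within the workspace. If omitted, Azure Machine Learning autogenerates a human-readable adjective-noun identifier for the display name.
`experiment_name`	string	Experiment name to organize the job under. Each job's run record is organized under the corresponding experiment in the studio's "Experiments" tab. If omitted, Azure Machine Learning defaults to the name of the working directory where the job was created.
`description`	string	Description of the job.
`tags`	object	Dictionary of tags for the job.
`command`	string	The command to execute.
`code`	string	Local path to the source code directory to be uploaded and used for the job.
`environment`	string or object	The environment to use for the job. Can be either a reference to an existing versioned environment in the workspace or an inline environment specification. To reference an existing environment, use the `azureml:<environment_name>:<environment_version>` syntax or `azureml:<environment_name>@latest` (to reference the latest version of the environment). To define an environment inline, follow the environment schema. Exclude the `name` and `version` properties, because inline environments don't support them. When you use curated environments in the CLI or SDK, curated environment names begin with `AzureML-`. In Azure Machine Learning studio, curated environment names don't have this prefix. The difference exists because the studio UI displays curated and custom environments on separate tabs, so the prefix isn't needed. The CLI and SDK have no such separation, so the prefix distinguishes curated environments from custom ones.
`environment_variables`	object	Dictionary of environment variable key-value pairs to set on the process where the command is executed.
`distribution`	object	The distribution configuration for distributed training scenarios. One of MpiConfiguration, PyTorchConfiguration, TensorFlowConfiguration, or RayConfiguration.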
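`compute`	string	Name of the compute target to execute the job on. Can be either a reference to an existing compute in the workspace (using the `azureml:<compute_name>` syntax) or `local` to designate local execution. Note: jobs in a pipeline don't support `local` as the `compute` value.		`local`
`resources.instance_count`	integer	The number of nodes to use for the job.		`1`
`resources.instance_type`	string	The instance type to use for the job. Applies to jobs running on Azure Arc-enabled Kubernetes compute (where the compute target specified in the `compute` field is of `type: kubernetes`). If omitted, defaults to the default instance type for the Kubernetes cluster. For more information, see "Create and manage instance types".
`resources.shm_size`	string	The size of the Docker container's shared memory block. Must be in the format `<number><unit>`, where the number is greater than 0 and the unit is one of `b` (bytes), `k` (kilobytes), `m` (megabytes), or `g` (gigabytes).		`2g`
`resources.docker_args`	string	Extra arguments to pass to the Docker `run` command.
`resources.locations`	array	List of regional locations where the job is allowed to run.
`resources.max_instance_count`	integer	The maximum number of nodes to use for the job (for elastic distributed training).
`limits.timeout`	integer	The maximum time in seconds that the job is allowed to run. When this limit is reached, the system cancels the job.
`inputs`	object	Dictionary of inputs to the job. The key is a name for the input within the context of the job, and the value is the input value. Inputs can be referenced in the `command` with the `${{ inputs.<input_name> }}` expression.
`inputs.<input_name>`	number, integer, boolean, string, or object	Either a literal value (of type number, integer, boolean, or string) or an object containing a job input data specification.
`outputs`	object	Dictionary of output configurations of the job. The key is a name for the output within the context of the job, and the value is the output configuration. Outputs can be referenced in the `command` with the `${{ outputs.<output_name> }}` expression.
`outputs.<output_name>`	object	You can leave the object empty, in which case the output is of type `uri_folder` and Azure Machine Learning generates an output location for it. Files in the output directory are written via read-write mount. To specify a different mode for the output, provide an object containing the job output specification.
`queue_settings`	object	Queue settings for the job. Configures the job tier and scheduling priority. See Queue settings.
`services`	object	Dictionary of interactive job services (endpoints). Supported service types: `ssh`, `tensor_board`, `vs_code`, `jupyter_lab`.
`identity`	object	The identity used for data access. Can be UserIdentityConfiguration, ManagedIdentityConfiguration, AMLTokenIdentityConfiguration, or None. With UserIdentityConfiguration, the identity of the job submitter is used to access input data and write results to the output folder; otherwise, the managed identity of the compute target is used.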
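
To illustrate how several of the keys above fit together, the following sketch sets a larger shared memory block, a one-hour timeout, and an interactive JupyterLab service. The compute name `cpu-cluster`, the service name `my_jupyter_lab`, the `sleep` command, and the specific values are assumptions chosen for the example, not defaults from the schema. The service entry uses only the `type` key listed above; other service settings are not covered here.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: sleep 30m              # placeholder that keeps the job alive; illustrative only
environment:
  image: library/python:latest
compute: azureml:cpu-cluster    # assumed compute name
resources:
  instance_count: 1
  shm_size: 4g                  # <number><unit>; the schema default is 2g
limits:
  timeout: 3600                 # seconds; the job is canceled when this limit is reached
services:
  my_jupyter_lab:               # hypothetical service name
    type: jupyter_lab
```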

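Distribution configurations

MpiConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. Distribution type.	`mpi`
`process_count_per_instance`	integer	Required. The number of processes per node to launch for the job.

PyTorchConfiguration

Key	Type	Description	Allowed values	Default value
`type`	const	Required. Distribution type.	`pytorch`
`process_count_per_instance`	integer	The number of processes per node to launch for the job.		`1`

TensorFlowConfiguration

Key	Type	Description	Allowed values	Default value
`type`	const	Required. Distribution type.	`tensorflow`
`worker_count`	integer	The number of workers to launch for the job.		Defaults to `resources.instance_count`.
`parameter_server_count`	integer	The number of parameter servers to launch for the job.		`0`

RayConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. Distribution type.	`ray`
`address`	string	The address of an existing Ray cluster to connect to. If omitted, Azure Machine Learning starts a new Ray cluster.
`port`	integer	The port of the head Ray process.
`dashboard_port`	integer	The port of the Ray dashboard process.
`include_dashboard`	boolean	Whether to start the Ray dashboard.
`head_node_additional_args`	string	Extra arguments passed to `ray start` on the head node.
`worker_node_additional_args`	string	Extra arguments passed to `ray start` on the worker nodes.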
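
The examples later in this article cover PyTorch, TensorFlow, and MPI but not Ray, so here is a minimal sketch of a Ray distribution block built only from the keys in the table above. The script `train.py`, the environment name `my-ray-env`, and the compute name `cpu-cluster` are hypothetical, and the port numbers are Ray's usual defaults, included only to show where the values go.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: src
command: python train.py                 # hypothetical script that uses Ray
environment: azureml:my-ray-env@latest   # hypothetical environment with Ray installed
compute: azureml:cpu-cluster             # assumed compute name
resources:
  instance_count: 2
distribution:
  type: ray
  port: 6379                # head Ray process port (assumed value)
  include_dashboard: true
  dashboard_port: 8265      # Ray dashboard port (assumed value)
```

Because `address` is omitted, Azure Machine Learning starts a new Ray cluster for the job, as described in the table.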

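Job inputs

Key	Type	Description	Allowed values	Default value
`type`	string	The type of job input. Specify `uri_file` for input data that points to a single file source, or `uri_folder` for input data that points to a folder source.	`uri_file`, `uri_folder`, `mlflow_model`, `custom_model`	`uri_folder`
`path`	string	The path to the data to use as input. It can be specified in a few ways: - A local path to the data source file or folder, for example `path: ./iris.csv`. The data is uploaded during job submission. - A URI of a cloud path to the file or folder to use as the input. Supported URI types are `azureml`, `https`, `wasbs`, `abfss`, and `adl`. For more information on how to use the `azureml://` URI format, see the core YAML syntax documentation. - An existing registered Azure Machine Learning data asset to use as the input. To reference a registered data asset, use the `azureml:<data_name>:<data_version>` syntax or `azureml:<data_name>@latest` (to reference the latest version of the data asset), for example `path: azureml:cifar10-data:1` or `path: azureml:cifar10-data@latest`.
`mode`	string	Mode of how the data is delivered to the compute target. For read-only mount (`ro_mount`), the data is consumed as a mount path: a folder is mounted as a folder and a file is mounted as a file. Azure Machine Learning resolves the input to the mount path. For `download` mode, the data is downloaded to the compute target, and Azure Machine Learning resolves the input to the downloaded path. If you want only the URL of the storage location of the data artifacts, rather than mounting or downloading the data itself, use `direct` mode. This mode passes in the URL of the storage location as the job input. In that case, you're fully responsible for handling credentials to access the storage. The `eval_mount` and `eval_download` modes are unique to MLTable, and either mount the data as a path or download the data to the compute target. For more information, see Access data in a job.	`ro_mount`, `download`, `direct`, `eval_download`, `eval_mount`	`ro_mount`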
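
As a small illustration of the `type`, `path`, and `mode` keys together, the fragment below declares one file input that is downloaded to the compute target and one folder input that is mounted read-only. The input names `iris_csv` and `cifar` are arbitrary; the paths reuse values that appear elsewhere in this article.

```yaml
inputs:
  iris_csv:
    type: uri_file
    path: azureml://datastores/workspaceblobstore/paths/example-data/iris.csv
    mode: download        # copy the file to the compute target before the command runs
  cifar:
    type: uri_folder
    path: azureml:cifar10-data@latest   # registered data asset, latest version
    mode: ro_mount        # the default; stated explicitly here for clarity
```

In the `command`, `${{inputs.iris_csv}}` would then resolve to the downloaded file path and `${{inputs.cifar}}` to the mount path.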

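Job outputs

Key	Type	Description	Allowed values	Default value
`type`	string	The type of job output. For the default `uri_folder` type, the output corresponds to a folder.	`uri_folder`, `mlflow_model`, `custom_model`	`uri_folder`
`mode`	string	Mode of how output files are delivered to the destination storage. For read-write mount mode (`rw_mount`), the output directory is a mounted directory. For upload mode, the files written are uploaded at the end of the job.	`rw_mount`, `upload`	`rw_mount`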
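
A minimal sketch of a named output that uses `upload` instead of the default `rw_mount`, so that files are transferred at the end of the job rather than written through a mount. The output name `hello_output` is arbitrary.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world" > ${{outputs.hello_output}}/helloworld.txt
outputs:
  hello_output:
    type: uri_folder
    mode: upload          # files are uploaded when the job finishes
environment:
  image: library/python:latest
```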

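Identity configurations

UserIdentityConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. Identity type.	`user_identity`

ManagedIdentityConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. Identity type.	`managed` or `managed_identity`

AMLTokenIdentityConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. Identity type. The job uses the workspace's Azure Machine Learning token for data access.	`aml_token`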
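
For illustration, the following sketch runs a job that reads its input with the identity of the person who submits the job. The compute name `cpu-cluster` is an assumption; the datastore path reuses a value from the examples in this article.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: ls ${{inputs.data_dir}}
inputs:
  data_dir:
    type: uri_folder
    path: azureml://datastores/workspaceblobstore/paths/example-data/
identity:
  type: user_identity     # data access uses the job submitter's identity
environment:
  image: library/python:latest
compute: azureml:cpu-cluster   # assumed compute name
```

Replacing the value with `managed_identity` or `aml_token` selects one of the other two configurations listed above.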

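Queue settings

Key	Type	Description	Allowed values	Default value
`job_tier`	string	The job tier. `spot` uses lower-cost preemptible compute.	`spot`, `basic`, `standard`, `premium`
`priority`	string	The scheduling priority within the selected tier.	`low`, `medium`, `high`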
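
A fragment showing where these two keys sit in a command job file. The chosen values are only an example of a low-cost, low-priority configuration; the rest of the job definition is omitted.

```yaml
queue_settings:
  job_tier: spot      # lower-cost preemptible compute
  priority: low       # scheduling priority within the spot tier
```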

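Remarks

The `az ml job` command can be used to manage Azure Machine Learning jobs.

Examples

Examples are available in the examples GitHub repository. The following sections show several of them.

YAML: hello world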

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world"
environment:
  image: library/python:latest

YAML: display name, experiment name, description, and tags

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world"
environment:
  image: library/python:latest
tags:
  hello: world
display_name: hello-world-example
experiment_name: hello-world-example
description: |
  # Azure Machine Learning "hello world" job

  This is a "hello world" job running in the cloud via Azure Machine Learning!

  ## Description

  Markdown is supported in the studio for job descriptions! You can edit the description there or via CLI.

YAML: environment variables

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo $hello_env_var
environment:
  image: library/python:latest
environment_variables:
  hello_env_var: "hello world"

YAML: source code

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: ls
code: src
environment:
  image: library/python:latest

YAML: literal inputs

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  echo ${{inputs.hello_string}}
  echo ${{inputs.hello_number}}
environment:
  image: library/python:latest
inputs:
  hello_string: "hello world"
  hello_number: 42

YAML: write to default outputs

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world" > ./outputs/helloworld.txt
environment:
  image: library/python:latest

YAML: write to named data output

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world" > ${{outputs.hello_output}}/helloworld.txt
outputs:
  hello_output:
environment:
  image: python

YAML: datastore URI file input

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  echo "--iris-csv: ${{inputs.iris_csv}}"
  python hello-iris.py --iris-csv ${{inputs.iris_csv}}
code: src
inputs:
  iris_csv:
    type: uri_file 
    path: azureml://datastores/workspaceblobstore/paths/example-data/iris.csv
environment: azureml://registries/azureml/environments/sklearn-1.0/labels/latest

YAML: datastore URI folder input

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  ls ${{inputs.data_dir}}
  echo "--iris-csv: ${{inputs.data_dir}}/iris.csv"
  python hello-iris.py --iris-csv ${{inputs.data_dir}}/iris.csv
code: src
inputs:
  data_dir:
    type: uri_folder 
    path: azureml://datastores/workspaceblobstore/paths/example-data/
environment: azureml://registries/azureml/environments/sklearn-1.0/labels/latest

YAML: URI file input

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  echo "--iris-csv: ${{inputs.iris_csv}}"
  python hello-iris.py --iris-csv ${{inputs.iris_csv}}
code: src
inputs:
  iris_csv:
    type: uri_file 
    path: https://azuremlexamples.blob.core.chinacloudapi.cn/datasets/iris.csv
environment: azureml://registries/azureml/environments/sklearn-1.0/labels/latest

YAML: URI folder input

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  ls ${{inputs.data_dir}}
  echo "--iris-csv: ${{inputs.data_dir}}/iris.csv"
  python hello-iris.py --iris-csv ${{inputs.data_dir}}/iris.csv
code: src
inputs:
  data_dir:
    type: uri_folder 
    path: wasbs://datasets@azuremlexamples.blob.core.chinacloudapi.cn/
environment: azureml://registries/azureml/environments/sklearn-1.0/labels/latest

YAML: notebook via papermill

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  pip install ipykernel papermill
  papermill hello-notebook.ipynb outputs/out.ipynb -k python
code: src
environment:
  image: library/python:3.11.6

YAML: basic Python model training

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: src
command: >-
  python main.py 
  --iris-csv ${{inputs.iris_csv}}
  --C ${{inputs.C}}
  --kernel ${{inputs.kernel}}
  --coef0 ${{inputs.coef0}}
inputs:
  iris_csv: 
    type: uri_file
    path: wasbs://datasets@azuremlexamples.blob.core.chinacloudapi.cn/iris.csv
  C: 0.8
  kernel: "rbf"
  coef0: 0.1
environment: azureml://registries/azureml/environments/sklearn-1.0/labels/latest
compute: azureml:cpu-cluster
display_name: sklearn-iris-example
experiment_name: sklearn-iris-example
description: Train a scikit-learn SVM on the Iris dataset.

YAML: basic R model training with local Docker build context

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: >
  Rscript train.R 
  --data_folder ${{inputs.iris}}
code: src
inputs:
  iris: 
    type: uri_file
    path: https://azuremlexamples.blob.core.chinacloudapi.cn/datasets/iris.csv
environment:
  build:
    path: docker-context
compute: azureml:cpu-cluster
display_name: r-iris-example
experiment_name: r-iris-example
description: Train an R model on the Iris dataset.

YAML: distributed PyTorch

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: src
command: >-
  python train.py
  --epochs ${{inputs.epochs}}
  --learning-rate ${{inputs.learning_rate}}
  --data-dir ${{inputs.cifar}}
inputs:
  epochs: 1
  learning_rate: 0.2
  cifar:
    type: uri_folder
    path: azureml:cifar-10-example@latest
environment: azureml:AzureML-acpt-pytorch-1.13-cuda11.7@latest
compute: azureml:gpu-cluster
distribution:
  type: pytorch
  process_count_per_instance: 1
resources:
  instance_count: 2
display_name: pytorch-cifar-distributed-example
experiment_name: pytorch-cifar-distributed-example
description: Train a basic convolutional neural network (CNN) with PyTorch on the CIFAR-10 dataset, distributed via PyTorch.

YAML: distributed TensorFlow

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: src
command: >-
  python train.py
  --epochs ${{inputs.epochs}}
  --model-dir ${{inputs.model_dir}}
inputs:
  epochs: 1
  model_dir: outputs/keras-model
environment: azureml:AzureML-tensorflow-2.12-cuda11@latest
compute: azureml:gpu-cluster
resources:
  instance_count: 2
distribution:
  type: tensorflow
  worker_count: 2
display_name: tensorflow-mnist-distributed-example
experiment_name: tensorflow-mnist-distributed-example
description: Train a basic neural network with TensorFlow on the MNIST dataset, distributed via TensorFlow.

YAML: distributed MPI

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: src
command: >-
  python train.py
  --epochs ${{inputs.epochs}}
inputs:
  epochs: 1
environment: azureml:AzureML-tensorflow-2.12-cuda11@latest
compute: azureml:gpu-cluster
resources:
  instance_count: 2
distribution:
  type: mpi
  process_count_per_instance: 1
display_name: tensorflow-mnist-distributed-horovod-example
experiment_name: tensorflow-mnist-distributed-horovod-example
description: Train a basic neural network with TensorFlow on the MNIST dataset, distributed via Horovod.

Next steps

Install and use the CLI (v2)

Last updated on 2026-04-22