CLI (v2) command component YAML schema
APPLIES TO: Azure CLI ml extension v2 (current)
The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json.
Note
The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed only to work with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.
YAML syntax
Key | Type | Description | Allowed values | Default value |
---|---|---|---|---|
$schema | string | The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including $schema at the top of your file enables you to invoke schema and resource completions. | | |
type | const | The type of component. | command | command |
name | string | Required. Name of the component. Must start with lowercase letter. Allowed characters are lowercase letters, numbers, and underscore(_). Maximum length is 255 characters. | | |
version | string | Version of the component. If omitted, Azure Machine Learning will autogenerate a version. | | |
display_name | string | Display name of the component in the studio UI. Can be non-unique within the workspace. | | |
description | string | Description of the component. | | |
tags | object | Dictionary of tags for the component. | | |
is_deterministic | boolean | This option determines if the component will produce the same output for the same input data. You should usually set this to false for components that load data from external sources, such as importing data from a URL. This is because the data at the URL might change over time. | | true |
command | string | Required. The command to execute. | | |
code | string | Local path to the source code directory to be uploaded and used for the component. | | |
environment | string or object | Required. The environment to use for the component. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification. To reference an existing environment, use the azureml:<environment-name>:<environment-version> syntax. To define an environment inline, follow the Environment schema. Exclude the name and version properties as they aren't supported for inline environments. | | |
distribution | object | The distribution configuration for distributed training scenarios. One of MpiConfiguration, PyTorchConfiguration, or TensorFlowConfiguration. | | |
resources.instance_count | integer | The number of nodes to use for the job. | | 1 |
inputs | object | Dictionary of component inputs. The key is a name for the input within the context of the component and the value is the component input definition. Inputs can be referenced in the command using the ${{ inputs.<input_name> }} expression. | | |
inputs.<input_name> | object | The component input definition. See Component input for the set of configurable properties. | | |
outputs | object | Dictionary of component outputs. The key is a name for the output within the context of the component and the value is the component output definition. Outputs can be referenced in the command using the ${{ outputs.<output_name> }} expression. | | |
outputs.<output_name> | object | The component output definition. See Component output for the set of configurable properties. | | |
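The two forms of the environment key described in the table can be sketched as follows. This is only an illustration: the environment name my-training-env and version 1 are hypothetical placeholders, and the inline form simply reuses the image: python value that appears in the hello world example in this article.

```yaml
# Option 1: reference an existing versioned environment in the workspace.
# "my-training-env" and version "1" are hypothetical placeholders.
environment: azureml:my-training-env:1

# Option 2: define the environment inline (no name or version properties).
# Commented out here because a component takes a single environment key.
# environment:
#   image: python
```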
Distribution configurations
MpiConfiguration
Key | Type | Description | Allowed values |
---|---|---|---|
type | const | Required. Distribution type. | mpi |
process_count_per_instance | integer | Required. The number of processes per node to launch for the job. | |
PyTorchConfiguration
Key | Type | Description | Allowed values | Default value |
---|---|---|---|---|
type | const | Required. Distribution type. | pytorch | |
process_count_per_instance | integer | The number of processes per node to launch for the job. | | 1 |
TensorFlowConfiguration
Key | Type | Description | Allowed values | Default value |
---|---|---|---|---|
type | const | Required. Distribution type. | tensorflow | |
worker_count | integer | The number of workers to launch for the job. | | Defaults to resources.instance_count. |
parameter_server_count | integer | The number of parameter servers to launch for the job. | | 0 |
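As a rough illustration of how a distribution configuration combines with resources.instance_count, the fragment below sketches the relevant part of a command component that uses PyTorchConfiguration. The node and process counts are assumed values chosen for the example, not recommendations.

```yaml
# Fragment of a command component definition (other keys omitted).
# Assumed values: 2 nodes, 4 processes launched on each node.
distribution:
  type: pytorch
  process_count_per_instance: 4
resources:
  instance_count: 2
```

Switching to MPI or TensorFlow would mean changing type and using the keys listed in the corresponding table above.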
Component input
Key | Type | Description | Allowed values | Default value |
---|---|---|---|---|
type | string | Required. The type of component input. Learn more about data access. | number, integer, boolean, string, uri_file, uri_folder, mltable, mlflow_model | |
description | string | Description of the input. | | |
default | number, integer, boolean, or string | The default value for the input. | | |
optional | boolean | Whether the input is optional. If set to true, you need to wrap the part of the command that references this input in $[[]]. | | false |
min | integer or number | The minimum accepted value for the input. This field can only be specified if the type field is number or integer. | | |
max | integer or number | The maximum accepted value for the input. This field can only be specified if the type field is number or integer. | | |
enum | array | The list of allowed values for the input. Only applicable if the type field is string. | | |
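The constraint keys in the table (default, min, max, enum, optional) might be combined as in the sketch below. All input names and values here are hypothetical and chosen only to show where each key goes.

```yaml
# Fragment of a command component definition (other keys omitted).
# Input names and values are hypothetical.
inputs:
  learning_rate:
    type: number          # min and max apply to number and integer inputs
    description: Step size for the optimizer.
    default: 0.01
    min: 0.0001
    max: 1.0
  batch_size:
    type: integer
    default: 32
    min: 1
  optimizer:
    type: string          # enum applies to string inputs only
    default: adam
    enum:
      - adam
      - sgd
    optional: true        # reference it in the command inside $[[ ]]
```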
Component output
Key | Type | Description | Allowed values | Default value |
---|---|---|---|---|
type | string | Required. The type of component output. | uri_file, uri_folder, mltable, mlflow_model | |
description | string | Description of the output. | | |
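A minimal sketch of declaring outputs and passing them to the command is shown below. The output names, the train.py script, and its flags are hypothetical; only the type values and the ${{outputs.<output_name>}} expression come from this article.

```yaml
# Fragment of a command component definition (other keys omitted).
# Output names, script name, and flags are hypothetical.
outputs:
  trained_model:
    type: mlflow_model
    description: Model saved in MLflow format.
  metrics_dir:
    type: uri_folder
command: >-
  python train.py
  --model_dir ${{outputs.trained_model}}
  --metrics_dir ${{outputs.metrics_dir}}
```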
Remarks
The az ml component commands can be used for managing Azure Machine Learning components.
Examples
Examples are available in the examples GitHub repository. Several are shown below.
YAML: Hello world command component
$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
type: command
name: hello_python_world
display_name: Hello_Python_World
version: 1
code: ./src
environment:
  image: python
command: >-
  python hello.py
YAML: Component with different input types
$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
name: train_data_component_cli
display_name: train_data
description: A example train component
tags:
  author: azureml-sdk-team
version: 9
type: command
inputs:
  training_data:
    type: uri_folder
  max_epocs:
    type: integer
    optional: true
  learning_rate:
    type: number
    default: 0.01
    optional: true
  learning_rate_schedule:
    type: string
    default: time-based
    optional: true
outputs:
  model_output:
    type: uri_folder
code: ./train_src
environment: azureml://registries/azureml/environments/sklearn-1.0/labels/latest
command: >-
  python train.py
  --training_data ${{inputs.training_data}}
  $[[--max_epocs ${{inputs.max_epocs}}]]
  $[[--learning_rate ${{inputs.learning_rate}}]]
  $[[--learning_rate_schedule ${{inputs.learning_rate_schedule}}]]
  --model_output ${{outputs.model_output}}
Define optional inputs in command line
When an input is set as optional = true, you need to use $[[]] to wrap the part of the command line that references it, for example $[[--input1 ${{inputs.input1}}]]. The command line at runtime may have different inputs.
- If you specify only the required training_data and model_output parameters, the command line will look like:
python train.py --training_data some_input_path --learning_rate 0.01 --learning_rate_schedule time-based --model_output some_output_path
If no value is specified at runtime, learning_rate and learning_rate_schedule will use the default value. Because max_epocs has no default, its flag is omitted from the command line.
- If all inputs/outputs provide values during runtime, the command line will look like:
python train.py --training_data some_input_path --max_epocs 10 --learning_rate 0.01 --learning_rate_schedule time-based --model_output some_output_path
Common errors and recommendations
The following table lists some common errors you might encounter when you define a component, along with the recommended fixes.
Key | Errors | Recommendation |
---|---|---|
command | 1. Only optional inputs can be in $[[]]. 2. Using \ to make a new line isn't supported in command. 3. Inputs or outputs aren't found. | 1. Check that all the inputs or outputs used in command are already defined in the inputs and outputs sections, and use the correct format for optional inputs $[[]] or required ones ${{}}. 2. Don't use \ to make a new line. |
environment | 1. No definition exists for environment {envName} version {envVersion}. 2. No environment exists for name {envName}, version {envVersion}. 3. Couldn't find asset with ID {envAssetId}. | 1. Make sure the environment name and version you refer to in the component definition exist. 2. You need to specify the version if you refer to a registered environment. |
inputs/outputs | 1. Inputs/outputs names conflict with system reserved parameters. 2. Duplicated names of inputs or outputs. | 1. Don't use any of these reserved parameters as your inputs/outputs name: path, ld_library_path, user, logname, home, pwd, shell. 2. Make sure names of inputs and outputs aren't duplicated. |