CLI (v2) batch deployment YAML schema

APPLIES TO: Azure CLI ml extension v2 (current)

The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json.

Note

The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed only to work with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.

YAML syntax

Key	Type	Description	Allowed values
`$schema`	string	The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including `$schema` at the top of your file enables you to invoke schema and resource completions.
`name`	string	Required. Name of the deployment.
`description`	string	Description of the deployment.
`tags`	object	Dictionary of tags for the deployment.
`endpoint_name`	string	Required. Name of the endpoint to create the deployment under.
`type`	string	Type of the batch deployment. Use `model` for model deployments and `pipeline` for pipeline component deployments. If not specified, a model deployment is assumed (see the tip that follows this table). New in version 1.7.	`model`, `pipeline`
`settings`	object	Configuration of the deployment. See the YAML reference for model deployments or for pipeline component deployments for the allowed values. New in version 1.7.

Tip

The `type` key was introduced in version 1.7 of the CLI extension. For backward compatibility, it defaults to `model`. However, if `type` is not set explicitly, the `settings` key is not enforced and all the model deployment settings must be placed in the root of the YAML specification.

YAML syntax for model deployments

When `type: model` is specified, the following syntax is enforced:

Key	Type	Description	Allowed values	Default value
`model`	string or object	Required. The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification. To reference an existing model, use the `azureml:<model-name>:<version>` syntax. To define a model inline, follow the Model schema. As a best practice for production scenarios, you should create the model separately and reference it here.
`code_configuration`	object	Configuration for the scoring code logic. This property is not required if your model is in MLflow format.
`code_configuration.code`	string	The local directory that contains all the Python source code to score the model.
`code_configuration.scoring_script`	string	The Python file in the `code_configuration.code` directory. This file must have an `init()` function and a `run()` function. Use the `init()` function for any costly or common preparation (for example, loading the model into memory). `init()` is called only once, at the beginning of the process. Use `run(mini_batch)` to score each entry; the value of `mini_batch` is a list of file paths. The `run()` function should return a pandas DataFrame or an array. Each returned element indicates one successful run of an input element in the `mini_batch`.
`environment`	string or object	The environment to use for the deployment. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification. This property is not required if your model is in MLflow format. To reference an existing environment, use the `azureml:<environment-name>:<environment-version>` syntax. To define an environment inline, follow the Environment schema. As a best practice for production scenarios, you should create the environment separately and reference it here.
`compute`	string	Required. Name of the compute target to execute the batch scoring jobs on. This value should be a reference to an existing compute in the workspace using the `azureml:<compute-name>` syntax.
`resources.instance_count`	integer	The number of nodes to use for each batch scoring job.		`1`
`settings`	object	Specific configuration of the model deployment. Changed in version 1.7.
`settings.max_concurrency_per_instance`	integer	The maximum number of parallel `scoring_script` runs per instance.		`1`
`settings.error_threshold`	integer	The number of file failures that should be ignored. If the error count for the entire input goes above this value, the batch scoring job is terminated. `error_threshold` is for the entire input and not for individual mini batches. If omitted, any number of file failures is allowed without terminating the job.		`-1`
`settings.logging_level`	string	The log verbosity level.	`warning`, `info`, `debug`	`info`
`settings.mini_batch_size`	integer	The number of files the `code_configuration.scoring_script` can process in one `run()` call.		`10`
`settings.retry_settings`	object	Retry settings for scoring each mini batch.
`settings.retry_settings.max_retries`	integer	The maximum number of retries for a failed or timed-out mini batch.		`3`
`settings.retry_settings.timeout`	integer	The timeout in seconds for scoring a single mini batch. Use larger values when the mini-batch size is bigger or the model is more expensive to run.		`30`
`settings.output_action`	string	Indicates how the output should be organized in the output file. Use `summary_only` if you generate the output files yourself, as described in Customize outputs in model deployments. Use `append_row` if you return predictions from the `run()` function's `return` statement.	`append_row`, `summary_only`	`append_row`
`settings.output_file_name`	string	Name of the batch scoring output file.		`predictions.csv`
`settings.environment_variables`	object	Dictionary of environment variable key-value pairs to set for each batch scoring job.
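
The `init()` and `run(mini_batch)` contract described for `code_configuration.scoring_script` can be sketched as follows. This is a minimal illustration, not a production scoring script: the "model" here is a hypothetical stand-in that returns each file's size in bytes, and the row format is an assumption chosen for the example. The sketch returns a list (one element per successfully processed file) rather than a pandas DataFrame so that it has no third-party dependencies.

```python
import os
from typing import Callable, List, Optional

# Hypothetical stand-in for a real model object. A real scoring script
# would load model files here instead of using a file-size function.
model: Optional[Callable[[str], int]] = None


def init() -> None:
    """Called once at the start of the process: do costly setup here."""
    global model
    # Assumption for illustration only: the "model" returns the size of
    # the input file in bytes.
    model = os.path.getsize


def run(mini_batch: List[str]) -> List[str]:
    """Score each file path in the mini batch.

    Returns one element per successfully processed input file.
    """
    results = []
    for file_path in mini_batch:
        prediction = model(file_path)
        # Hypothetical row format: "<file name>,<prediction>".
        results.append(f"{os.path.basename(file_path)},{prediction}")
    return results
```

With `output_action: append_row`, the elements returned by `run()` are what end up in the file named by `output_file_name`.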

Remarks

The az ml batch-deployment commands can be used for managing Azure Machine Learning batch deployments.

Examples

Examples are available in the examples GitHub repository. Some of them are referenced below:

YAML: MLflow model deployment

A model deployment containing an MLflow model, which doesn't require `code_configuration` or `environment`:

$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
endpoint_name: heart-classifier-batch
name: classifier-xgboost-mlflow
description: A heart condition classifier based on XGBoost
type: model
model: azureml:heart-classifier-mlflow@latest
compute: azureml:batch-cluster
resources:
  instance_count: 2
settings:
  max_concurrency_per_instance: 2
  mini_batch_size: 2
  output_action: append_row
  output_file_name: predictions.csv
  retry_settings:
    max_retries: 3
    timeout: 300
  error_threshold: -1
  logging_level: info
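
To see how `mini_batch_size`, `resources.instance_count`, and `max_concurrency_per_instance` interact, the following sketch works through the arithmetic for the deployment above. The input size of 100 files is a hypothetical figure, and the "waves" estimate assumes mini batches are spread evenly across workers, which is an idealization rather than a description of how the service schedules work.

```python
import math


def plan_batch_job(total_files, mini_batch_size, instance_count,
                   max_concurrency_per_instance):
    """Estimate how a batch job splits its input (idealized sketch)."""
    # Each run() call receives up to `mini_batch_size` file paths.
    mini_batches = math.ceil(total_files / mini_batch_size)
    # Upper bound on scoring_script runs executing at the same time.
    parallel_runs = instance_count * max_concurrency_per_instance
    # Rounds needed if every worker takes one mini batch per round.
    waves = math.ceil(mini_batches / parallel_runs)
    return mini_batches, parallel_runs, waves


# Values from the example above, with a hypothetical input of 100 files.
print(plan_batch_job(total_files=100, mini_batch_size=2,
                     instance_count=2, max_concurrency_per_instance=2))
# (50, 4, 13)
```

Under these assumptions, 100 files become 50 mini batches, at most 4 of which are scored at once.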

YAML: Custom model deployment with scoring script

A model deployment that specifies the scoring script and the environment to use:

$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
name: mnist-torch-dpl
description: A deployment using Torch to solve the MNIST classification dataset.
endpoint_name: mnist-batch
type: model
model:
  name: mnist-classifier-torch
  path: model
code_configuration:
  code: code
  scoring_script: batch_driver.py
environment:
  name: batch-torch-py38
  image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
  conda_file: environment/conda.yaml
compute: azureml:batch-cluster
resources:
  instance_count: 1
settings:
  max_concurrency_per_instance: 2
  mini_batch_size: 10
  output_action: append_row
  output_file_name: predictions.csv
  retry_settings:
    max_retries: 3
    timeout: 30
  error_threshold: -1
  logging_level: info
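
The example above sets `error_threshold: -1`. The sketch below illustrates one reading of the `settings.error_threshold` description: the job is terminated once the number of failed files across the entire input goes above the threshold, and `-1` (the default) means no limit. The function name is hypothetical and this is not the service's implementation, only a way to make the rule concrete.

```python
def should_terminate(failed_files: int, error_threshold: int = -1) -> bool:
    """Return True if the failure count exceeds the allowed threshold.

    Assumption: -1 means any number of file failures is tolerated.
    """
    if error_threshold == -1:
        return False
    # The count applies to the entire input, not to a single mini batch.
    return failed_files > error_threshold


print(should_terminate(failed_files=1000))                    # False
print(should_terminate(failed_files=5, error_threshold=5))    # False
print(should_terminate(failed_files=6, error_threshold=5))    # True
```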

YAML: Legacy model deployments

If the `type` attribute is not specified in the YAML, a model deployment is inferred. However, the `settings` key is not available in that case, and its properties must be placed in the root of the YAML, as shown in this example. Always specifying the `type` property is strongly recommended.

$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
endpoint_name: heart-classifier-batch
name: classifier-xgboost-mlflow
description: A heart condition classifier based on XGBoost
model: azureml:heart-classifier-mlflow@latest
compute: azureml:batch-cluster
resources:
  instance_count: 2
max_concurrency_per_instance: 2
mini_batch_size: 2
output_action: append_row
output_file_name: predictions.csv
retry_settings:
  max_retries: 3
  timeout: 300
error_threshold: -1
logging_level: info
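
Moving from the legacy layout to the typed layout amounts to adding `type: model` and nesting the root-level settings under `settings`. The following sketch shows that transformation on a deployment that has already been loaded into a Python dictionary (for example, with a YAML parser). The function name is hypothetical, and the list of keys is taken from the `settings.*` rows of the model deployment table above.

```python
import copy

# Keys that the model deployment table lists under `settings`.
SETTINGS_KEYS = (
    "max_concurrency_per_instance",
    "error_threshold",
    "logging_level",
    "mini_batch_size",
    "retry_settings",
    "output_action",
    "output_file_name",
    "environment_variables",
)


def to_typed_model_deployment(legacy: dict) -> dict:
    """Return a copy of a legacy deployment with `type` and `settings`."""
    if "type" in legacy:
        # Already uses the typed layout; return an independent copy.
        return copy.deepcopy(legacy)
    migrated, settings = {}, {}
    for key, value in legacy.items():
        target = settings if key in SETTINGS_KEYS else migrated
        target[key] = copy.deepcopy(value)
    migrated["type"] = "model"
    if settings:
        migrated["settings"] = settings
    return migrated


legacy = {
    "name": "classifier-xgboost-mlflow",
    "endpoint_name": "heart-classifier-batch",
    "mini_batch_size": 2,
    "retry_settings": {"max_retries": 3, "timeout": 300},
}
print(to_typed_model_deployment(legacy))
```

The input dictionary is left unchanged, so the legacy and typed forms can be compared side by side before the YAML file is rewritten.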

YAML: Pipeline component deployment

A simple pipeline component deployment:

$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
name: hello-batch-dpl
endpoint_name: hello-pipeline-batch
type: pipeline
component: azureml:hello_batch@latest
settings:
  default_compute: batch-cluster

Next steps

Install and use the CLI (v2)

Last updated on 2025-08-18