Image processing with batch model deployments

2025-07-31

APPLIES TO: Azure CLI ml extension v2 (current) Python SDK azure-ai-ml v2 (current)

You can use batch model deployments for processing tabular data, but also any other file types, like images. Those deployments are supported in both MLflow and custom models. In this article, you learn how to deploy a model that classifies images according to the ImageNet taxonomy.

Prerequisites

An Azure subscription. If you don't have an Azure subscription, create a Trial before you begin.
An Azure Machine Learning workspace. To create a workspace, see Manage Azure Machine Learning workspaces.
The following permissions in the Azure Machine Learning workspace:
- For creating or managing batch endpoints and deployments: Use an Owner, Contributor, or custom role that has been assigned the Microsoft.MachineLearningServices/workspaces/batchEndpoints/* permissions.
- For creating Azure Resource Manager deployments in the workspace resource group: Use an Owner, Contributor, or custom role that has been assigned the Microsoft.Resources/deployments/write permission in the resource group where the workspace is deployed.
The Azure Machine Learning CLI or the Azure Machine Learning SDK for Python:
- Azure CLI
- Python
Run the following command to install the Azure CLI and the ml extension for Azure Machine Learning:
```
az extension add -n ml
```
Pipeline component deployments for batch endpoints are introduced in version 2.7 of the ml extension for the Azure CLI. Use the az extension update --name ml command to get the latest version.
Run the following command to install the Azure Machine Learning SDK for Python:
```
pip install azure-ai-ml
```
The ModelBatchDeployment and PipelineComponentBatchDeployment classes are introduced in version 1.7.0 of the SDK. Use the pip install -U azure-ai-ml command to get the latest version.

Connect to your workspace

The workspace is the top-level resource for Azure Machine Learning. It provides a centralized place to work with all artifacts you create when you use Azure Machine Learning. In this section, you connect to the workspace where you perform your deployment tasks.

Azure CLI
Python

In the following command, enter your subscription ID, workspace name, resource group name, and location:

az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>

Import the required libraries:

from azure.ai.ml import MLClient, Input, load_component
from azure.ai.ml.entities import BatchEndpoint, ModelBatchDeployment, ModelBatchDeploymentSettings, PipelineComponentBatchDeployment, Model, AmlCompute, Data, BatchRetrySettings, CodeConfiguration, Environment, Data
from azure.ai.ml.constants import AssetTypes, BatchDeploymentOutputAction
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential

Configure the workspace details and get a handle to the workspace:

In the following command, enter your subscription ID, resource group name, and workspace name:

subscription_id = "<subscription>"
resource_group = "<resource-group>"
workspace = "<workspace>"

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)

About this sample

This article uses a model that was built using TensorFlow along with the RestNet architecture. For more information, see Identity Mappings in Deep Residual Networks. You can download a sample of this model. The model has the following constraints:

It works with images of size 244x244 (tensors of (224, 224, 3)).
It requires inputs to be scaled to the range [0,1].

The information in this article is based on code samples contained in the azureml-examples repository. To run the commands locally without having to copy/paste YAML and other files, clone the repo. Change directories to cli/endpoints/batch/deploy-models/imagenet-classifier if you're using the Azure CLI or sdk/python/endpoints/batch/deploy-models/imagenet-classifier if you're using the SDK for Python.

git clone https://github.com/Azure/azureml-examples --depth 1
cd azureml-examples/cli/endpoints/batch/deploy-models/imagenet-classifier

Follow along in Jupyter Notebooks

You can follow along this sample in a Jupyter Notebook. In the cloned repository, open the notebook: imagenet-classifier-batch.ipynb.

Image classification with batch deployments

In this example, you learn how to deploy a deep learning model that can classify a given image according to the taxonomy of ImageNet.

Create the endpoint

Create the endpoint that hosts the model:

Azure CLI
Python

Specify the name of the endpoint.

ENDPOINT_NAME="imagenet-classifier-batch"

Create the following YAML file to define the batch endpoint, named endpoint.yml:

$schema: https://azuremlschemas.azureedge.net/latest/batchEndpoint.schema.json
name: imagenet-classifier-batch
description: A batch endpoint for performing image classification using a TFHub model ImageNet model.
auth_mode: aad_token

To create the endpoint, run the following code:

az ml batch-endpoint create --file endpoint.yml  --name $ENDPOINT_NAME

Specify the name of the endpoint.

endpoint_name="imagenet-classifier-batch"

Configure the endpoint.

endpoint = BatchEndpoint(
    name=endpoint_name,
    description="An batch service to perform ImageNet image classification",
)

To create the endpoint, run the following code:

ml_client.batch_endpoints.begin_create_or_update(endpoint)

Register the model

Model deployments can only deploy registered models. You need to register the model. You can skip this step if the model you're trying to deploy is already registered.

Download a copy of the model.

Azure CLI
Python

wget https://azuremlexampledata.blob.core.chinacloudapi.cn/data/imagenet/model.zip
mkdir -p imagenet-classifier
unzip model.zip -d imagenet-classifier

import os
import urllib.request
from zipfile import ZipFile

response = urllib.request.urlretrieve('https://azuremlexampledata.blob.core.chinacloudapi.cn/data/imagenet/model.zip', 'model.zip')

os.mkdirs("imagenet-classifier", exits_ok=True)
with ZipFile(response[0], 'r') as zip:
  model_path = zip.extractall(path="imagenet-classifier")

Azure CLI
Python

MODEL_NAME='imagenet-classifier'
az ml model create --name $MODEL_NAME --path "model"

model_name = 'imagenet-classifier'
model = ml_client.models.create_or_update(
    Model(name=model_name, path=model_path, type=AssetTypes.CUSTOM_MODEL)
)

Create a scoring script

Create a scoring script that can read the images provided by the batch deployment and return the scores of the model.

The init method loads the model using the keras module in tensorflow.
The run method runs for each mini-batch the batch deployment provides.
The run method reads one image of the file at a time.
The run method resizes the images to the expected sizes for the model.
The run method rescales the images to the range [0,1] domain, which is what the model expects.
The script returns the classes and the probabilities associated with the predictions.

This code is the code/score-by-file/batch_driver.py file:

import os
import numpy as np
import pandas as pd
import tensorflow as tf
from os.path import basename
from PIL import Image
from tensorflow.keras.models import load_model


def init():
    global model
    global input_width
    global input_height

    # AZUREML_MODEL_DIR is an environment variable created during deployment
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model")

    # load the model
    model = load_model(model_path)
    input_width = 244
    input_height = 244


def run(mini_batch):
    results = []

    for image in mini_batch:
        data = Image.open(image).resize(
            (input_width, input_height)
        )  # Read and resize the image
        data = np.array(data) / 255.0  # Normalize
        data_batch = tf.expand_dims(
            data, axis=0
        )  # create a batch of size (1, 244, 244, 3)

        # perform inference
        pred = model.predict(data_batch)

        # Compute probabilities, classes and labels
        pred_prob = tf.math.reduce_max(tf.math.softmax(pred, axis=-1)).numpy()
        pred_class = tf.math.argmax(pred, axis=-1).numpy()

        results.append([basename(image), pred_class[0], pred_prob])

    return pd.DataFrame(results)

Tip

Although images are provided in mini-batches by the deployment, this scoring script processes one image at a time. This is a common pattern because trying to load the entire batch and send it to the model at once might result in high-memory pressure on the batch executor (OOM exceptions).

There are certain cases where doing so enables high throughput in the scoring task. This is the case for batch deployments over GPU hardware where you want to achieve high GPU utilization. For a scoring script that takes advantage of this approach, see High throughput deployments.

Note

If you want to deploy a generative model, which generates files, learn how to author a scoring script: Customize outputs in batch deployments.

Create the deployment

After you create the scoring script, create a batch deployment for it. Use the following procedure:

Ensure that you have a compute cluster created where you can create the deployment. In this example, use a compute cluster named gpu-cluster. Although not required, using GPUs speeds up the processing.
Indicate over which environment to run the deployment. In this example, the model runs on TensorFlow. Azure Machine Learning already has an environment with the required software installed, so you can reuse this environment. You need to add a couple of dependencies in a conda.yml file.
- Azure CLI
- Python
The environment definition is included in the deployment file.
```
compute: azureml:gpu-cluster
environment:
  name: tensorflow27-cuda11-gpu
  image: mcr.microsoft.com/azureml/curated/tensorflow-2.7-ubuntu20.04-py38-cuda11-gpu:latest
```
Get a reference to the environment.
```
environment = Environment(
    name="tensorflow27-cuda11-gpu",
    conda_file="environment/conda.yml",
    image="mcr.microsoft.com/azureml/curated/tensorflow-2.7-ubuntu20.04-py38-cuda11-gpu:latest",
)
```

Create the deployment.

Azure CLI
Python

To create a new deployment under the created endpoint, create a YAML configuration like the following example. For other properties, see the full batch endpoint YAML schema.

$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
endpoint_name: imagenet-classifier-batch
name: imagenet-classifier-resnetv2
description: A ResNetV2 model architecture for performing ImageNet classification in batch
type: model
model: azureml:imagenet-classifier@latest
compute: azureml:gpu-cluster
environment:
  name: tensorflow27-cuda11-gpu
  image: mcr.microsoft.com/azureml/curated/tensorflow-2.7-ubuntu20.04-py38-cuda11-gpu:latest
  conda_file: environment/conda.yaml
code_configuration:
  code: code/score-by-file
  scoring_script: batch_driver.py
resources:
  instance_count: 2
settings:
  max_concurrency_per_instance: 1
  mini_batch_size: 5
  output_action: append_row
  output_file_name: predictions.csv
  retry_settings:
    max_retries: 3
    timeout: 300
 error_threshold: -1
 logging_level: info

Create the deployment with the following command:

DEPLOYMENT_NAME="imagenet-classifier-resnetv2"
az ml batch-deployment create -f deployment.yml

To create a new deployment with the indicated environment and scoring script, use the following code:

deployment = BatchDeployment(
    name="imagenet-classifier-resnetv2",
    description="A ResNetV2 model architecture for performing ImageNet classification in batch",
    endpoint_name=endpoint.name,
    model=model,
    environment=environment,
    code_configuration=CodeConfiguration(
        code="code/score-by-file",
        scoring_script="batch_driver.py",
    ),
    compute=compute_name,
    instance_count=2,
    max_concurrency_per_instance=1,
    mini_batch_size=10,
    output_action=BatchDeploymentOutputAction.APPEND_ROW,
    output_file_name="predictions.csv",
    retry_settings=BatchRetrySettings(max_retries=3, timeout=300),
    logging_level="info",
)

Create the deployment with the following command:

ml_client.batch_deployments.begin_create_or_update(deployment)

Although you can invoke a specific deployment inside of an endpoint, you usually want to invoke the endpoint itself, and let the endpoint decide which deployment to use. Such deployment is called the default deployment.

This approach lets you change the default deployment and change the model serving the deployment without changing the contract with the user invoking the endpoint. Use the following code to update the default deployment:
- Azure Machine Learning CLI
- Azure Machine Learning SDK for Python
```
az ml batch-endpoint update --name $ENDPOINT_NAME --set defaults.deployment_name=$DEPLOYMENT_NAME
```
```
endpoint.defaults.deployment_name = deployment.name
ml_client.batch_endpoints.begin_create_or_update(endpoint)
```

Your batch endpoint is ready to be used.

Test the deployment

For testing the endpoint, use a sample of 1,000 images from the original ImageNet dataset. Batch endpoints can only process data that is located in the cloud and that is accessible from the Azure Machine Learning workspace. Upload it to an Azure Machine Learning data store. Create a data asset that can be used to invoke the endpoint for scoring.

Note

Batch endpoints accept data that can be placed in multiple types of locations.

Download the associated sample data.

Azure CLI
Python

wget https://azuremlexampledata.blob.core.chinacloudapi.cn/data/imagenet/imagenet-1000.zip
unzip imagenet-1000.zip -d data

Note

If you don't have wget installed locally, install it or use a browser to get the .zip file.

!wget https://azuremlexampledata.blob.core.chinacloudapi.cn/data/imagenet-1000.zip
!unzip imagenet-1000.zip -d data

Create the data asset from the data downloaded.

Azure CLI
Python

Create a data asset definition in a YAML file called imagenet-sample-unlabeled.yml:

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: imagenet-sample-unlabeled
description: A sample of 1000 images from the original ImageNet dataset. Download content from https://azuremlexampledata.blob.core.chinacloudapi.cn/data/imagenet-1000.zip.
type: uri_folder
path: data

Create the data asset.

az ml data create -f imagenet-sample-unlabeled.yml

Specify these values:

data_path = "data"
dataset_name = "imagenet-sample-unlabeled"

imagenet_sample = Data(
    path=data_path,
    type=AssetTypes.URI_FOLDER,
    description="A sample of 1000 images from the original ImageNet dataset",
    name=dataset_name,
)

Create the data asset.

ml_client.data.create_or_update(imagenet_sample)

To get the newly created data asset, use this code:

imagenet_sample = ml_client.data.get(dataset_name, label="latest")

When the data is uploaded and ready to be used, invoke the endpoint.
- Azure CLI
- Python
```
JOB_NAME = $(az ml batch-endpoint invoke --name $ENDPOINT_NAME --input azureml:imagenet-sample-unlabeled@latest | jq -r '.name')
```
Note

If the utility jq isn't installed, see Download jq.
Tip

What's the difference between the inputs and input parameter when you invoke an endpoint?

In general, you can use a dictionary inputs = {} parameter with the invoke method to provide an arbitrary number of required inputs to a batch endpoint that contains a model deployment or a pipeline deployment.

For a model deployment, you can use the input parameter as a shorter way to specify the input data location for the deployment. This approach works because a model deployment always takes only one data input.
```
input = Input(type=AssetTypes.URI_FOLDER, path=imagenet_sample.id)
job = ml_client.batch_endpoints.invoke(
   endpoint_name=endpoint.name,
   input=input,
)
```

Tip

You don't indicate the deployment name in the invoke operation. That's because the endpoint automatically routes the job to the default deployment. Since the endpoint only has one deployment, that one is the default. You can target an specific deployment by indicating the argument/parameter deployment_name.

A batch job starts as soon as the command returns. You can monitor the status of the job until it finishes.
- Azure CLI
- Python
```
az ml job show --name $JOB_NAME
```
```
ml_client.jobs.get(job.name)
```

After the deployment finishes, download the predictions.

Azure CLI
Python

To download the predictions, use the following command:

az ml job download --name $JOB_NAME --output-name score --download-path ./

ml_client.jobs.download(name=job.name, output_name='score', download_path='./')

The predictions look like the following output. The predictions are combined with the labels for the convenience of the reader. To learn more about how to achieve this effect, see the associated notebook.

import pandas as pd
score = pd.read_csv("named-outputs/score/predictions.csv", header=None,  names=['file', 'class', 'probabilities'], sep=' ')
score['label'] = score['class'].apply(lambda pred: imagenet_labels[pred])
score

file	class	probabilities	label
n02088094_Afghan_hound.JPEG	161	0.994745	Afghan hound
n02088238_basset	162	0.999397	basset
n02088364_beagle.JPEG	165	0.366914	bluetick
n02088466_bloodhound.JPEG	164	0.926464	bloodhound
...	...	...	...

High throughput deployments

As mentioned before, the deployment processes one image a time, even when the batch deployment is providing a batch of them. In most cases, this approach is best. It simplifies how the models run and avoids any possible out-of-memory problems. However, in certain other cases, you might want to saturate as much as possible the underlying hardware. This situation is the case GPUs, for instance.

On those cases, you might want to do inference on the entire batch of data. That approach implies loading the entire set of images to memory and sending them directly to the model. The following example uses TensorFlow to read batch of images and score them all at once. It also uses TensorFlow ops to do any data preprocessing. The entire pipeline happens on the same device being used (CPU/GPU).

Warning

Some models have a non-linear relationship with the size of the inputs in terms of the memory consumption. To avoid out-of-memory exceptions, batch again (as done in this example) or decrease the size of the batches created by the batch deployment.

Create the scoring script code/score-by-batch/batch_driver.py:

import os
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import load_model


def init():
    global model
    global input_width
    global input_height

    # AZUREML_MODEL_DIR is an environment variable created during deployment
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model")

    # load the model
    model = load_model(model_path)
    input_width = 244
    input_height = 244


def decode_img(file_path):
    file = tf.io.read_file(file_path)
    img = tf.io.decode_jpeg(file, channels=3)
    img = tf.image.resize(img, [input_width, input_height])
    return img / 255.0


def run(mini_batch):
    images_ds = tf.data.Dataset.from_tensor_slices(mini_batch)
    images_ds = images_ds.map(decode_img).batch(64)

    # perform inference
    pred = model.predict(images_ds)

    # Compute probabilities, classes and labels
    pred_prob = tf.math.reduce_max(tf.math.softmax(pred, axis=-1)).numpy()
    pred_class = tf.math.argmax(pred, axis=-1).numpy()

    return pd.DataFrame(
        [mini_batch, pred_prob, pred_class], columns=["file", "probability", "class"]
    )

This script constructs a tensor dataset from the mini-batch sent by the batch deployment. This dataset is preprocessed to obtain the expected tensors for the model using the map operation with the function decode_img.
The dataset is batched again (16) to send the data to the model. Use this parameter to control how much information you can load into memory and send to the model at once. If running on a GPU, you need to carefully tune this parameter to achieve the maximum usage of the GPU just before getting an OOM exception.
After predictions are computed, the tensors are converted to numpy.ndarray.

Create the deployment.

Azure CLI
Python

To create a new deployment under the created endpoint, create a YAML configuration like the following example. For other properties, see the full batch endpoint YAML schema.

$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
endpoint_name: imagenet-classifier-batch
name: imagenet-classifier-resnetv2
description: A ResNetV2 model architecture for performing ImageNet classification in batch
type: model
model: azureml:imagenet-classifier@latest
compute: azureml:gpu-cluster
environment:
  name: tensorflow27-cuda11-gpu
  image: mcr.microsoft.com/azureml/curated/tensorflow-2.7-ubuntu20.04-py38-cuda11-gpu:latest
  conda_file: environment/conda.yaml
code_configuration:
  code: code/score-by-batch
  scoring_script: batch_driver.py
resources:
  instance_count: 2
tags:
  device_acceleration: CUDA
  device_batching: 16
settings:
  max_concurrency_per_instance: 1
  mini_batch_size: 5
  output_action: append_row
  output_file_name: predictions.csv
  retry_settings:
    max_retries: 3
    timeout: 300
  error_threshold: -1
  logging_level: info

Create the deployment with the following command:

az ml batch-deployment create --file deployment-by-batch.yml --endpoint-name $ENDPOINT_NAME --default

To create a new deployment with the indicated environment and scoring script, use the following code:

deployment = BatchDeployment(
    name="imagenet-classifier-resnetv2",
    description="A ResNetV2 model architecture for performing ImageNet classification in batch",
    endpoint_name=endpoint.name,
    model=model,
    environment=environment,
    code_configuration=CodeConfiguration(
        code="code/score-by-batch",
        scoring_script="batch_driver.py",
    ),
    compute=compute_name,
    instance_count=2,
    tags={ "device_acceleration": "CUDA", "device_batching": "16" }
    max_concurrency_per_instance=1,
    mini_batch_size=10,
    output_action=BatchDeploymentOutputAction.APPEND_ROW,
    output_file_name="predictions.csv",
    retry_settings=BatchRetrySettings(max_retries=3, timeout=300),
    logging_level="info",
)

Create the deployment with the following command:

ml_client.batch_deployments.begin_create_or_update(deployment)

You can use this new deployment with the sample data shown before. Remember that to invoke this deployment either indicate the name of the deployment in the invocation method or set it as the default one.

Considerations for MLflow models processing images

MLflow models in Batch Endpoints support reading images as input data. Since MLflow deployments don't require a scoring script, have the following considerations when using them:

Image files supported include: .png, .jpg, .jpeg, .tiff, .bmp, and .gif.
MLflow models should expect to receive a np.ndarray as input that matches the dimensions of the input image. In order to support multiple image sizes on each batch, the batch executor invokes the MLflow model once per image file.
MLflow models are highly encouraged to include a signature. If they do, it must be of type TensorSpec. Inputs are reshaped to match tensor's shape if available. If no signature is available, tensors of type np.uint8 are inferred.
For models that include a signature and are expected to handle variable size of images, include a signature that can guarantee it. For instance, the following signature example allows batches of 3 channeled images.

import numpy as np
import mlflow
from mlflow.models.signature import ModelSignature
from mlflow.types.schema import Schema, TensorSpec

input_schema = Schema([
  TensorSpec(np.dtype(np.uint8), (-1, -1, -1, 3)),
])
signature = ModelSignature(inputs=input_schema)

(...)

mlflow.<flavor>.log_model(..., signature=signature)

You can find a working example in the Jupyter notebook imagenet-classifier-mlflow.ipynb. For more information about how to use MLflow models in batch deployments, see Using MLflow models in batch deployments.

Image processing with batch model deployments

Prerequisites

Connect to your workspace

About this sample

Follow along in Jupyter Notebooks

Image classification with batch deployments

Create the endpoint

Register the model

Create a scoring script

Create the deployment

Test the deployment

High throughput deployments

Considerations for MLflow models processing images

Next steps

Additional resources