Progressive rollout of MLflow models to Online Endpoints

In this article, you learn how you can progressively update and deploy MLflow models to Online Endpoints without causing service disruption. You use blue-green deployment, also known as a safe rollout strategy, to introduce a new version of a web service to production. This strategy will allow you to roll out your new version of the web service to a small subset of users or requests before rolling it out completely.

About this example

Online Endpoints have the concept of Endpoint and Deployment. An endpoint represents the API that customers use to consume the model, while the deployment indicates the specific implementation of that API. This distinction allows users to decouple the API from the implementation and to change the underlying implementation without affecting the consumer. This example will use such concepts to update the deployed model in endpoints without introducing service disruption.

The model we will deploy is based on the UCI Heart Disease Data Set. The database contains 76 attributes, but we are using a subset of 14 of them. The model tries to predict the presence of heart disease in a patient. The prediction is integer valued: 0 (no presence) or 1 (presence). The model has been trained using an XGBoost classifier, and all the required preprocessing has been packaged as a scikit-learn pipeline, making this model an end-to-end pipeline that goes from raw data to predictions.

The information in this article is based on code samples contained in the azureml-examples repository. To run the commands locally without having to copy/paste files, clone the repo, and then change directories to sdk/using-mlflow/deploy.

Follow along in Jupyter Notebooks

You can follow along with this sample in a notebook. In the cloned repository, open the notebook: mlflow_sdk_online_endpoints_progresive.ipynb.

Prerequisites

Before following the steps in this article, make sure you have the following prerequisites:

  • An Azure subscription. If you don't have an Azure subscription, create a trial subscription before you begin. Try the trial subscription.
  • Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure Machine Learning. To perform the steps in this article, your user account must be assigned the owner or contributor role for the Azure Machine Learning workspace, or a custom role allowing Microsoft.MachineLearningServices/workspaces/onlineEndpoints/*. For more information, see Manage access to an Azure Machine Learning workspace.

Additionally, you will need the Azure CLI with the ml extension installed, because the commands in this article use az ml.

Connect to your workspace

First, let's connect to the Azure Machine Learning workspace where we are going to work.

az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>

Registering the model in the registry

Ensure your model is registered in the Azure Machine Learning registry. Deployment of unregistered models is not supported in Azure Machine Learning. You can register a new model using the Azure CLI:

MODEL_NAME='heart-classifier'
az ml model create --name $MODEL_NAME --type "mlflow_model" --path "model"

Create an online endpoint

Online endpoints are endpoints that are used for online (real-time) inferencing. Online endpoints contain deployments that are ready to receive data from clients and can send responses back in real time.

We are going to exploit this functionality by deploying multiple versions of the same model under the same endpoint. However, the new deployment will receive 0% of the traffic at the beginning. Once we are sure that the new model works correctly, we are going to progressively move traffic from one deployment to the other.

  1. Endpoints require a name, which needs to be unique in the same region. Let's create one that is unlikely to exist already:

    ENDPOINT_SUFIX=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w ${1:-5} | head -n 1)
    ENDPOINT_NAME="heart-classifier-$ENDPOINT_SUFIX"
    
  2. Configure the endpoint

    endpoint.yml

    $schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
    name: heart-classifier-edp
    auth_mode: key
    
  3. Create the endpoint:

    az ml online-endpoint create -n $ENDPOINT_NAME -f endpoint.yml
    
  4. Get the authentication secret for the endpoint. Because the endpoint uses auth_mode: key, the credentials contain a primaryKey and a secondaryKey:

    ENDPOINT_SECRET_KEY=$(az ml online-endpoint get-credentials -n $ENDPOINT_NAME | jq -r ".primaryKey")
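The suffix command in step 1 reads an unbounded stream from /dev/urandom, and on some systems (macOS, for instance) tr can fail with an "Illegal byte sequence" error on raw bytes. The following is a minimal sketch of a more portable variant, assuming a POSIX-like shell with head -c available. The function name random_suffix is hypothetical and not part of the original sample.

```shell
# Hypothetical helper: print a random alphanumeric suffix of the given length
# (default 5). It reads a bounded chunk of random bytes, keeps only letters
# and digits, and cuts the result to the requested length. LC_ALL=C makes tr
# treat the input as raw bytes instead of locale-encoded text.
random_suffix() {
    length="${1:-5}"
    head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'a-zA-Z0-9' | cut -c "1-$length"
}

# Same variable names as in the steps above.
ENDPOINT_SUFIX=$(random_suffix 5)
ENDPOINT_NAME="heart-classifier-$ENDPOINT_SUFIX"
echo "$ENDPOINT_NAME"
```

Reading a fixed 1024 bytes avoids closing the pipe early, which matters if the script runs with pipefail enabled.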
    

Create a blue deployment

So far, the endpoint is empty. There are no deployments on it. Let's create the first one by deploying the same model we were working on before. We will call this deployment "default", representing our "blue deployment".

  1. Configure the deployment

    blue-deployment.yml

    $schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
    name: default
    endpoint_name: heart-classifier-edp
    model: azureml:heart-classifier@latest
    instance_type: Standard_DS2_v2
    instance_count: 1
    
  2. Create the deployment

    az ml online-deployment create --endpoint-name $ENDPOINT_NAME -f blue-deployment.yml --all-traffic
    

    If your endpoint doesn't have egress connectivity, use model packaging (preview) by including the flag --with-package:

    az ml online-deployment create --with-package --endpoint-name $ENDPOINT_NAME -f blue-deployment.yml --all-traffic
    

    Tip

    We set the flag --all-traffic in the create command, which will assign all the traffic to the new deployment.

  3. Assign all the traffic to the deployment

    A new deployment receives none of the endpoint's traffic until traffic is assigned to it.

    This step is not required in the Azure CLI because we used the --all-traffic flag during creation.

  4. Update the endpoint configuration:

    This step is not required in the Azure CLI because we used the --all-traffic flag during creation.

  5. Create a sample input to test the deployment

    sample.json

    {
        "input_data": {
            "columns": [
                "age",
                "sex",
                "cp",
                "trestbps",
                "chol",
                "fbs",
                "restecg",
                "thalach",
                "exang",
                "oldpeak",
                "slope",
                "ca",
                "thal"
            ],
            "data": [
                [ 48, 0, 3, 130, 275, 0, 0, 139, 0, 0.2, 1, 0, "normal" ]
            ]
        }
    }
    
  6. Test the deployment

    az ml online-endpoint invoke --name $ENDPOINT_NAME --request-file sample.json
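Clients that don't use the Azure CLI call the endpoint over REST, using the key stored earlier in ENDPOINT_SECRET_KEY. The following is a sketch of such a call, assuming curl is installed. The function name score_request and the DRY_RUN switch are hypothetical; the switch is there so the request can be inspected without sending anything.

```shell
# Hypothetical helper: send (or, with DRY_RUN=1, only print) a scoring request.
# Arguments: scoring URI, endpoint key, request file, optional deployment name.
score_request() {
    uri="$1"; key="$2"; file="$3"; deployment="${4:-}"

    # Reuse the positional parameters as the curl argument list.
    set -- --silent --show-error --request POST "$uri" \
        --header "Authorization: Bearer $key" \
        --header "Content-Type: application/json" \
        --data "@$file"

    # The azureml-model-deployment header routes the request to one specific
    # deployment instead of following the endpoint's traffic rules.
    if [ -n "$deployment" ]; then
        set -- "$@" --header "azureml-model-deployment: $deployment"
    fi

    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'curl %s\n' "$*"
    else
        curl "$@"
    fi
}

# Dry run with placeholder values; nothing is sent over the network.
DRY_RUN=1
score_request "https://example.invalid/score" "<key>" sample.json

# For a real call, unset DRY_RUN and read the scoring URI from the endpoint:
#   SCORING_URI=$(az ml online-endpoint show -n $ENDPOINT_NAME --query scoring_uri -o tsv)
#   score_request "$SCORING_URI" "$ENDPOINT_SECRET_KEY" sample.json
```

Passing a deployment name as the fourth argument is the REST counterpart of the --deployment-name flag of az ml online-endpoint invoke.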
    

Create a green deployment under the endpoint

Let's imagine that the development team has created a new version of the model and it is ready for production. We can first try out this model and, once we are confident, update the endpoint to route the traffic to it.

  1. Register a new model version

    MODEL_NAME='heart-classifier'
    az ml model create --name $MODEL_NAME --type "mlflow_model" --path "model"
    

    Let's get the version number of the new model:

    VERSION=$(az ml model show -n heart-classifier --label latest | jq -r ".version")
    
  2. Configure a new deployment

    green-deployment.yml

    $schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
    name: xgboost-model
    endpoint_name: heart-classifier-edp
    model: azureml:heart-classifier@latest
    instance_type: Standard_DS2_v2
    instance_count: 1
    

    We will name the deployment as follows:

    GREEN_DEPLOYMENT_NAME="xgboost-model-$VERSION"
    
  3. Create the new deployment

    az ml online-deployment create -n $GREEN_DEPLOYMENT_NAME --endpoint-name $ENDPOINT_NAME -f green-deployment.yml
    

    If your endpoint doesn't have egress connectivity, use model packaging (preview) by including the flag --with-package:

    az ml online-deployment create --with-package -n $GREEN_DEPLOYMENT_NAME --endpoint-name $ENDPOINT_NAME -f green-deployment.yml
    
  4. Test the deployment without changing traffic

    az ml online-endpoint invoke --name $ENDPOINT_NAME --deployment-name $GREEN_DEPLOYMENT_NAME --request-file sample.json
    

    Tip

    Notice that we now specify the name of the deployment we want to invoke.
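The deployment name above is built from $VERSION, which comes from jq. Note that jq -r prints the literal string null when the version field is missing, and the command substitution yields an empty string if the az command fails, so the name can end up wrong without any error. The following is a minimal sketch of a guard. It assumes auto-incremented integer versions, as produced by az ml model create when no explicit version is given; green_deployment_name is a hypothetical helper and the sample values are illustrative.

```shell
# Hypothetical helper: build the green deployment name from a model version.
# Assumption: versions are plain integers. Empty or non-numeric values, such
# as the "null" printed by jq -r for a missing field, are rejected.
green_deployment_name() {
    version="$1"
    case "$version" in
        ''|*[!0-9]*)
            echo "unexpected model version: '$version'" >&2
            return 1
            ;;
    esac
    printf 'xgboost-model-%s\n' "$version"
}

# A valid version produces a name; an invalid one stops the script early.
green_deployment_name 2
green_deployment_name null || echo "refusing to continue"
```

In the walkthrough, this would be used as GREEN_DEPLOYMENT_NAME=$(green_deployment_name "$VERSION") || exit 1.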

Progressively update the traffic

Once we are confident in the new deployment, we can update the traffic to route some of it to the new deployment. Traffic is configured at the endpoint level:

  1. Configure the traffic:

    This step is not required in the Azure CLI.

  2. Update the endpoint

    az ml online-endpoint update --name $ENDPOINT_NAME --traffic "default=90 $GREEN_DEPLOYMENT_NAME=10"
    
  3. If you decide to switch the entire traffic to the new deployment, update all the traffic:

    This step is not required in the Azure CLI.

  4. Update the endpoint

    az ml online-endpoint update --name $ENDPOINT_NAME --traffic "default=0 $GREEN_DEPLOYMENT_NAME=100"
    
  5. Since the old deployment doesn't receive any traffic, you can safely delete it:

    az ml online-deployment delete --endpoint-name $ENDPOINT_NAME --name default
    

    Tip

    Notice that at this point, the former "blue deployment" has been deleted and the new "green deployment" has taken the place of the "blue deployment".
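The steps above move from a 90/10 split directly to 0/100. A rollout with more intermediate stages repeats the same update command with different percentages. The following sketch builds the --traffic value for a given share and prints the command for each stage. The helper name traffic_split, the deployment name xgboost-model-2, and the stage values 10, 25, 50, and 100 are illustrative assumptions, not part of the original sample.

```shell
# Hypothetical helper: build the --traffic value that sends the given
# percentage of requests to the green deployment and the rest to the blue one.
# Arguments: blue deployment name, green deployment name, green percentage.
traffic_split() {
    blue="$1"; green="$2"; green_percent="$3"
    case "$green_percent" in
        ''|*[!0-9]*)
            echo "percentage must be a non-negative integer" >&2
            return 1
            ;;
    esac
    if [ "$green_percent" -gt 100 ]; then
        echo "percentage must be between 0 and 100" >&2
        return 1
    fi
    printf '%s=%s %s=%s\n' "$blue" "$((100 - green_percent))" "$green" "$green_percent"
}

# Print (not run) the update command for each stage of an illustrative plan.
for percent in 10 25 50 100; do
    split=$(traffic_split default xgboost-model-2 "$percent")
    echo "az ml online-endpoint update --name \$ENDPOINT_NAME --traffic \"$split\""
done
```

Computing the blue share as the remainder keeps the two values summing to 100 at every stage. In practice, each stage would be followed by a period of monitoring before moving to the next one.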

Clean up resources

az ml online-endpoint delete --name $ENDPOINT_NAME --yes

Important

Notice that deleting an endpoint also deletes all the deployments under it.

Next steps