Use GitHub Actions with Azure Machine Learning
APPLIES TO: Azure CLI ml extension v2 (current) Python SDK azure-ai-ml v2 (current)
Get started with GitHub Actions to train a model on Azure Machine Learning.
This article will teach you how to create a GitHub Actions workflow that builds and deploys a machine learning model to Azure Machine Learning. You'll train a scikit-learn linear regression model on the NYC Taxi dataset.
GitHub Actions uses a workflow YAML (.yml) file in the /.github/workflows/ path in your repository. This definition contains the various steps and parameters that make up the workflow.
Prerequisites
Before following the steps in this article, make sure you have the following prerequisites:
- An Azure Machine Learning workspace. If you don't have one, use the steps in the Quickstart: Create workspace resources article to create one.
- The Azure Machine Learning Python SDK v2. To install the SDK, use the following command:
pip install azure-ai-ml azure-identity
To update an existing installation of the SDK to the latest version, use the following command:
pip install --upgrade azure-ai-ml azure-identity
For more information, see Install the Python SDK v2 for Azure Machine Learning.
- A GitHub account. If you don't have one, sign up for free.
Step 1: Get the code
Fork the following repo at GitHub:
https://github.com/azure/azureml-examples
Clone your forked repo locally.
git clone https://github.com/YOUR-USERNAME/azureml-examples
Step 2: Authenticate with Azure
You'll need to first define how to authenticate with Azure. You can use a service principal or OpenID Connect.
Generate deployment credentials
Create a service principal with the az ad sp create-for-rbac command in the Azure CLI.
az ad sp create-for-rbac --name "myML" --role contributor \
--scopes /subscriptions/<subscription-id>/resourceGroups/<group-name> \
--sdk-auth
In the example above, replace the placeholders with your subscription ID and resource group name, and replace myML with a name for your service principal. The output is a JSON object with the role assignment credentials that provide access to the resources in that resource group, similar to the following. Copy this JSON object for later.
{
  "clientId": "<GUID>",
  "clientSecret": "<GUID>",
  "subscriptionId": "<GUID>",
  "tenantId": "<GUID>",
  (...)
}
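Before you store the output as a secret, it can help to confirm that the saved JSON contains the fields shown above. The following is a minimal sketch, assuming you saved the command output to a file. The function name check_creds_json and the file name creds.json are hypothetical, and the check only looks for key names with grep rather than parsing the JSON.

```shell
#!/usr/bin/env bash
# Hypothetical helper: confirm that a saved credentials file mentions each expected key.
# This is a plain-text check with grep, not a full JSON parse.
check_creds_json() {
  local file="$1"
  local key
  local missing=0
  if [ ! -f "$file" ]; then
    echo "file not found: $file" >&2
    return 2
  fi
  for key in clientId clientSecret subscriptionId tenantId; do
    # Look for "key": anywhere in the file.
    if ! grep -q "\"$key\"[[:space:]]*:" "$file"; then
      echo "missing key: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (creds.json is an assumed file name):
# check_creds_json creds.json && echo "credentials file looks complete"
```

The function returns 0 when all four keys are present, 1 when any key is missing, and 2 when the file doesn't exist.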
Create secrets
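In GitHub, browse to your repository and select Settings > Secrets > Actions. Select New repository secret.
Paste the entire JSON output from the Azure CLI command into the secret's value field. Give the secret the name AZ_CREDS.
The secret name must match the name that the workflow's azure login step reads. The workflow shown in Step 5 references secrets.AZURE_CREDENTIALS, so either use that name for the secret or change the workflow to reference AZ_CREDS.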
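If you prefer the command line to the GitHub web UI, the GitHub CLI can create the same secret. This is a sketch, assuming the gh CLI is installed and authenticated and that you saved the JSON output to a file. The helper name set_creds_secret and the file name creds.json are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: store a saved credentials file as a repository secret.
# Usage: set_creds_secret FILE [SECRET_NAME]
set_creds_secret() {
  local file="$1"
  local name="${2:-AZ_CREDS}"   # secret name used in this article
  if [ ! -s "$file" ]; then
    echo "credentials file is missing or empty: $file" >&2
    return 2
  fi
  if ! command -v gh >/dev/null 2>&1; then
    echo "GitHub CLI (gh) not found" >&2
    return 3
  fi
  # Run this from inside your cloned fork so that gh targets that repository,
  # or add --repo OWNER/REPO to be explicit.
  gh secret set "$name" < "$file"
}

# Example (creds.json is an assumed file name):
# set_creds_secret creds.json AZ_CREDS
```

Delete the local credentials file after the secret is created so that the client secret isn't left on disk.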
Step 3: Update setup.sh to connect to your Azure Machine Learning workspace
You'll need to update the CLI setup file variables to match your workspace.
In your forked repository, go to azureml-examples/cli/.
Edit setup.sh and update these variables in the file.

| Variable | Description |
| --- | --- |
| GROUP | Name of resource group |
| LOCATION | Location of your workspace (example: chinanorth2) |
| WORKSPACE | Name of Azure Machine Learning workspace |
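You can make these edits by hand, or script them. The following is a minimal sketch, assuming each variable is assigned at the start of a line in the form NAME="value"; check your copy of setup.sh before relying on that. The helper name set_script_var and the example values are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite a NAME="value" assignment in a shell script.
# Assumes the assignment starts at the beginning of a line and that the new
# value contains no backslashes or double quotes.
set_script_var() {
  local file="$1"
  local name="$2"
  local value="$3"
  local tmp
  local rc
  tmp="$(mktemp)" || return 1
  if awk -v n="$name" -v v="$value" '
       index($0, n "=") == 1 { print n "=\"" v "\""; next }  # replace the assignment
       { print }                                             # copy other lines unchanged
     ' "$file" > "$tmp"; then
    cat "$tmp" > "$file"   # overwrite in place, keeping the file permissions
    rc=$?
  else
    rc=1
  fi
  rm -f "$tmp"
  return "$rc"
}

# Example (values are placeholders):
# set_script_var cli/setup.sh GROUP "my-resource-group"
# set_script_var cli/setup.sh LOCATION "chinanorth2"
# set_script_var cli/setup.sh WORKSPACE "my-workspace"
```

Commit and push the change to your fork so that the workflow picks up the new values.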
Step 4: Update pipeline.yml with your compute cluster name
You'll use a pipeline.yml file to deploy your Azure Machine Learning pipeline. This is a machine learning pipeline and not a DevOps pipeline. You only need to make this update if you're using a name other than cpu-cluster for your compute cluster.
- In your forked repository, go to azureml-examples/cli/jobs/pipelines/nyc-taxi/pipeline.yml.
- Each time you see compute: azureml:cpu-cluster, update the value of cpu-cluster with your compute cluster name. For example, if your cluster is named my-cluster, your new value would be azureml:my-cluster. There are five updates.
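The replacement can also be scripted. The following is a minimal sketch that counts the lines referencing the default cluster and then rewrites them. The helper name rename_compute is hypothetical, and the sketch assumes the cluster name contains only letters, digits, and hyphens. It matches the text azureml:cpu-cluster wherever it appears, so review the result if your file uses a longer name that begins the same way.

```shell
#!/usr/bin/env bash
# Hypothetical helper: point every "azureml:cpu-cluster" reference at another cluster.
# Usage: rename_compute FILE CLUSTER_NAME
rename_compute() {
  local file="$1"
  local cluster="$2"
  local tmp
  local rc
  tmp="$(mktemp)" || return 1
  # Report how many lines reference the default cluster before editing.
  echo "references found: $(grep -c 'azureml:cpu-cluster' "$file")"
  if sed "s|azureml:cpu-cluster|azureml:${cluster}|g" "$file" > "$tmp"; then
    cat "$tmp" > "$file"   # overwrite in place, keeping the file permissions
    rc=$?
  else
    rc=1
  fi
  rm -f "$tmp"
  return "$rc"
}

# Example (my-cluster is a placeholder name):
# rename_compute cli/jobs/pipelines/nyc-taxi/pipeline.yml my-cluster
```

For the file described in this step, the reported count should match the five updates mentioned above.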
Step 5: Run your GitHub Actions workflow
Your workflow authenticates with Azure, sets up the Azure Machine Learning CLI, and uses the CLI to train a model in Azure Machine Learning.
Your workflow file is made up of a trigger section and jobs:
- A trigger starts the workflow in the on section. The workflow runs by default on a cron schedule and when a pull request is made from matching branches and paths. Learn more about events that trigger workflows.
- In the jobs section of the workflow, you check out code and log in to Azure with your service principal secret.
- The jobs section also includes a setup action that installs and sets up the Machine Learning CLI (v2). Once the CLI is installed, the run job action runs your Azure Machine Learning pipeline.yml file to train a model with NYC taxi data.
Enable your workflow
In your forked repository, open .github/workflows/cli-jobs-pipelines-nyc-taxi-pipeline.yml and verify that your workflow looks like this.

name: cli-jobs-pipelines-nyc-taxi-pipeline
on:
  workflow_dispatch:
  schedule:
    - cron: "0 0/4 * * *"
  pull_request:
    branches:
      - main
      - sdk-preview
    paths:
      - cli/jobs/pipelines/nyc-taxi/**
      - .github/workflows/cli-jobs-pipelines-nyc-taxi-pipeline.yml
      - cli/run-pipeline-jobs.sh
      - cli/setup.sh
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - name: check out repo
      uses: actions/checkout@v2
    - name: azure login
      uses: azure/login@v1
      with:
        creds: ${{secrets.AZURE_CREDENTIALS}}
    - name: setup
      run: bash setup.sh
      working-directory: cli
      continue-on-error: true
    - name: run job
      run: bash -x ../../../run-job.sh pipeline.yml
      working-directory: cli/jobs/pipelines/nyc-taxi

Select View runs.
Enable workflows by selecting I understand my workflows, go ahead and enable them.
Select the cli-jobs-pipelines-nyc-taxi-pipeline workflow and choose to Enable workflow.
Select Run workflow and choose the option to Run workflow now.
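The last two steps can also be done from the command line. This is a sketch, assuming the gh CLI is installed and authenticated; the helper name run_workflow is hypothetical. The first-time prompt that enables workflows for a fork may still need to be accepted in the GitHub web UI.

```shell
#!/usr/bin/env bash
# Workflow file name taken from this article.
WORKFLOW_FILE="cli-jobs-pipelines-nyc-taxi-pipeline.yml"

# Hypothetical helper: enable the workflow in your fork, then start a manual run.
# Usage: run_workflow OWNER/REPO
run_workflow() {
  local repo="$1"
  if [ -z "$repo" ]; then
    echo "usage: run_workflow OWNER/REPO" >&2
    return 2
  fi
  if ! command -v gh >/dev/null 2>&1; then
    echo "GitHub CLI (gh) not found" >&2
    return 3
  fi
  # The manual run relies on the workflow_dispatch trigger in the workflow file.
  gh workflow enable "$WORKFLOW_FILE" --repo "$repo" &&
    gh workflow run "$WORKFLOW_FILE" --repo "$repo"
}

# Example (replace YOUR-USERNAME with your GitHub account):
# run_workflow YOUR-USERNAME/azureml-examples
```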
Step 6: Verify your workflow run
Open your completed workflow run and verify that the build job ran successfully. You'll see a green checkmark next to the job.
Open Azure Machine Learning studio and navigate to the nyc-taxi-pipeline-example. Verify that each part of your job (prep, transform, train, predict, score) completed and that you see a green checkmark.
Clean up resources
When your resource group and repository are no longer needed, clean up the resources you deployed by deleting the resource group and your GitHub repository.
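Deleting the resource group can be done with the Azure CLI. The following is a cautious sketch, assuming the az CLI is installed and signed in. The helper name delete_resource_group and the CONFIRM variable are hypothetical; by default the function only prints the command it would run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete a resource group, with a dry run as the default.
# Usage: delete_resource_group GROUP_NAME
# Set CONFIRM=yes to run the deletion instead of printing it.
delete_resource_group() {
  local group="$1"
  if [ -z "$group" ]; then
    echo "usage: delete_resource_group GROUP_NAME" >&2
    return 2
  fi
  if [ "${CONFIRM:-no}" != "yes" ]; then
    echo "dry run: az group delete --name $group --yes --no-wait"
    return 0
  fi
  # --yes skips the interactive prompt; --no-wait returns before deletion finishes.
  az group delete --name "$group" --yes --no-wait
}

# Example (my-resource-group is a placeholder name):
# delete_resource_group my-resource-group               # prints the command only
# CONFIRM=yes delete_resource_group my-resource-group   # deletes the group
```

Deleting a resource group removes every resource in it, so check the group name in the dry-run output before you confirm.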