Tutorial: Power BI integration - Create the predictive model by using automated machine learning (part 1 of 2)

In part 1 of this tutorial, you train and deploy a predictive machine learning model. You use automated machine learning (ML) in Azure Machine Learning Studio. In part 2, you'll use the best-performing model to predict outcomes in Microsoft Power BI.

In this tutorial, you:

  • Create an Azure Machine Learning compute cluster.
  • Create a dataset.
  • Create an automated machine learning run.
  • Deploy the best model to a real-time scoring endpoint.

There are three ways to create and deploy the model you'll use in Power BI. This article covers "Option C: Train and deploy models by using automated machine learning in the studio." This option is a no-code authoring experience. It fully automates data preparation and model training.

Alternatively, you can use one of the other two options (Options A and B), which this article doesn't cover.

Prerequisites

Create a compute cluster

Automated machine learning trains many machine learning models to find the "best" algorithm and parameters. Azure Machine Learning runs the model training in parallel across a compute cluster.

To begin, in Azure Machine Learning Studio, in the menu on the left, select Compute. Open the Compute clusters tab. Then select New:

Screenshot showing how to create a compute cluster.

On the Create compute cluster page:

  1. Select a VM size. For this tutorial, a Standard_D11_v2 machine is fine.
  2. Select Next.
  3. Provide a valid compute name.
  4. Keep Minimum number of nodes at 0.
  5. Change Maximum number of nodes to 4.
  6. Select Create.

The status of your cluster changes to Creating.

Note

The new cluster has 0 nodes, so no compute costs are incurred. You incur costs only when the automated machine learning job runs. The cluster scales back to 0 automatically after 120 seconds of idle time.
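If you prefer scripting to the studio UI, the same cluster settings could be expressed roughly as follows. This is a minimal sketch, not part of the no-code flow this tutorial follows. It assumes the Azure Machine Learning Python SDK v1 (the azureml-core package) and an existing Workspace object. The helper names cluster_settings and create_cluster are hypothetical.

```python
def cluster_settings():
    """Return settings that mirror the studio form used in this tutorial."""
    return {
        "vm_size": "STANDARD_D11_V2",
        "min_nodes": 0,  # scale to zero so an idle cluster costs nothing
        "max_nodes": 4,
        "idle_seconds_before_scaledown": 120,
    }


def create_cluster(workspace, name):
    """Create the cluster. Assumes azureml-core (SDK v1) is installed."""
    # Imported lazily so cluster_settings() works without the SDK.
    from azureml.core.compute import AmlCompute, ComputeTarget

    config = AmlCompute.provisioning_configuration(**cluster_settings())
    target = ComputeTarget.create(workspace, name, config)
    target.wait_for_completion(show_output=True)
    return target
```

Keeping the settings in one plain dictionary makes it easy to review them before anything is provisioned.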

Create a dataset

In this tutorial, you use the Diabetes dataset. This dataset is available in Azure Open Datasets.

To create the dataset, in the menu on the left, select Datasets. Then select Create dataset. You see the following options:

Screenshot showing how to create a new dataset.

Select From Open Datasets. Then on the Create dataset from Open Datasets page:

  1. Use the search bar to find diabetes.
  2. Select Sample: Diabetes.
  3. Select Next.
  4. Name your dataset diabetes.
  5. Select Create.

To explore the data, select the dataset and then select Explore:

Screenshot showing how to explore a dataset.

The data has 10 baseline input variables, such as age, sex, body mass index, average blood pressure, and six blood serum measurements. It also has one target variable, named Y. This target variable is a quantitative measure of diabetes progression one year after the baseline.
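To make the split between inputs and target concrete, here's a small sketch in Python. The column names are an assumption based on the Diabetes sample in Azure Open Datasets, so check them against the Explore view. The registration helper assumes the azureml-opendatasets package (SDK v1). Both function names are hypothetical.

```python
# Assumed column names; verify them in the dataset's Explore view.
FEATURE_COLUMNS = ["AGE", "SEX", "BMI", "BP", "S1", "S2", "S3", "S4", "S5", "S6"]
TARGET_COLUMN = "Y"


def split_features_and_target(row):
    """Split one record (a dict keyed by column name) into inputs and target."""
    features = {name: row[name] for name in FEATURE_COLUMNS}
    return features, row[TARGET_COLUMN]


def register_diabetes_dataset(workspace, name="diabetes"):
    """Register the sample dataset. Assumes azureml-opendatasets is installed."""
    # Imported lazily so the helpers above work without the SDK.
    from azureml.opendatasets import Diabetes

    dataset = Diabetes.get_tabular_dataset()
    return dataset.register(workspace=workspace, name=name)
```

The 10 entries in FEATURE_COLUMNS correspond to the 10 baseline input variables, and Y is the value the model learns to predict.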

Create an automated machine learning run

In Azure Machine Learning Studio, in the menu on the left, select Automated ML. Then select New Automated ML run:

Screenshot showing how to create a new automated machine learning run.

Next, select the diabetes dataset you created earlier. Then select Next:

Screenshot showing how to select a dataset.

On the Configure run page:

  1. Under Experiment name, select Create new.
  2. Name the experiment.
  3. In the Target column field, select Y.
  4. In the Select compute cluster field, select the compute cluster you created earlier.

Your completed form should look like this:

Screenshot showing how to configure automated machine learning.

Finally, select a machine learning task. In this case, the task is Regression:

Screenshot showing how to configure a task.

Select Finish.

Important

In this run, automated machine learning trains 100 models, which takes around 30 minutes.
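The run configuration above could be scripted along these lines. Treat this as a sketch under stated assumptions: it uses the SDK v1 classes AutoMLConfig and Experiment, and the helper names are hypothetical. The training_rounds helper gives only a rough feel for the parallelism. It assumes each node trains one model at a time, which this tutorial doesn't state.

```python
import math


def automl_settings(label_column="Y"):
    """Settings that mirror the Configure run page and the task selection."""
    return {
        "task": "regression",
        "label_column_name": label_column,
    }


def training_rounds(total_models, max_concurrent):
    """Rounds needed if max_concurrent models train at once (an assumption)."""
    return math.ceil(total_models / max_concurrent)


def submit_automl_run(workspace, dataset, compute_target, experiment_name):
    """Submit the run. Assumes azureml-core and azureml-train-automl (SDK v1)."""
    # Imported lazily so the helpers above work without the SDK.
    from azureml.core import Experiment
    from azureml.train.automl import AutoMLConfig

    config = AutoMLConfig(
        training_data=dataset,
        compute_target=compute_target,
        **automl_settings(),
    )
    return Experiment(workspace, experiment_name).submit(config, show_output=True)
```

For example, training_rounds(100, 4) returns 25: with a maximum of 4 nodes, the 100 models would train in about 25 rounds under that assumption.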

Deploy the best model

When automated machine learning finishes, select the Models tab to see all the machine learning models that were tried. The models are ordered by performance, with the best-performing model shown first. After you select the best model, the Deploy button is enabled:

Screenshot showing the list of models.

Select Deploy to open the Deploy a model window:

  1. Name your model service diabetes-model.
  2. Select Azure Container Instance.
  3. Select Deploy.

You should see a message that states that the model was deployed successfully.
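A scripted version of this deployment might look like the following sketch. It assumes the SDK v1 and a completed automated ML run object. The service-name rule in the regular expression and the file paths under outputs/ are assumptions based on typical SDK behavior, not something this tutorial confirms. The function names and the CPU and memory values are hypothetical.

```python
import re

# Assumed naming rule: 3 to 32 characters, lowercase letters, digits, or
# hyphens, starting with a letter and ending with a letter or digit.
_SERVICE_NAME = re.compile(r"[a-z][a-z0-9-]{1,30}[a-z0-9]")


def is_valid_service_name(name):
    """Check a service name against the assumed naming rule."""
    return _SERVICE_NAME.fullmatch(name) is not None


def deploy_best_model(workspace, automl_run, service_name="diabetes-model"):
    """Deploy the best model to a container instance. Assumes SDK v1."""
    # Imported lazily so is_valid_service_name() works without the SDK.
    from azureml.core.model import InferenceConfig, Model
    from azureml.core.webservice import AciWebservice

    if not is_valid_service_name(service_name):
        raise ValueError(f"Invalid service name: {service_name!r}")

    best_run, _ = automl_run.get_output()
    # Assumed output paths; confirm them in the run's Outputs + logs tab.
    model = best_run.register_model(
        model_name=service_name, model_path="outputs/model.pkl"
    )
    best_run.download_file("outputs/scoring_file_v_1_0_0.py", "score.py")

    inference_config = InferenceConfig(
        entry_script="score.py", environment=best_run.get_environment()
    )
    # Hypothetical sizing; adjust to your workload.
    aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
    service = Model.deploy(
        workspace, service_name, [model], inference_config, aci_config
    )
    service.wait_for_deployment(show_output=True)
    return service
```

Checking the name before deployment surfaces naming problems early. The name diabetes-model used in this tutorial passes the check.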

Next steps

In this tutorial, you saw how to train and deploy a machine learning model by using automated machine learning. In the next tutorial, you'll learn how to consume (score) this model in Power BI.