Interpret model results in Azure Machine Learning Studio (classic)

APPLIES TO: Machine Learning Studio (classic). Does not apply to: Azure Machine Learning.

This topic explains how to visualize and interpret prediction results in Azure Machine Learning Studio (classic). After you have trained a model and used it to make predictions ("scored the model"), you need to understand and interpret the prediction results.

There are four major kinds of machine learning models in Azure Machine Learning Studio (classic):

  • Classification
  • Clustering
  • Regression
  • Recommender systems

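Predictions on top of these models are produced by scoring modules. The examples in this topic use Score Model for classification and regression, and Score Matchbox Recommender for recommender systems.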

Learn how to choose parameters to optimize your algorithms in ML Studio (classic).

To learn how to evaluate your models, see How to evaluate model performance.

If you are new to ML Studio (classic), learn how to create a simple experiment.

Classification

There are two subcategories of classification problems:

  • Problems with only two classes (two-class or binary classification)
  • Problems with more than two classes (multi-class classification)

Azure Machine Learning Studio (classic) has different modules for each of these types of classification, but the methods for interpreting their prediction results are similar.

Two-class classification

Example experiment

An example of a two-class classification problem is the classification of iris flowers. The task is to classify iris flowers based on their features. The Iris data set provided in Azure Machine Learning Studio (classic) is a subset of the popular Iris data set and contains instances of only two flower species (classes 0 and 1). Each flower has four features: sepal length, sepal width, petal length, and petal width.

Figure 1. Iris two-class classification problem experiment

An experiment has been performed to solve this problem, as shown in Figure 1. A two-class boosted decision tree model has been trained and scored. You can now visualize the prediction results by clicking the output port of the Score Model module and then clicking Visualize.

This brings up the scoring results, as shown in Figure 2.

Figure 2. Visualize a Score Model result in two-class classification

Result interpretation

There are six columns in the results table. The left four columns are the four features. The right two columns, Scored Labels and Scored Probabilities, are the prediction results. The Scored Probabilities column shows the probability that a flower belongs to the positive class (Class 1). For example, the first number in the column (0.028571) means that the first flower has a probability of 0.028571 of belonging to Class 1. The Scored Labels column shows the predicted class for each flower and is derived from the Scored Probabilities column: if the scored probability of a flower is larger than 0.5, the flower is predicted as Class 1. Otherwise, it is predicted as Class 0.
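The thresholding rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not code taken from Studio (classic). The function name `scored_label` is hypothetical, the first probability is the value quoted above, and the other two are made-up inputs chosen to show both sides of the threshold.

```python
def scored_label(scored_probability, threshold=0.5):
    """Return 1 if the probability is larger than the threshold, else 0."""
    return 1 if scored_probability > threshold else 0


# 0.028571 is the value quoted in the text; 0.9 and 0.5 are illustrative.
probabilities = [0.028571, 0.9, 0.5]
labels = [scored_label(p) for p in probabilities]
print(labels)
```

Note that a probability of exactly 0.5 is not "larger than 0.5", so under this rule it falls into Class 0.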

Web service publication

After the prediction results have been understood and judged sound, the experiment can be published as a web service so that you can deploy it in various applications and call it to obtain class predictions for any new iris flower. To learn how to change a training experiment into a scoring experiment and publish it as a web service, see Tutorial 3: Deploy credit risk model. This procedure provides you with a scoring experiment, as shown in Figure 3.

Figure 3. Scoring the iris two-class classification problem experiment

Now you need to set the input and output for the web service. The input is the right input port of Score Model, which is the iris flower features input. The choice of output depends on whether you are interested in the predicted class (scored label), the scored probability, or both. This example assumes that you are interested in both. To select the desired output columns, use a Select Columns in Dataset module. Click Select Columns in Dataset, click Launch column selector, and select Scored Labels and Scored Probabilities. After you set the output port of Select Columns in Dataset as the web service output and run the experiment again, you are ready to publish the scoring experiment as a web service by clicking PUBLISH WEB SERVICE. The final experiment looks like Figure 4.
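The effect of the column selection step can be pictured as a projection of each scored row onto the two output columns. The sketch below is plain Python and does not use any Studio (classic) API. The helper `select_columns`, the feature column names, and the feature values are all hypothetical; only the two output column names and the probability 0.028571 come from the text.

```python
def select_columns(rows, wanted):
    """Keep only the wanted columns of each row, in the given order."""
    return [{name: row[name] for name in wanted} for row in rows]


# One scored row: four (hypothetical) feature columns plus the two result columns.
scored_rows = [
    {
        "sepal-length": 6.3,
        "sepal-width": 2.9,
        "petal-length": 5.6,
        "petal-width": 1.8,
        "Scored Labels": 0,
        "Scored Probabilities": 0.028571,
    },
]

output = select_columns(scored_rows, ["Scored Labels", "Scored Probabilities"])
print(output)
```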

Figure 4. Final scoring experiment of an iris two-class classification problem

After you run the web service and enter some feature values for a test instance, the result returns two numbers. The first number is the scored label, and the second is the scored probability. In Figure 5, the flower is predicted as Class 1 with a probability of 0.9655.

Figure 5. Web service result of iris two-class classification

Multi-class classification

Example experiment

In this experiment, you perform a letter-recognition task as an example of multi-class classification. The classifier attempts to predict a certain letter (class) based on attribute values extracted from images of handwritten letters.

In the training data, there are 16 features extracted from images of handwritten letters. The 26 letters form the 26 classes. Figure 6 shows an experiment that trains a multi-class classification model for letter recognition and predicts on the same feature set in a test data set.

Figure 6. Letter recognition multi-class classification problem experiment

To visualize the results from the Score Model module, click its output port and then click Visualize. You should see content like that shown in Figure 7.

Figure 7. Visualize Score Model results in a multi-class classification

Result interpretation

The left 16 columns represent the feature values of the test set. The columns with names like Scored Probabilities for Class "XX" are just like the Scored Probabilities column in the two-class case. They show the probability that the corresponding entry falls into a certain class. For example, the first entry has a probability of 0.003571 of being an "A," 0.000451 of being a "B," and so forth. The last column (Scored Labels) plays the same role as Scored Labels in the two-class case. It selects the class with the largest scored probability as the predicted class of the corresponding entry. For example, the scored label of the first entry is "F" because "F" has the largest probability (0.916995).

Web service publication

You can also get the scored label for each entry together with the probability of that label. The basic logic is to find the largest probability among all the scored probabilities. To do this, you need to use the Execute R Script module. The R code is shown in Figure 8, and the result of the experiment is shown in Figure 9.

Figure 8. R code for extracting Scored Labels and the associated probabilities of the labels
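The logic of finding the largest scored probability can be sketched as follows. This is a Python illustration of that logic only; it is not the R script from Figure 8, and it does not run inside the Execute R Script module. The function name `label_and_probability` is hypothetical. The column-name pattern and the three probabilities are the ones quoted in the result interpretation, and the remaining 23 class columns are omitted for brevity.

```python
PREFIX = 'Scored Probabilities for Class "'


def label_and_probability(row):
    """Return (label, probability) for the class with the largest scored probability."""
    class_probabilities = {
        name[len(PREFIX):-1]: value  # strip the prefix and the closing quote
        for name, value in row.items()
        if name.startswith(PREFIX) and name.endswith('"')
    }
    label = max(class_probabilities, key=class_probabilities.get)
    return label, class_probabilities[label]


# Three of the 26 class columns for the first entry, as quoted in the text.
first_entry = {
    'Scored Probabilities for Class "A"': 0.003571,
    'Scored Probabilities for Class "B"': 0.000451,
    'Scored Probabilities for Class "F"': 0.916995,
}

result = label_and_probability(first_entry)
print(result)
```

Returning the pair keeps the web service output down to two values per entry instead of one probability column per class.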

Figure 9. Final scoring experiment of the letter-recognition multi-class classification problem

After you publish and run the web service and enter some input feature values, the returned result looks like Figure 10. This handwritten letter, with its 16 extracted features, is predicted to be a "T" with a probability of 0.9715.

Figure 10. Web service result of multi-class classification

Regression

Regression problems are different from classification problems. In a classification problem, you're trying to predict discrete classes, such as which class an iris flower belongs to. In a regression problem, as the following example shows, you're trying to predict a continuous variable, such as the price of a car.

Example experiment

Automobile price prediction serves as the regression example. You are trying to predict the price of a car based on its features, including make, fuel type, body type, and drive wheel. The experiment is shown in Figure 11.

Figure 11. Automobile price regression problem experiment

When you visualize the output of the Score Model module, the result looks like Figure 12.

Figure 12. Scoring result for the automobile price prediction problem

Result interpretation

Scored Labels is the result column in this scoring result. The numbers are the predicted prices for each car.

Web service publication

You can publish the regression experiment as a web service and call it for automobile price prediction in the same way as in the two-class classification use case.

Figure 13. Scoring experiment of an automobile price regression problem

When you run the web service, the returned result looks like Figure 14. The predicted price for this car is $15,085.52.

Figure 14. Web service result of an automobile price regression problem

Clustering

Example experiment

Let's use the Iris data set again to build a clustering experiment. Here you filter out the class labels in the data set so that it contains only features and can be used for clustering. In this iris use case, specify the number of clusters to be two during the training process, which means the flowers are grouped into two clusters. The experiment is shown in Figure 15.

Figure 15. Iris clustering problem experiment

Clustering differs from classification in that the training data set doesn't have ground-truth labels of its own. Clustering groups the training data set instances into distinct clusters. During the training process, the model labels the entries by learning the differences between their features. After that, the trained model can be used to assign future entries to clusters. There are two parts of the result that are of interest in a clustering problem. The first part is labeling the training data set, and the second is assigning a new data set to clusters with the trained model.

The first part of the result can be visualized by clicking the left output port of Train Clustering Model and then clicking Visualize. The visualization is shown in Figure 16.

Figure 16. Visualize clustering result for the training data set

The result of the second part, clustering new entries with the trained clustering model, is shown in Figure 17.

Figure 17. Visualize clustering result on a new data set

Result interpretation

Although the results of the two parts stem from different experiment stages, they look the same and are interpreted in the same way. The first four columns are features. The last column, Assignments, is the prediction result. Entries assigned the same number are predicted to be in the same cluster; that is, they are similar according to the distance metric (this experiment uses the default Euclidean distance metric). Because you specified the number of clusters to be 2, the entries in Assignments are labeled either 0 or 1.
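The two parts of a clustering result (labels for the training data, then assignments for new entries) can be sketched with a tiny k-means-style procedure. This is a generic illustration under stated assumptions, not the algorithm used by Train Clustering Model: the text states only that the distance metric is Euclidean and that the number of clusters is two. The function names, the seeding rule, and all measurements are hypothetical.

```python
import math


def nearest(point, centroids):
    """Index of the centroid closest to the point under Euclidean distance."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


def train_two_clusters(points, iterations=10):
    """Tiny k-means with k=2; the first and last points seed the centroids."""
    centroids = [points[0], points[-1]]
    for _ in range(iterations):
        assignments = [nearest(p, centroids) for p in points]
        for k in range(2):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, [nearest(p, centroids) for p in points]


# Illustrative four-feature measurements (not taken from the Studio data set).
training = [
    (5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2), (5.0, 3.4, 1.5, 0.2),
    (6.4, 3.2, 4.5, 1.5), (6.9, 3.1, 4.9, 1.5), (5.5, 2.3, 4.0, 1.3),
]

# Part 1: label the training data set.
centroids, training_assignments = train_two_clusters(training)
# Part 2: assign new entries with the trained centroids.
new_assignments = [nearest(p, centroids) for p in [(5.0, 3.6, 1.4, 0.2), (6.0, 2.9, 4.5, 1.5)]]
print(training_assignments, new_assignments)
```

As in the Assignments column, the numbers 0 and 1 are arbitrary cluster identifiers: they say which entries belong together, not which species a flower is.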

Web service publication

You can publish the clustering experiment as a web service and call it for clustering predictions in the same way as in the two-class classification use case.

Figure 18. Scoring experiment of an iris clustering problem

After you run the web service, the returned result looks like Figure 19. This flower is predicted to be in cluster 0.

Figure 19. Web service result of iris clustering

Recommender system

Example experiment

For recommender systems, you can use the restaurant recommendation problem as an example: you recommend restaurants to customers based on their rating history. The input data consists of three parts:

  • Restaurant ratings from customers
  • Customer feature data
  • Restaurant feature data

There are several things you can do with the Train Matchbox Recommender module in Azure Machine Learning Studio (classic):

  • Predict ratings for a given user and item
  • Recommend items to a given user
  • Find users related to a given user
  • Find items related to a given item

You choose what you want to do by selecting one of the four options in the Recommender prediction kind menu. The sections below walk through all four scenarios.

A typical Azure Machine Learning Studio (classic) experiment for a recommender system looks like Figure 20. For information about how to use the recommender system modules, see Train Matchbox Recommender and Score Matchbox Recommender.

Figure 20. Recommender system experiment

Result interpretation

Predict ratings for a given user and item

By selecting Rating Prediction under Recommender prediction kind, you are asking the recommender system to predict the rating for a given user and item. The visualization of the Score Matchbox Recommender output looks like Figure 21.

Figure 21. Visualize the score result of the recommender system: rating prediction

The first two columns are the user-item pairs provided by the input data. The third column is the predicted rating of a user for a certain item. For example, in the first row, customer U1048 is predicted to rate restaurant 135026 as 2.

Recommend items to a given user

By selecting Item Recommendation under Recommender prediction kind, you're asking the recommender system to recommend items to a given user. The last parameter to choose in this scenario is Recommended item selection. The option From Rated Items (for model evaluation) is primarily for model evaluation during the training process. For this prediction stage, choose From All Items. The visualization of the Score Matchbox Recommender output looks like Figure 22.

Figure 22. Visualize the score result of the recommender system: item recommendation

The first of the six columns contains the user IDs to recommend items for, as provided by the input data. The other five columns contain the items recommended to that user, in descending order of relevance. For example, in the first row, the most recommended restaurant for customer U1048 is 134986, followed by 135018, 134975, 135021, and 132862.
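The row layout described above (one ID followed by five results in descending order of relevance) can be read with a small helper. This is a sketch for interpreting a result row, not part of Score Matchbox Recommender. The function name `ranked_recommendations` is hypothetical, and the row values are the ones quoted in the text.

```python
def ranked_recommendations(row):
    """Split a result row into the user ID and a list of (rank, item) pairs."""
    user_id, *items = row
    return user_id, list(enumerate(items, start=1))


# First row of the item recommendation result, as quoted in the text.
first_row = ["U1048", "134986", "135018", "134975", "135021", "132862"]

user, ranking = ranked_recommendations(first_row)
print(user, ranking[0])
```

The related-users and related-items results use the same layout of one ID followed by five ranked columns, so the same helper applies to them.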

Find users related to a given user

By selecting Related Users under Recommender prediction kind, you're asking the recommender system to find users related to a given user. Related users are users who have similar preferences. The last parameter to choose in this scenario is Related user selection. The option From Users That Rated Items (for model evaluation) is primarily for model evaluation during the training process. For this prediction stage, choose From All Users. The visualization of the Score Matchbox Recommender output looks like Figure 23.

Figure 23. Visualize score results of the recommender system: related users

The first of the six columns contains the user IDs for which to find related users, as provided by the input data. The other five columns contain the predicted related users, in descending order of relevance. For example, in the first row, the most relevant customer for customer U1048 is U1051, followed by U1066, U1044, U1017, and U1072.

Find items related to a given item

By selecting Related Items under Recommender prediction kind, you are asking the recommender system to find items related to a given item. Related items are the items most likely to be liked by the same user. The last parameter to choose in this scenario is Related item selection. The option From Rated Items (for model evaluation) is primarily for model evaluation during the training process. For this prediction stage, choose From All Items. The visualization of the Score Matchbox Recommender output looks like Figure 24.

Figure 24. Visualize score results of the recommender system: related items

The first of the six columns contains the item IDs for which to find related items, as provided by the input data. The other five columns contain the predicted related items, in descending order of relevance. For example, in the first row, the most relevant item for item 135026 is 135074, followed by 135035, 132875, 135055, and 134992.

Web service publication

The process of publishing these experiments as web services to get predictions is similar for each of the four scenarios. The second scenario (recommend items to a given user) serves as the example here. You can follow the same procedure for the other three.

After you save the trained recommender system as a trained model and filter the input data to a single user ID column as required, you can connect the experiment as shown in Figure 25 and publish it as a web service.

Figure 25. Scoring experiment of the restaurant recommendation problem

When you run the web service, the returned result looks like Figure 26. The five recommended restaurants for user U1048 are 134986, 135018, 134975, 135021, and 132862.

Figure 26. Web service result of the restaurant recommendation problem