Algorithm & module reference for Azure Machine Learning designer (preview)

This reference provides the technical background on each of the machine learning algorithms and modules available in Azure Machine Learning designer (preview).

Each module represents a set of code that can run independently and perform a machine learning task, given the required inputs. A module might contain a particular algorithm, or it might perform a task that is important in machine learning, such as missing value replacement or statistical analysis.
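To make the idea of a module concrete, here is a minimal Python sketch of a self-contained unit of code that performs one task (missing value replacement) given its required input. It is an illustration only, not the designer's implementation: the function name `replace_missing_with_mean` and the choice of mean imputation are assumptions made for this example.

```python
from statistics import mean
from typing import List, Optional, Sequence


def replace_missing_with_mean(values: Sequence[Optional[float]]) -> List[float]:
    """Return a copy of `values` with each None replaced by the mean of the observed values.

    Hypothetical helper for illustration; it is not part of Azure Machine Learning.
    """
    # Collect the values that are actually present.
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: every value is missing")
    fill = mean(observed)
    # Keep observed values as they are and substitute the mean for missing ones.
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    print(replace_missing_with_mean([1.0, None, 3.0]))
```

Like a module, the function depends only on its input and can run on its own, which is what allows such units to be chained together in a pipeline.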

For help with choosing algorithms, see

Tip

In any pipeline in the designer, you can get information about a specific module. Select the Learn more link in the module card, which appears when you hover over the module in the module list, or in the right pane of the module.

Data preparation modules

| Functionality | Description | Module |
| --- | --- | --- |
| Data Input and Output | Move data from cloud sources into your pipeline. Write your results or intermediate data to Azure Storage, a SQL database, or Hive while running a pipeline, or use cloud storage to exchange data between pipelines. | Enter Data Manually, Export Data, Import Data |
| Data Transformation | Operations on data that are unique to machine learning, such as normalizing or binning data, dimensionality reduction, and converting data among various file formats. | Add Columns, Add Rows, Apply Math Operation, Apply SQL Transformation, Clean Missing Data, Clip Values, Convert to CSV, Convert to Dataset, Convert to Indicator Values, Edit Metadata, Group Data into Bins, Join Data, Normalize Data, Partition and Sample, Remove Duplicate Rows, SMOTE, Select Columns Transform, Select Columns in Dataset, Split Data |
| Feature Selection | Select a subset of relevant, useful features to use in building an analytical model. | Filter Based Feature Selection, Permutation Feature Importance |
| Statistical Functions | Provide a wide variety of statistical methods related to data science. | Summarize Data |

Machine learning algorithms

| Functionality | Description | Module |
| --- | --- | --- |
| Regression | Predict a value. | Boosted Decision Tree Regression, Decision Forest Regression, Linear Regression, Neural Network Regression |
| Clustering | Group data together. | K-Means Clustering |
| Classification | Predict a class. Choose from binary (two-class) or multiclass algorithms. | Multiclass Boosted Decision Tree, Multiclass Decision Forest, Multiclass Logistic Regression, Multiclass Neural Network, One vs. All Multiclass, Two-Class Averaged Perceptron, Two-Class Boosted Decision Tree, Two-Class Decision Forest, Two-Class Logistic Regression, Two-Class Neural Network, Two-Class Support Vector Machine |

Modules for building and evaluating models

| Functionality | Description | Module |
| --- | --- | --- |
| Model Training | Run data through the algorithm. | Train Clustering Model, Train Model, Train PyTorch Model, Tune Model Hyperparameters |
| Model Scoring and Evaluation | Measure the accuracy of the trained model. | Apply Transformation, Assign Data to Clusters, Cross Validate Model, Evaluate Model, Score Image Model, Score Model |
| Python Language | Write code and embed it in a module to integrate Python with your pipeline. | Create Python Model, Execute Python Script |
| R Language | Write code and embed it in a module to integrate R with your pipeline. | Execute R Script |
| Text Analytics | Provide specialized computational tools for working with both structured and unstructured text. | Convert Word to Vector, Extract N-Gram Features from Text, Feature Hashing, Preprocess Text, Latent Dirichlet Allocation |
| Computer Vision | Modules for image data preprocessing and image recognition. | Apply Image Transformation, Convert to Image Directory, Init Image Transformation, Split to Image Directory, DenseNet, ResNet |
| Recommendation | Build recommendation models. | Evaluate Recommender, Score SVD Recommender, Score Wide and Deep Recommender, Train SVD Recommender, Train Wide and Deep Recommender |
| Anomaly Detection | Build anomaly detection models. | PCA-Based Anomaly Detection, Train Anomaly Detection Model |
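As an illustration of embedding Python in a pipeline with the Execute Python Script module listed above, the sketch below follows the entry-point shape that module is commonly documented to expect: a function named `azureml_main` that receives up to two pandas DataFrames and returns a tuple of DataFrames. Treat that signature as an assumption to verify against the module's own reference. The transformation itself (dropping rows with missing values) is a made-up example.

```python
import pandas as pd


def azureml_main(dataframe1=None, dataframe2=None):
    """Entry point sketch for an Execute Python Script module.

    Assumption: the module calls this function with the datasets connected
    to its input ports and expects a tuple of DataFrames in return.
    """
    # Hypothetical transformation: drop rows that contain missing values
    # and renumber the remaining rows from zero.
    cleaned = dataframe1.dropna().reset_index(drop=True)
    # The trailing comma makes the return value a one-element tuple.
    return cleaned,


if __name__ == "__main__":
    sample = pd.DataFrame({"x": [1.0, None, 3.0]})
    (result,) = azureml_main(sample)
    print(result)
```

The second input, `dataframe2`, is unused here; it is kept in the signature so the function matches the two-input shape assumed above.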

Web service

Learn about the web service modules that are necessary for real-time inference in Azure Machine Learning designer.

Error messages

Learn about the error messages and exception codes you might encounter when using modules in Azure Machine Learning designer.

Next steps