Two-Class Averaged Perceptron module

This article describes a module in Azure Machine Learning designer.

Use this module to create a machine learning model based on the averaged perceptron algorithm.

This classification algorithm is a supervised learning method and requires a labeled dataset, that is, a dataset that includes a label column. To train the model, provide the untrained model and the labeled dataset as inputs to Train Model. The trained model can then be used to predict values for new input examples.

About averaged perceptron models

The averaged perceptron method is an early and simple version of a neural network. In this approach, each input is classified into one of the possible outputs by a linear function that combines the feature vector with a set of learned weights, hence the name "perceptron."

The simpler perceptron models are suited to learning linearly separable patterns, whereas neural networks (especially deep neural networks) can model more complex class boundaries. However, perceptrons are faster, and because they process cases serially, they can be used with continuous training.
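To make the description above concrete, here is a minimal sketch of an averaged perceptron in plain Python. It is an illustration under stated assumptions, not the implementation used by the designer module: the function names (`train_averaged_perceptron`, `predict`), the -1/+1 label encoding, the per-pass shuffling, and the toy data are all hypothetical choices made for this example.

```python
import random

def train_averaged_perceptron(X, y, learning_rate=1.0, max_iterations=10, seed=None):
    """Train on rows X with labels y in {-1, +1}; return averaged (weights, bias).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)          # seeded generator for reproducible shuffles
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0     # current weights and bias
    w_sum, b_sum = [0.0] * n_features, 0.0
    steps = 0
    order = list(range(len(X)))
    for _ in range(max_iterations):    # one pass over the training data per iteration
        rng.shuffle(order)
        for i in order:                # cases are processed one at a time
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            if y[i] * score <= 0:      # mistake: nudge the weights toward the label
                w = [wj + learning_rate * y[i] * xj for wj, xj in zip(w, X[i])]
                b += learning_rate * y[i]
            # Accumulate the current weights after every case so they can be averaged.
            w_sum = [s + wj for s, wj in zip(w_sum, w)]
            b_sum += b
            steps += 1
    return [s / steps for s in w_sum], b_sum / steps

def predict(weights, bias, x):
    """Classify x by the sign of the linear function w . x + b."""
    score = sum(wj * xj for wj, xj in zip(weights, x)) + bias
    return 1 if score > 0 else -1

# Toy, linearly separable data (made up for this example).
X = [[2, 1], [1, 3], [3, 3], [-1, -2], [-2, -1], [-3, -3]]
y = [1, 1, 1, -1, -1, -1]
weights, bias = train_averaged_perceptron(X, y, seed=42)
predictions = [predict(weights, bias, x) for x in X]
print(predictions)
```

Returning the average of the weights seen during training, rather than the final weights, is what distinguishes the averaged variant from the basic perceptron; it tends to make the result less sensitive to the last few cases processed.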

How to configure Two-Class Averaged Perceptron

  1. Add the Two-Class Averaged Perceptron module to your pipeline.

  2. Specify how you want the model to be trained by setting the Create trainer mode option.

    • Single Parameter: If you know how you want to configure the model, provide a specific set of values as arguments.

    • Parameter Range: Select this option if you are not sure of the best parameters and want to run a parameter sweep. Select a range of values to iterate over; the Tune Model Hyperparameters module then iterates over all possible combinations of the settings you provided to determine the hyperparameters that produce the optimal results.

  3. For Learning rate, specify a value for the learning rate. The learning rate controls the size of the step that is used in stochastic gradient descent each time the model is tested and corrected.

    By making the rate smaller, you test the model more often, with the risk that you might get stuck in a local plateau. By making the rate larger, you can converge faster, at the risk of overshooting the true minimum.

  4. For Maximum number of iterations, type the number of times you want the algorithm to examine the training data.

    Stopping early often provides better generalization. Increasing the number of iterations improves fitting, at the risk of overfitting.

  5. For Random number seed, optionally type an integer value to use as the seed. Using a seed is recommended if you want to ensure reproducibility of the pipeline across runs.

  6. Connect a training dataset, and train the model:

    • If you set Create trainer mode to Single Parameter, connect a labeled dataset and the Train Model module.

    • If you set Create trainer mode to Parameter Range, connect a labeled dataset and train the model by using Tune Model Hyperparameters.

    Note

    If you pass a parameter range to Train Model, it uses only the default value in the single parameter list.

    If you pass a single set of parameter values to the Tune Model Hyperparameters module when it expects a range of settings for each parameter, it ignores the values and uses the default values for the learner.

    If you select the Parameter Range option and enter a single value for any parameter, that single value is used throughout the sweep, even if other parameters change across a range of values.
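The idea behind a parameter sweep, as described in the steps above, can be sketched as a loop over every combination of the candidate settings. The following is a rough illustration only and does not reproduce how Tune Model Hyperparameters works internally: `train`, `accuracy`, `sweep`, and the grid keys are hypothetical names, the learner is a compact averaged perceptron with -1/+1 labels, and, to keep the example short, each combination is scored on the training data itself rather than on a held-out set.

```python
import itertools
import random

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train(X, y, learning_rate, max_iterations, seed):
    """Compact averaged perceptron (labels -1/+1); returns averaged (weights, bias)."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    w_sum, b_sum, steps = [0.0] * len(X[0]), 0.0, 0
    order = list(range(len(X)))
    for _ in range(max_iterations):
        rng.shuffle(order)
        for i in order:
            if y[i] * (dot(w, X[i]) + b) <= 0:  # mistake: correct the model
                w = [wj + learning_rate * y[i] * xj for wj, xj in zip(w, X[i])]
                b += learning_rate * y[i]
            w_sum = [s + wj for s, wj in zip(w_sum, w)]
            b_sum += b
            steps += 1
    return [s / steps for s in w_sum], b_sum / steps

def accuracy(model, X, y):
    """Fraction of rows whose predicted sign matches the label."""
    w, b = model
    hits = sum((1 if dot(w, xi) + b > 0 else -1) == yi for xi, yi in zip(X, y))
    return hits / len(y)

def sweep(X, y, grid, seed=0):
    """Try every combination in grid; return (best, all_results)."""
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        model = train(X, y, seed=seed, **params)
        results.append((accuracy(model, X, y), params))
    best = max(results, key=lambda r: r[0])  # ties resolve to the first combination
    return best, results

# Toy, linearly separable data (made up for this example).
X = [[2, 1], [1, 3], [3, 3], [-1, -2], [-2, -1], [-3, -3]]
y = [1, 1, 1, -1, -1, -1]

# 3 learning rates x 2 iteration limits = 6 combinations.
grid = {"learning_rate": [0.1, 0.5, 1.0], "max_iterations": [1, 10]}
best, results = sweep(X, y, grid)
print(best)

# A one-element list holds that parameter fixed while the others vary.
fixed_best, fixed_results = sweep(X, y, {"learning_rate": [0.1, 0.5, 1.0], "max_iterations": [10]})
```

Passing the same seed to every combination keeps the comparison between settings reproducible, in the same spirit as the Random number seed option.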

Next steps

See the set of modules available to Azure Machine Learning.