Make predictions with an AutoML ONNX model in .NET

In this article, you learn how to use an Automated ML (AutoML) Open Neural Network Exchange (ONNX) model to make predictions in a C# .NET Core console application with ML.NET.

ML.NET is an open-source, cross-platform machine learning framework for the .NET ecosystem that allows you to train and consume custom machine learning models using a code-first approach in C# or F#, as well as through low-code tooling like Model Builder and the ML.NET CLI. The framework is also extensible and allows you to leverage other popular machine learning frameworks like TensorFlow and ONNX.

ONNX is an open-source format for AI models. ONNX supports interoperability between frameworks. This means you can train a model in one of the many popular machine learning frameworks like PyTorch, convert it into ONNX format, and consume the ONNX model in a different framework like ML.NET. To learn more, visit the ONNX website.

Prerequisites

Create a C# console application

In this sample, you use the .NET Core CLI to build your application, but you can do the same tasks using Visual Studio. Learn more about the .NET Core CLI.

  1. Open a terminal and create a new C# .NET Core console application. In this example, the name of the application is AutoMLONNXConsoleApp. A directory with the same name is created containing the contents of your application.

    dotnet new console -o AutoMLONNXConsoleApp
    
  2. In the terminal, navigate to the AutoMLONNXConsoleApp directory.

    cd AutoMLONNXConsoleApp
    

Add software packages

  1. Install the Microsoft.ML, Microsoft.ML.OnnxRuntime, and Microsoft.ML.OnnxTransformer NuGet packages using the .NET Core CLI.

    dotnet add package Microsoft.ML
    dotnet add package Microsoft.ML.OnnxRuntime
    dotnet add package Microsoft.ML.OnnxTransformer
    

    These packages contain the dependencies required to use an ONNX model in a .NET application. ML.NET provides an API that uses the ONNX runtime for predictions.

  2. Open the Program.cs file and add the following using statements at the top to reference the appropriate packages.

    using System.Linq;
    using Microsoft.ML;
    using Microsoft.ML.Data;
    using Microsoft.ML.Transforms.Onnx;
    

Add a reference to the ONNX model

A way for the console application to access the ONNX model is to add it to the build output directory. To learn more about MSBuild common items, see the MSBuild guide.

Add a reference to your ONNX model file in your application

  1. Copy your ONNX model to your application's AutoMLONNXConsoleApp root directory.

  2. Open the AutoMLONNXConsoleApp.csproj file and add the following content inside the Project node.

    <ItemGroup>
        <None Include="automl-model.onnx">
            <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
        </None>
    </ItemGroup>
    

    In this case, the name of the ONNX model file is automl-model.onnx.

  3. Open the Program.cs file and add the following line inside the Program class.

    static string ONNX_MODEL_PATH = "automl-model.onnx";
    

Initialize MLContext

Program 类的 Main 方法中,创建 MLContext 的新实例。Inside the Main method of your Program class, create a new instance of MLContext.

MLContext mlContext = new MLContext();

The MLContext class is a starting point for all ML.NET operations, and initializing mlContext creates a new ML.NET environment that can be shared across the model lifecycle. It's similar, conceptually, to DbContext in Entity Framework.

Define the model data schema

Your model expects your input and output data in a specific format. ML.NET allows you to define the format of your data via classes. Sometimes you may already know what that format looks like. In cases when you don't know the data format, you can use tools like Netron to inspect your ONNX model.

The model used in this sample uses data from the NYC TLC Taxi Trip dataset. A sample of the data can be seen below:

vendor_id  rate_code  passenger_count  trip_time_in_secs  trip_distance  payment_type  fare_amount
VTS        1          1                1140               3.75           CRD           15.5
VTS        1          1                480                2.72           CRD           10.0
VTS        1          1                1680               7.8            CSH           26.5

Inspect the ONNX model (optional)

Use a tool like Netron to inspect your model's inputs and outputs.

  1. Open Netron.

  2. In the top menu bar, select File > Open and use the file browser to select your model.

  3. Your model opens. For example, the structure of the automl-model.onnx model looks like the following:

    Netron AutoML ONNX model

  4. Select the last node at the bottom of the graph (variable_out1 in this case) to display the model's metadata. The inputs and outputs on the sidebar show you the model's expected inputs, outputs, and data types. Use this information to define the input and output schema of your model.

Define model input schema

Create a new class called OnnxInput with the following properties inside the Program.cs file.

public class OnnxInput
{
    [ColumnName("vendor_id")]
    public string VendorId { get; set; }

    [ColumnName("rate_code"), OnnxMapType(typeof(Int64), typeof(Single))]
    public Int64 RateCode { get; set; }

    [ColumnName("passenger_count"), OnnxMapType(typeof(Int64), typeof(Single))]
    public Int64 PassengerCount { get; set; }

    [ColumnName("trip_time_in_secs"), OnnxMapType(typeof(Int64), typeof(Single))]
    public Int64 TripTimeInSecs { get; set; }

    [ColumnName("trip_distance")]
    public float TripDistance { get; set; }

    [ColumnName("payment_type")]
    public string PaymentType { get; set; }
}

Each of the properties maps to a column in the dataset. The properties are further annotated with attributes.

The ColumnName attribute lets you specify how ML.NET should reference the column when operating on the data. For example, although the TripDistance property follows standard .NET naming conventions, the model only knows of a column or feature known as trip_distance. To address this naming discrepancy, the ColumnName attribute maps the TripDistance property to a column or feature by the name trip_distance.

For numerical values, ML.NET only operates on Single value types. However, the original data type of some of the columns is an integer. The OnnxMapType attribute maps types between ONNX and ML.NET.

To learn more about data attributes, see the ML.NET load data guide.

Define model output schema

Once the data is processed, it produces an output of a certain format. Define your data output schema. Create a new class called OnnxOutput with the following properties inside the Program.cs file.

public class OnnxOutput
{
    [ColumnName("variable_out1")]
    public float[] PredictedFare { get; set; }
}

OnnxInput 类似,使用 ColumnName 特性可将 variable_out1 输出映射到更具描述性的名称 PredictedFareSimilar to OnnxInput, use the ColumnName attribute to map the variable_out1 output to a more descriptive name PredictedFare.

Define a prediction pipeline

A pipeline in ML.NET is typically a series of chained transformations that operate on the input data to produce an output. To learn more about data transformations, see the ML.NET data transformation guide.

  1. Create a new method called GetPredictionPipeline inside the Program class.

    static ITransformer GetPredictionPipeline(MLContext mlContext)
    {
    
    }
    
  2. Define the names of the input and output columns. Add the following code inside the GetPredictionPipeline method.

    var inputColumns = new string []
    {
        "vendor_id", "rate_code", "passenger_count", "trip_time_in_secs", "trip_distance", "payment_type"
    };
    
    var outputColumns = new string [] { "variable_out1" };
    
  3. Define your pipeline. An IEstimator provides a blueprint of the operations, input, and output schemas of your pipeline.

    var onnxPredictionPipeline =
        mlContext
            .Transforms
            .ApplyOnnxModel(
                outputColumnNames: outputColumns,
                inputColumnNames: inputColumns,
                ONNX_MODEL_PATH);
    

    In this case, ApplyOnnxModel is the only transform in the pipeline, which takes in the names of the input and output columns as well as the path to the ONNX model file.

  4. An IEstimator only defines the set of operations to apply to your data. What operates on your data is known as an ITransformer. Use the Fit method to create one from your onnxPredictionPipeline.

    var emptyDv = mlContext.Data.LoadFromEnumerable(new OnnxInput[] {});
    
    return onnxPredictionPipeline.Fit(emptyDv);
    

    The Fit method expects an IDataView as input to perform the operations on. An IDataView is a way to represent data in ML.NET using a tabular format. Since in this case the pipeline is only used for predictions, you can provide an empty IDataView to give the ITransformer the necessary input and output schema information. The fitted ITransformer is then returned for further use in your application.

    Tip

    In this sample, the pipeline is defined and used within the same application. However, it is recommended that you use separate applications to define and use your pipeline to make predictions. In ML.NET, your pipelines can be serialized and saved for further use in other .NET end-user applications. ML.NET supports various deployment targets such as desktop applications, web services, WebAssembly applications*, and many more. To learn more about saving pipelines, see the ML.NET save and load trained models guide. A minimal sketch of saving and reloading the fitted pipeline follows these steps.

    *WebAssembly is only supported in .NET Core 5 or greater

  5. Inside the Main method, call the GetPredictionPipeline method with the required parameters.

    var onnxPredictionPipeline = GetPredictionPipeline(mlContext);
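
As mentioned in the tip above, pipelines can be serialized for use in other applications. The following is a minimal sketch of that workflow, not part of the steps above: the file name onnx-pipeline.zip is a hypothetical example, and the empty IDataView technique from GetPredictionPipeline is reused only to supply the input schema. The application that loads the pipeline also needs the Microsoft.ML.OnnxTransformer and Microsoft.ML.OnnxRuntime packages referenced.

// Save the fitted pipeline to a .zip file so another .NET application can load it later.
// "onnx-pipeline.zip" is an example file name, not one used elsewhere in this article.
DataViewSchema inputSchema = mlContext.Data.LoadFromEnumerable(new OnnxInput[] {}).Schema;
mlContext.Model.Save(onnxPredictionPipeline, inputSchema, "onnx-pipeline.zip");

// In the consuming application, reload the pipeline and create a prediction engine from it.
ITransformer loadedPipeline = mlContext.Model.Load("onnx-pipeline.zip", out DataViewSchema loadedSchema);
var loadedEngine = mlContext.Model.CreatePredictionEngine<OnnxInput, OnnxOutput>(loadedPipeline);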
    

Use the model to make predictions

Now that you have a pipeline, it's time to use it to make predictions. ML.NET provides a convenience API called PredictionEngine for making predictions on a single data instance. (A sketch for scoring a batch of inputs at once appears after these steps.)

  1. Inside the Main method, create a PredictionEngine by using the CreatePredictionEngine method.

    var onnxPredictionEngine = mlContext.Model.CreatePredictionEngine<OnnxInput, OnnxOutput>(onnxPredictionPipeline);
    
  2. Create a test data input.

    var testInput = new OnnxInput
    {
        VendorId = "CMT",
        RateCode = 1,
        PassengerCount = 1,
        TripTimeInSecs = 1271,
        TripDistance = 3.8f,
        PaymentType = "CRD"
    };
    
  3. Use the onnxPredictionEngine to make a prediction based on the new testInput data using the Predict method.

    var prediction = onnxPredictionEngine.Predict(testInput);
    
  4. Output the result of your prediction to the console.

    Console.WriteLine($"Predicted Fare: {prediction.PredictedFare.First()}");
    
  5. Use the .NET Core CLI to run your application.

    dotnet run
    

    The result should look similar to the following output:

    Predicted Fare: 15.621523
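
PredictionEngine scores one input at a time. As noted above, here is a minimal sketch for scoring a batch of inputs instead: it reuses the OnnxInput, OnnxOutput, and onnxPredictionPipeline names from this article (the second test row simply borrows values from the sample data table), loads the inputs into an IDataView, and calls Transform on the fitted pipeline.

// Load several inputs into an IDataView and score them in one pass with the fitted pipeline.
var batch = new[]
{
    new OnnxInput { VendorId = "CMT", RateCode = 1, PassengerCount = 1, TripTimeInSecs = 1271, TripDistance = 3.8f, PaymentType = "CRD" },
    new OnnxInput { VendorId = "VTS", RateCode = 1, PassengerCount = 1, TripTimeInSecs = 480, TripDistance = 2.72f, PaymentType = "CRD" }
};

IDataView batchData = mlContext.Data.LoadFromEnumerable(batch);
IDataView scored = onnxPredictionPipeline.Transform(batchData);

// Each OnnxOutput row carries the variable_out1 column as a single-element array.
foreach (var output in mlContext.Data.CreateEnumerable<OnnxOutput>(scored, reuseRowObject: false))
{
    Console.WriteLine($"Predicted Fare: {output.PredictedFare.First()}");
}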
    

To learn more about making predictions in ML.NET, see the use a model to make predictions guide.

Next steps