predict_onnx_fl()

The function predict_onnx_fl() predicts using an existing trained machine learning model. This model has been converted to ONNX format, serialized to a string, and saved in a standard Azure Data Explorer table.
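The string serialization can be illustrated with a short sketch. It assumes the model bytes are hex-encoded, which is inferred from the binascii.unhexlify call in the function code on this page. The helper names serialize_model and deserialize_model are hypothetical, and the byte string is a stand-in rather than a real ONNX model.

```python
import binascii


def serialize_model(model_bytes: bytes) -> str:
    """Hex-encode raw model bytes so they fit in a string column (assumed encoding)."""
    return binascii.hexlify(model_bytes).decode("ascii")


def deserialize_model(model_str: str) -> bytes:
    """Reverse of serialize_model; mirrors binascii.unhexlify(smodel) in the function."""
    return binascii.unhexlify(model_str)


# Stand-in bytes for illustration only; a real ONNX model would be the
# serialized protobuf produced by a converter.
placeholder_model = b"\x08\x07\x12\x0bplaceholder"

encoded = serialize_model(placeholder_model)
decoded = deserialize_model(encoded)
print(decoded == placeholder_model)
```

Hex encoding doubles the stored size compared to the raw bytes, but it keeps the value a plain ASCII string that can be stored in a string column.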

Syntax

T | invoke predict_onnx_fl(models_tbl, model_name, features_cols, pred_col)

Arguments

  • models_tbl: The name of the table containing all serialized models. This table must contain the following columns:
    • name: the model name
    • timestamp: the time of model training
    • model: string representation of the serialized model
  • model_name: The name of the specific model to use.
  • features_cols: Dynamic array containing the names of the feature columns that the model uses for prediction.
  • pred_col: The name of the column that stores the predictions.
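When models_tbl holds several rows with the same name, the function definition on this page selects the most recently trained one (top 1 by timestamp desc). The following is a minimal Python sketch of that selection rule, using an in-memory list of dictionaries as a stand-in for the table. The helper name select_model and the sample rows are hypothetical.

```python
from datetime import datetime


def select_model(models, model_name):
    """Return the serialized model with the given name and the latest timestamp.

    Returns None when no row matches, similar to toscalar() yielding null.
    """
    candidates = [row for row in models if row["name"] == model_name]
    if not candidates:
        return None
    latest = max(candidates, key=lambda row: row["timestamp"])
    return latest["model"]


# Illustrative rows only; the model strings are placeholders, not real models.
models_tbl = [
    {"name": "ONNX-Occupancy", "timestamp": datetime(2020, 1, 1), "model": "aa01"},
    {"name": "ONNX-Occupancy", "timestamp": datetime(2020, 6, 1), "model": "bb02"},
    {"name": "Other-Model", "timestamp": datetime(2021, 1, 1), "model": "cc03"},
]

print(select_model(models_tbl, "ONNX-Occupancy"))
```

Retraining a model and appending it under the same name therefore changes which model is used, without any change to the calling query.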

Usage

predict_onnx_fl() is a user-defined tabular function, applied using the invoke operator. You can either embed its code in your query or install it in your database. There are two usage options: ad hoc and persistent usage. See the example below.

For ad hoc usage, embed the code using the let statement. No permission is required.

let predict_onnx_fl=(samples:(*), models_tbl:(name:string, timestamp:datetime, model:string), model_name:string, features_cols:dynamic, pred_col:string)
{
    let model_str = toscalar(models_tbl | where name == model_name | top 1 by timestamp desc | project model);
    let kwargs = pack('smodel', model_str, 'features_cols', features_cols, 'pred_col', pred_col);
    let code =
    '\n'
    'import binascii\n'
    '\n'
    'smodel = kargs["smodel"]\n'
    'features_cols = kargs["features_cols"]\n'
    'pred_col = kargs["pred_col"]\n'
    'bmodel = binascii.unhexlify(smodel)\n'
    '\n'
    'import onnxruntime as rt\n'
    'sess = rt.InferenceSession(bmodel)\n'
    'input_name = sess.get_inputs()[0].name\n'
    'label_name = sess.get_outputs()[0].name\n'
    'df1 = df[features_cols]\n'
    'predictions = sess.run([label_name], {input_name: df1.values.astype(np.float32)})[0]\n'
    '\n'
    'result = df\n'
    'result[pred_col] = pd.DataFrame(predictions, columns=[pred_col])'
    '\n'
    ;
    samples | evaluate python(typeof(*), code, kwargs)
};
//
// Predicts room occupancy from sensor measurements, and calculates the confusion matrix
//
// Occupancy Detection is an open dataset from UCI Repository at https://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+
// It contains experimental data for binary classification of room occupancy from Temperature, Humidity, Light, and CO2.
// Ground-truth labels were obtained from time-stamped pictures that were taken every minute.
//
OccupancyDetection 
| where Test == 1
| extend pred_Occupancy=bool(0)
| invoke predict_onnx_fl(ML_Models, 'ONNX-Occupancy', pack_array('Temperature', 'Humidity', 'Light', 'CO2', 'HumidityRatio'), 'pred_Occupancy')
| summarize n=count() by Occupancy, pred_Occupancy

Confusion matrix:

Occupancy   pred_Occupancy  n
TRUE        TRUE            3006
FALSE       TRUE            112
TRUE        FALSE           15
FALSE       FALSE           9284
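
The counts in the confusion matrix can be summarized as standard classification metrics. The sketch below is an illustration, not part of the function. It treats Occupancy = TRUE as the positive class, and the helper name confusion_metrics is hypothetical.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all rows predicted correctly
        "precision": tp / (tp + fp),     # share of predicted TRUE that is truly TRUE
        "recall": tp / (tp + fn),        # share of truly TRUE rows that were found
    }


# Counts taken from the confusion matrix above, with Occupancy = TRUE as positive:
# (TRUE, TRUE) = 3006, (FALSE, TRUE) = 112, (TRUE, FALSE) = 15, (FALSE, FALSE) = 9284
metrics = confusion_metrics(tp=3006, fp=112, fn=15, tn=9284)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```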