Advanced entry script authoring

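Important

This article provides information on using the Azure Machine Learning SDK v1. SDK v1 is deprecated as of March 31, 2025. Support for it ends on June 30, 2026. You can install and use SDK v1 until that date. Existing workflows that use SDK v1 continue to operate after the end-of-support date. However, they could be exposed to security risks or breaking changes if the architecture of the product changes.

We recommend that you transition to SDK v2 before June 30, 2026. For more information on SDK v2, see What is Azure Machine Learning CLI and Python SDK v2? and the SDK v2 reference.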

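This article explains how to write entry scripts for specialized use cases in Azure Machine Learning. An entry script, also called a scoring script, accepts requests, uses a model to score data, and returns a response.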

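Prerequisites

A trained machine learning model that you intend to deploy with Azure Machine Learning. For more information about model deployment, see Deploy machine learning models to Azure.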

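Automatically generate a Swagger schema

To automatically generate a schema for your web service, provide a sample of the input or output in the constructor for one of the defined type objects. The type and sample are used to automatically create the schema. Azure Machine Learning then creates an OpenAPI specification (formerly, a Swagger specification) for the web service during deployment.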

Warning

Don't use sensitive or private data for the sample input or output. In Azure Machine Learning, the Swagger page for inferencing exposes the sample data.

The following types are currently supported:

pandas
numpy
pyspark
Standard Python object

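To use schema generation, include the open-source inference-schema package version 1.1.0 or later in your dependencies file. For more information about this package, see InferenceSchema on GitHub. To generate conforming Swagger for automated web service consumption, the run function in your scoring script must meet the following conditions:

The first parameter must have the type StandardPythonParameterType, be named Inputs, and be nested.
There must be an optional second parameter of type StandardPythonParameterType that's named GlobalParameters.
The function must return a dictionary of type StandardPythonParameterType that's named Results and is nested.

Define the input and output sample formats in the sample_input and sample_output variables, which represent the request and response formats for the web service. Use these samples in the input and output function decorators on the run function. The scikit-learn example in the following section uses schema generation.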

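Power BI compatible endpoint

The following example demonstrates how to define the run function according to the instructions in the preceding section. You can use this script when you consume your deployed web service from Power BI.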

import os
import json
import pickle
import numpy as np
import pandas as pd
import azureml.train.automl
import joblib
from sklearn.linear_model import Ridge

from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType


def init():
    global model
    # Replace the file name if needed.
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'sklearn_regression_model.pkl')
    # Deserialize the model file back into a sklearn model.
    model = joblib.load(model_path)


# Provide three sample inputs for schema generation.
numpy_sample_input = NumpyParameterType(np.array([[1,2,3,4,5,6,7,8,9,10],[10,9,8,7,6,5,4,3,2,1]],dtype='float64'))
pandas_sample_input = PandasParameterType(pd.DataFrame({'name': ['Sarah', 'John'], 'age': [25, 26]}))
standard_sample_input = StandardPythonParameterType(0.0)

# The following sample is a nested input sample. Any item wrapped by `ParameterType` is described by the schema.
sample_input = StandardPythonParameterType({'input1': numpy_sample_input, 
                                        'input2': pandas_sample_input, 
                                        'input3': standard_sample_input})

sample_global_parameters = StandardPythonParameterType(1.0) # This line is optional.
sample_output = StandardPythonParameterType([1.0, 1.0])
outputs = StandardPythonParameterType({'Results':sample_output}) # "Results" is case sensitive.

# "Inputs" and "GlobalParameters" are case sensitive. The "GlobalParameters" decorator is optional.
@input_schema('Inputs', sample_input)
@input_schema('GlobalParameters', sample_global_parameters)
@output_schema(outputs)
def run(Inputs, GlobalParameters):
    # The parameters in the preceding line have to match those in the decorator. "Inputs" and 
    # "GlobalParameters" are case sensitive.
    try:
        data = Inputs['input1']
        # The data gets converted to the target format.
        assert isinstance(data, np.ndarray)
        result = model.predict(data)
        return result.tolist()
    except Exception as e:
        error = str(e)
        return error

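Tip

The return value from the script can be any Python object that's serializable to JSON. For example, if your model returns a Pandas dataframe that contains multiple columns, you can use an output decorator that's similar to the following code: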

output_sample = pd.DataFrame(data=[{"a1": 5, "a2": 6}])
@output_schema(PandasParameterType(output_sample))
...
result = model.predict(data)
return result
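
To make the request format concrete, the following sketch builds a JSON body that matches the nested sample_input from the Power BI compatible script. It assumes that the pandas input is serialized as a list of records, which is believed to be the default orientation in inference-schema. The variable names request_body and payload are illustrative and aren't part of the original script.

```python
import json

# Hypothetical request body for the run(Inputs, GlobalParameters) function.
# The top-level keys are case sensitive and must match the decorator names.
request_body = {
    "Inputs": {
        # input1: a 2 x 10 array, sent as nested lists.
        "input1": [
            [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
            [10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
        ],
        # input2: assumed records orientation, one dictionary per row.
        "input2": [
            {"name": "Sarah", "age": 25},
            {"name": "John", "age": 26},
        ],
        # input3: a plain Python float.
        "input3": 0.0,
    },
    "GlobalParameters": 1.0,
}

# Serialize the body. A client would post this string to the scoring URI.
payload = json.dumps(request_body)
print(payload)
```

A client would typically send payload with a Content-Type header of application/json.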

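Binary (image) data

If your model accepts binary data, such as an image, you must modify the score.py file that your deployment uses so that it accepts raw HTTP requests. To accept raw data, use the AMLRequest class in your entry script and add the @rawhttp decorator to the run function.

The following score.py script accepts binary data: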

from azureml.contrib.services.aml_request import AMLRequest, rawhttp
from azureml.contrib.services.aml_response import AMLResponse
from PIL import Image
import json

def init():
    print("This is init()")

@rawhttp
def run(request):
    print("This is run()")
    
    if request.method == 'GET':
        # For this example, return the URL for GET requests.
        respBody = str.encode(request.full_path)
        return AMLResponse(respBody, 200)
    elif request.method == 'POST':
        file_bytes = request.files["image"]
        image = Image.open(file_bytes).convert('RGB')
        # For a real-world solution, load the data from the request body
        # and send it to the model. Then return the response.

        # For demonstration purposes, this example returns the size of the image as the response.
        return AMLResponse(json.dumps(image.size), 200)
    else:
        return AMLResponse("bad request", 500)

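Important

The AMLRequest class is in the azureml.contrib namespace. Entities in this namespace are in preview. They change frequently while the service undergoes improvements. Microsoft doesn't offer full support for these entities.

If you need to test code that uses this class in your local development environment, you can install the components by using the following command: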

pip install azureml-contrib-services

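Note

We don't recommend using 500 as a custom status code. On the Azure Machine Learning inference router (azureml-fe) side, the status code is rewritten to 502.

The status code is passed through azureml-fe and then sent to the client.
The azureml-fe code rewrites a 500 that's returned from the model side as 502. The client receives a code of 502.
If the azureml-fe code itself returns 500, the client still receives a code of 500.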
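
The rewriting rule in the preceding note can be summarized as a small function. This is only a model of the described behavior, not code from azureml-fe, and the function name status_seen_by_client is hypothetical.

```python
def status_seen_by_client(model_status: int) -> int:
    """Return the status code a client receives for a code returned by the scoring script.

    Models only the behavior described in the note: azureml-fe rewrites a 500
    from the model side to 502 and passes other codes through unchanged.
    """
    if model_status == 500:
        return 502
    return model_status


print(status_seen_by_client(500))  # 502
print(status_seen_by_client(400))  # 400
```

Under this rule, a script that returns 400 for a malformed request lets the client see the code that the script chose, whereas a 500 from the script can't be told apart from other 502 responses.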

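When you use the AMLRequest class, you can access only the raw posted data in the score.py file. There's no client-side component. From a client, you can post data as usual. For example, the following Python code reads an image file and posts the data: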

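import requests

# `service` is the deployed web service object. Its scoring_uri is the endpoint URL.
uri = service.scoring_uri
image_path = 'test.jpg'
# Read the image inside a context manager so that the file handle is closed.
with open(image_path, 'rb') as image_file:
    files = {'image': image_file.read()}
response = requests.post(uri, files=files)

# json is a method. Call it to parse the response body.
print(response.json())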

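Cross-origin resource sharing

Cross-origin resource sharing (CORS) provides a way for resources on a webpage to be requested from another domain. CORS works via HTTP headers that are sent with the client request and returned with the service response. For more information about CORS and valid headers, see Cross-origin resource sharing.

To configure your model deployment to support CORS, use the AMLResponse class in your entry script. When you use this class, you can set the headers on the response object.

The following example sets the Access-Control-Allow-Origin header for the response from the entry script: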

from azureml.contrib.services.aml_request import AMLRequest, rawhttp
from azureml.contrib.services.aml_response import AMLResponse


def init():
    print("This is init()")

@rawhttp
def run(request):
    print("This is run()")
    print("Request: [{0}]".format(request))
    if request.method == 'GET':
        # For this example, just return the URL for GET.
        # For a real-world solution, you would load the data from URL params or headers
        # and send it to the model. Then return the response.
        respBody = str.encode(request.full_path)
        resp = AMLResponse(respBody, 200)
        resp.headers["Allow"] = "OPTIONS, GET, POST"
        resp.headers["Access-Control-Allow-Methods"] = "OPTIONS, GET, POST"
        resp.headers['Access-Control-Allow-Origin'] = "http://www.example.com"
        resp.headers['Access-Control-Allow-Headers'] = "*"
        return resp
    elif request.method == 'POST':
        reqBody = request.get_data(False)
        # For a real-world solution, you would load the data from reqBody
        # and send it to the model. Then return the response.
        resp = AMLResponse(reqBody, 200)
        resp.headers["Allow"] = "OPTIONS, GET, POST"
        resp.headers["Access-Control-Allow-Methods"] = "OPTIONS, GET, POST"
        resp.headers['Access-Control-Allow-Origin'] = "http://www.example.com"
        resp.headers['Access-Control-Allow-Headers'] = "*"
        return resp
    elif request.method == 'OPTIONS':
        resp = AMLResponse("", 200)
        resp.headers["Allow"] = "OPTIONS, GET, POST"
        resp.headers["Access-Control-Allow-Methods"] = "OPTIONS, GET, POST"
        resp.headers['Access-Control-Allow-Origin'] = "http://www.example.com"
        resp.headers['Access-Control-Allow-Headers'] = "*"
        return resp
    else:
        return AMLResponse("bad request", 400)
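
Each branch in the preceding script sets the same four headers. One way to reduce the repetition is a small helper, sketched here. The helper name add_cors_headers is hypothetical, and the sketch uses a plain dictionary in place of the headers attribute of an AMLResponse object so that it runs without Azure Machine Learning packages.

```python
ALLOWED_METHODS = "OPTIONS, GET, POST"


def add_cors_headers(headers, origin="http://www.example.com"):
    """Set the CORS headers from the example on a dict-like headers object."""
    headers["Allow"] = ALLOWED_METHODS
    headers["Access-Control-Allow-Methods"] = ALLOWED_METHODS
    headers["Access-Control-Allow-Origin"] = origin
    headers["Access-Control-Allow-Headers"] = "*"
    return headers


# A plain dict stands in for resp.headers in this sketch.
headers = add_cors_headers({})
print(headers["Access-Control-Allow-Origin"])
```

In the entry script, the equivalent call would presumably be add_cors_headers(resp.headers) just before each return resp statement.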

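Important

If you need to test code that uses this class in your local development environment, you can install the components by using the following command: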

pip install azureml-contrib-services

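Warning

Azure Machine Learning routes only POST and GET requests to the containers that run the scoring service. Errors can result if browsers use HTTP OPTIONS requests to issue preflight requests.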

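Load registered models

There are two ways to locate models in your entry script:

AZUREML_MODEL_DIR: An environment variable that contains the path to the model location
Model.get_model_path: An API that returns the path to the model file by using the registered model name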

AZUREML_MODEL_DIR

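AZUREML_MODEL_DIR is an environment variable that's created during service deployment. You can use this environment variable to find the location of deployed models.

The following table describes possible values of AZUREML_MODEL_DIR for a varying number of deployed models:

Deployment	Environment variable value
Single model	The path to the folder that contains the model.
Multiple models	The path to the folder that contains all models. Models are located by name and version in this folder in the format `<model-name>/<version>`.

During model registration and deployment, models are placed in the AZUREML_MODEL_DIR path, and their original file names are preserved.

To get the path to a model file in your entry script, combine the environment variable with the file path you're looking for.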

Single model

The following example shows how to find the path when you have a single model:

import os

# In the following example, the model is a file.
model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'sklearn_regression_model.pkl')

# In the following example, the model is a folder that contains a file.
file_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'my_model_folder', 'sklearn_regression_model.pkl')

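Multiple models

The following example shows how to find the path when you have multiple models. In this scenario, two models are registered with the workspace:

my_first_model: This model contains one file, my_first_model.pkl, and has one version, 1.
my_second_model: This model contains one file, my_second_model.pkl, and has two versions, 1 and 2.

When you deploy the service, you provide both models in the deploy operation: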

from azureml.core import Workspace, Model

# Get a handle to the workspace.
ws = Workspace.from_config()

first_model = Model(ws, name="my_first_model", version=1)
second_model = Model(ws, name="my_second_model", version=2)
service = Model.deploy(ws, "myservice", [first_model, second_model], inference_config, deployment_config)

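In the Docker image that hosts the service, the AZUREML_MODEL_DIR environment variable contains the folder where the models are located. In this folder, each model is located in a folder path of <model-name>/<version>. In this path, <model-name> is the name of the registered model, and <version> is the version of the model. The files that make up the registered model are stored in these folders.

In this example, the path of the first model is $AZUREML_MODEL_DIR/my_first_model/1/my_first_model.pkl. The path of the second model is $AZUREML_MODEL_DIR/my_second_model/2/my_second_model.pkl.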

import os

# In the following example, the model is a file, and the deployment contains multiple models.
first_model_name = 'my_first_model'
first_model_version = '1'
first_model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), first_model_name, first_model_version, 'my_first_model.pkl')
second_model_name = 'my_second_model'
second_model_version = '2'
second_model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), second_model_name, second_model_version, 'my_second_model.pkl')
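
os.getenv returns None when AZUREML_MODEL_DIR isn't set, for example during local testing, and os.path.join then fails with a TypeError. The following sketch wraps the path construction in a helper that reports the problem more clearly. The function name model_file_path is hypothetical.

```python
import os


def model_file_path(model_name, version, file_name):
    """Build <AZUREML_MODEL_DIR>/<model-name>/<version>/<file-name> for a multi-model deployment."""
    model_dir = os.getenv("AZUREML_MODEL_DIR")
    if model_dir is None:
        raise RuntimeError(
            "AZUREML_MODEL_DIR is not set; this code expects to run in a deployed service."
        )
    # The version is converted to a string so that integer versions also work.
    return os.path.join(model_dir, model_name, str(version), file_name)
```

With this helper, the second model in the example would be located with model_file_path('my_second_model', 2, 'my_second_model.pkl').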

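Get model path

When you register a model, you provide a model name that's used for managing the model in the registry. You use this name with the Model.get_model_path method to retrieve the path of the model file or files on the local file system. If you register a folder or a collection of files, this API returns the path of the folder that contains those files.

When you register a model, you give it a name. The name corresponds to where the model is placed, either locally or during service deployment.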
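
As a minimal sketch of this approach, the following helper calls Model.get_model_path with a registered model name and an optional version. The helper name resolve_model_path is hypothetical, and the import is placed inside the function only so that the module can be loaded in environments where the azureml-core package isn't installed.

```python
def resolve_model_path(model_name, version=None):
    """Return the local path of a registered model's file or folder."""
    # Imported lazily; requires the azureml-core package (SDK v1).
    from azureml.core.model import Model

    # When version is None, the API is expected to use the latest version.
    return Model.get_model_path(model_name, version=version)
```

In an entry script, the init function could call resolve_model_path('my_first_model', version=1) and pass the result to joblib.load.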

Framework-specific examples

For more entry script examples for specific machine learning use cases, see the following articles:

Last updated on 2026-01-04