Interactive debugging with Visual Studio Code

Learn how to interactively debug Azure Machine Learning experiments, pipelines, and deployments using Visual Studio Code (VS Code) and debugpy.

Run and debug experiments locally

Use the Azure Machine Learning extension to validate, run, and debug your machine learning experiments before submitting them to the cloud.

Prerequisites

Note

On Windows, make sure to configure Docker to use Linux containers.

Tip

On Windows, using Docker with Windows Subsystem for Linux (WSL) 2 is not required, but it is highly recommended.

Important

Before running your experiment locally, make sure that Docker is running.

Debug experiment locally

  1. In VS Code, open the Azure Machine Learning extension view.

  2. Expand the subscription node containing your workspace. If you don't already have one, you can create an Azure Machine Learning workspace using the extension.

  3. Expand your workspace node.

  4. Right-click the Experiments node and select Create experiment. When the prompt appears, provide a name for your experiment.

  5. Expand the Experiments node, right-click the experiment you want to run, and select Run Experiment.

  6. From the list of options to run your experiment, select Locally.

  7. First-time use on Windows only. When prompted to allow File Share, select Yes. Enabling file share allows Docker to mount the directory containing your script to the container. It also allows Docker to store the logs and outputs from your run in a temporary directory on your system.

  8. Select Yes to debug your experiment. Otherwise, select No. Selecting No runs your experiment locally without attaching to the debugger.

  9. Select Create new Run Configuration to create your run configuration. The run configuration defines the script you want to run, its dependencies, and the datasets used. Alternatively, if you already have one, select it from the dropdown.

    1. Choose your environment. You can choose any of the Azure Machine Learning curated environments or create your own.
    2. Provide the name of the script you want to run. The path is relative to the directory opened in VS Code.
    3. Choose whether you want to use an Azure Machine Learning dataset. You can create Azure Machine Learning datasets using the extension.
    4. Debugpy is required in order to attach the debugger to the container running your experiment. To add debugpy as a dependency, select Add Debugpy. Otherwise, select Skip. If you don't add debugpy as a dependency, your experiment runs without attaching to the debugger.
    5. A configuration file containing your run configuration settings opens in the editor. If you're satisfied with the settings, select Submit experiment. Alternatively, you can open the command palette (View > Command Palette) from the menu bar and enter the Azure ML: Submit experiment command into the text box.
  10. Once your experiment is submitted, a Docker image containing your script and the configurations specified in your run configuration is created.

    When the Docker image build process begins, the contents of the 60_control_log.txt file stream to the output console in VS Code.

    Note

    The first time your Docker image is created, the build can take several minutes.

  11. Once your image is built, a prompt appears to start the debugger. Set your breakpoints in your script and select Start debugger when you're ready to start debugging. Doing so attaches the VS Code debugger to the container running your experiment. Alternatively, in the Azure Machine Learning extension, hover over the node for your current run and select the play icon to start the debugger.

    Important

    You cannot have multiple debug sessions for a single experiment. You can, however, debug two or more experiments using multiple VS Code instances.

At this point, you should be able to step through and debug your code using VS Code.

If at any point you want to cancel your run, right-click your run node and select Cancel run.

Similar to remote experiment runs, you can expand your run node to inspect the logs and outputs.

Tip

Docker images that use the same dependencies defined in your environment are reused between runs. However, if you run an experiment using a new or different environment, a new image is created. Since these images are saved to your local storage, it's recommended to remove old or unused Docker images. To remove images from your system, use the Docker CLI or the VS Code Docker extension.

Debug and troubleshoot machine learning pipelines

In some cases, you may need to interactively debug the Python code used in your ML pipeline. By using VS Code and debugpy, you can attach to the code as it runs in the training environment.

Prerequisites

  • An Azure Machine Learning workspace that is configured to use an Azure Virtual Network.

  • An Azure Machine Learning pipeline that uses Python scripts as part of the pipeline steps. For example, a PythonScriptStep.

  • An Azure Machine Learning Compute cluster, which is in the virtual network and is used by the pipeline for training.

  • A development environment that is in the virtual network. The development environment might be one of the following:

    • An Azure Virtual Machine in the virtual network
    • A Compute instance of Notebook VM in the virtual network
    • A client machine that has private network connectivity to the virtual network, either by VPN or via ExpressRoute.

Tip

Although you can work with Azure Machine Learning resources that are not behind a virtual network, using a virtual network is recommended.

How it works

Your ML pipeline steps run Python scripts. These scripts are modified to perform the following actions:

  1. Log the IP address of the host that they are running on. You use the IP address to connect the debugger to the script.

  2. Start the debugpy debug component, and wait for a debugger to connect.

  3. From your development environment, you monitor the logs created by the training process to find the IP address where the script is running.

  4. You tell VS Code the IP address to connect the debugger to by using a launch.json file.

  5. You attach the debugger and interactively step through the script.
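
The first action in the list above, logging the host's IP address, can be sketched with the standard library alone. This is a minimal illustration rather than the pipeline's own code: the helper name get_host_ip and the loopback fallback are assumptions made for this example. The hostname lookup itself is the same call that appears commented out in the train.py example later in this article.

```python
import socket


def get_host_ip() -> str:
    """Return an IPv4 address for this host, or 127.0.0.1 if the lookup fails."""
    try:
        # Resolve this machine's own hostname to an IPv4 address
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        # The hostname did not resolve; fall back to loopback (assumed behavior)
        return "127.0.0.1"


# Emit the address in the same "ip_address: ..." form used by the pipeline logs
print(f"ip_address: {get_host_ip()}")
```

On a compute node, the printed line would end up in the step's driver log, which is where steps 3 and 4 read it from.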

Configure Python scripts

To enable debugging, make the following changes to the Python script(s) used by steps in your ML pipeline:

  1. Add the following import statements:

    import argparse
    import os
    import debugpy
    import socket
    from azureml.core import Run
    
  2. Add the following arguments. These arguments allow you to enable the debugger as needed, and set the timeout for attaching the debugger:

    parser.add_argument('--remote_debug', action='store_true')
    parser.add_argument('--remote_debug_connection_timeout', type=int,
                        default=300,
                        help=f'Defines how much time the AML compute target '
                        f'will await a connection from a debugger client (VSCODE).')
    parser.add_argument('--remote_debug_client_ip', type=str,
                        help=f'Defines IP Address of VS Code client')
    parser.add_argument('--remote_debug_port', type=int,
                        default=5678,
                        help=f'Defines Port of VS Code client')
    
  3. Add the following statements. These statements load the current run context so that you can log the IP address of the node that the code is running on:

    global run
    run = Run.get_context()
    
  4. Add an if statement that starts debugpy and waits for a debugger to attach. If no debugger attaches before the timeout, the script continues as normal. Make sure to replace the HOST and PORT values in the listen function with your own.

    if args.remote_debug:
        print(f'Timeout for debug connection: {args.remote_debug_connection_timeout}')
        # Log the IP and port
        ip = args.remote_debug_client_ip
        if ip is None:
            # argparse leaves the attribute as None when the flag is omitted
            raise ValueError("Need to supply IP address for VS Code client")
        print(f'ip_address: {ip}')
        debugpy.listen(address=(ip, args.remote_debug_port))
        # Wait for a debugger to attach
        debugpy.wait_for_client()
        print(f'Debugger attached = {debugpy.is_client_connected()}')
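
debugpy.wait_for_client() takes no timeout argument, so the --remote_debug_connection_timeout value has to be enforced by the script itself. The sketch below shows one possible way to do that with a worker thread. The helper name wait_with_timeout is hypothetical, and the two lambda functions are stand-ins for a debugger that attaches immediately and one that never attaches.

```python
import threading
import time


def wait_with_timeout(wait_fn, timeout_seconds: float) -> bool:
    """Run a blocking wait_fn in a daemon thread; return True if it finished in time."""
    worker = threading.Thread(target=wait_fn, daemon=True)
    worker.start()
    # Give up waiting after timeout_seconds; the daemon thread does not block exit
    worker.join(timeout_seconds)
    return not worker.is_alive()


# Stand-in for a debugger that attaches right away
attached_quickly = wait_with_timeout(lambda: None, timeout_seconds=1.0)
# Stand-in for a debugger that never attaches within the allowed time
attached_late = wait_with_timeout(lambda: time.sleep(5), timeout_seconds=0.1)
print(f"quick: {attached_quickly}, late: {attached_late}")
```

In a pipeline script, the call might look like wait_with_timeout(debugpy.wait_for_client, args.remote_debug_connection_timeout). That usage assumes debugpy.wait_for_client() can safely be called from a worker thread, which is not verified here.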
    

The following Python example shows a simple train.py file that enables debugging:

# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

import argparse
import os
import debugpy
import socket
from azureml.core import Run

print("In train.py")
print("As a data scientist, this is where I use my training code.")

parser = argparse.ArgumentParser("train")

parser.add_argument("--input_data", type=str, help="input data")
parser.add_argument("--output_train", type=str, help="output_train directory")

# Argument check for remote debugging
parser.add_argument('--remote_debug', action='store_true')
parser.add_argument('--remote_debug_connection_timeout', type=int,
                    default=300,
                    help=f'Defines how much time the AML compute target '
                    f'will await a connection from a debugger client (VSCODE).')
parser.add_argument('--remote_debug_client_ip', type=str,
                    help=f'Defines IP Address of VS Code client')
parser.add_argument('--remote_debug_port', type=int,
                    default=5678,
                    help=f'Defines Port of VS Code client')

# Get run object, so we can find and log the IP of the host instance
global run
run = Run.get_context()

args = parser.parse_args()

# Start debugger if remote_debug is enabled
if args.remote_debug:
    print(f'Timeout for debug connection: {args.remote_debug_connection_timeout}')
    # Log the IP and port
    # ip = socket.gethostbyname(socket.gethostname())
    ip = args.remote_debug_client_ip
    if ip is None:
        # argparse leaves the attribute as None when the flag is omitted
        raise ValueError("Need to supply IP address for VS Code client")
    print(f'ip_address: {ip}')
    debugpy.listen(address=(ip, args.remote_debug_port))
    # Wait for a debugger to attach
    debugpy.wait_for_client()
    print(f'Debugger attached = {debugpy.is_client_connected()}')

print("Argument 1: %s" % args.input_data)
print("Argument 2: %s" % args.output_train)

if not (args.output_train is None):
    os.makedirs(args.output_train, exist_ok=True)
    print("%s created" % args.output_train)
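
Before submitting the pipeline, it can help to check locally how the debug arguments parse. The following sketch rebuilds only the debug-related part of the parser from train.py and parses two sample command lines. The build_parser helper is introduced for this illustration, and 192.0.2.10 is a placeholder address from the documentation-only range, not a real client.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Recreate the remote-debug arguments used by train.py."""
    parser = argparse.ArgumentParser("train")
    parser.add_argument("--remote_debug", action="store_true")
    parser.add_argument("--remote_debug_connection_timeout", type=int, default=300)
    parser.add_argument("--remote_debug_client_ip", type=str)
    parser.add_argument("--remote_debug_port", type=int, default=5678)
    return parser


# Debugging off: no flags supplied, so the IP attribute is None
off = build_parser().parse_args([])
# Debugging on: the flag and a placeholder client IP are supplied
on = build_parser().parse_args(
    ["--remote_debug", "--remote_debug_client_ip", "192.0.2.10"]
)

print(off.remote_debug, off.remote_debug_client_ip)
print(on.remote_debug, on.remote_debug_client_ip, on.remote_debug_port)
```

Note that omitting --remote_debug_client_ip leaves the attribute set to None rather than raising an exception, which is why the script needs an explicit check before calling debugpy.listen.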

Configure ML pipeline

To provide the Python packages needed to start debugpy and get the run context, create an environment and set pip_packages=['debugpy', 'azureml-sdk==<SDK-VERSION>']. Change the SDK version to match the one you are using. The following code snippet demonstrates how to create an environment:

# Use a RunConfiguration to specify some additional requirements for this step.
from azureml.core.runconfig import RunConfiguration
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.runconfig import DEFAULT_CPU_IMAGE

# create a new runconfig object
run_config = RunConfiguration()

# enable Docker 
run_config.environment.docker.enabled = True

# set Docker base image to the default CPU-based image
run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE

# use conda_dependencies.yml to create a conda environment in the Docker image for execution
run_config.environment.python.user_managed_dependencies = False

# specify CondaDependencies obj
run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'],
                                                                           pip_packages=['debugpy', 'azureml-sdk==<SDK-VERSION>'])

In the Configure Python scripts section, new arguments were added to the scripts used by your ML pipeline steps. The following code snippet demonstrates how to use these arguments to enable debugging for the component and set a timeout. It also demonstrates how to use the environment created earlier by setting runconfig=run_config:

# Use RunConfig from a pipeline step
step1 = PythonScriptStep(name="train_step",
                         script_name="train.py",
                         arguments=['--remote_debug', '--remote_debug_connection_timeout', 300,'--remote_debug_client_ip','<VS-CODE-CLIENT-IP>','--remote_debug_port',5678],
                         compute_target=aml_compute,
                         source_directory=source_directory,
                         runconfig=run_config,
                         allow_reuse=False)
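
To avoid editing the arguments list by hand each time debugging is switched on or off, the debug flags can be generated by a small helper. This is only a sketch: build_debug_arguments is a hypothetical function, not part of the Azure Machine Learning SDK, and 192.0.2.10 is a placeholder IP address.

```python
def build_debug_arguments(client_ip, port=5678, timeout=300, enabled=True):
    """Return the remote-debug arguments for a pipeline step, or [] when disabled."""
    if not enabled:
        return []
    return [
        "--remote_debug",
        "--remote_debug_connection_timeout", timeout,
        "--remote_debug_client_ip", client_ip,
        "--remote_debug_port", port,
    ]


# Debugging enabled with the default port and timeout
debug_args = build_debug_arguments("192.0.2.10")
# Debugging disabled: nothing is added to the step's arguments
no_debug_args = build_debug_arguments("192.0.2.10", enabled=False)
print(debug_args)
print(no_debug_args)
```

The result could then be passed as the arguments value of the PythonScriptStep, optionally appended to the step's other arguments.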

When the pipeline runs, each step creates a child run. If debugging is enabled, the modified script logs information similar to the following text in the 70_driver_log.txt for the child run:

Timeout for debug connection: 300
ip_address: 10.3.0.5

Save the ip_address value. It is used in the next section.
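
If the driver log has been downloaded as text, the ip_address value can be pulled out with a regular expression instead of being copied by hand. The sketch below assumes the log line has exactly the "ip_address: <value>" form shown above. The find_ip_address helper is a name chosen for this example.

```python
import re

# Matches a whole log line of the form "ip_address: <value>"
IP_LINE = re.compile(r"^ip_address:\s*(\S+)\s*$", re.MULTILINE)


def find_ip_address(log_text):
    """Return the first logged ip_address value, or None if the log has none."""
    match = IP_LINE.search(log_text)
    return match.group(1) if match else None


sample_log = "Timeout for debug connection: 300\nip_address: 10.3.0.5\n"
print(find_ip_address(sample_log))
```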

Tip

You can also find the IP address from the run logs for the child run for this pipeline step. For more information on viewing this information, see Monitor Azure ML experiment runs and metrics.

Configure development environment

  1. To install debugpy on your VS Code development environment, use the following command:

    python -m pip install --upgrade debugpy
    

    For more information on using debugpy with VS Code, see Remote Debugging.

  2. To configure VS Code to communicate with the Azure Machine Learning compute that is running the debugger, create a new debug configuration:

    1. From VS Code, select the Debug menu and then select Open configurations. A file named launch.json opens.

    2. In the launch.json file, find the line that contains "configurations": [, and insert the following text after it. Change the "host": "<IP-ADDRESS>" entry to the IP address returned in your logs from the previous section. Change the "localRoot": "${workspaceFolder}/code/step1" entry to a local directory that contains a copy of the script being debugged:

      {
          "name": "Azure Machine Learning Compute: remote debug",
          "type": "python",
          "request": "attach",
          "port": 5678,
          "host": "<IP-ADDRESS>",
          "redirectOutput": true,
          "pathMappings": [
              {
                  "localRoot": "${workspaceFolder}/code/step1",
                  "remoteRoot": "."
              }
          ]
      }
      

      Important

      If there are already other entries in the configurations section, add a comma (,) after the code that you inserted.

      Tip

      The best practice, especially for pipelines, is to keep the resources for scripts in separate directories so that code is relevant only for each of the steps. In this example, the localRoot example value references /code/step1.

      If you are debugging multiple scripts in different directories, create a separate configuration section for each script.

    3. Save the launch.json file.
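
When several pipeline steps are debugged, each needs its own configuration entry with a different host or localRoot. The following sketch generates such an entry as a Python dictionary and prints it as JSON for pasting into launch.json. The make_attach_config helper is hypothetical, and the host and path values are the sample values used in this section.

```python
import json


def make_attach_config(host, local_root, port=5678):
    """Build one attach configuration entry for launch.json."""
    return {
        "name": "Azure Machine Learning Compute: remote debug",
        "type": "python",
        "request": "attach",
        "port": port,
        "host": host,
        "redirectOutput": True,
        "pathMappings": [
            {"localRoot": local_root, "remoteRoot": "."}
        ],
    }


# Sample values: the logged IP address and the step's local source directory
config = make_attach_config("10.3.0.5", "${workspaceFolder}/code/step1")
print(json.dumps(config, indent=4))
```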

Connect the debugger

  1. Open VS Code and open a local copy of the script.

  2. Set breakpoints where you want the script to stop once you've attached.

  3. While the child process is running the script, and the Timeout for debug connection message is displayed in the logs, use the F5 key or select Debug. When prompted, select the Azure Machine Learning Compute: remote debug configuration. You can also select the debug icon from the side bar, select the Azure Machine Learning Compute: remote debug entry from the Debug dropdown menu, and then use the green arrow to attach the debugger.

    At this point, VS Code connects to debugpy on the compute node and stops at the breakpoint you set previously. You can now step through the code as it runs, view variables, etc.

    Note

    If the log displays an entry stating Debugger attached = False, then the timeout has expired and the script continued without the debugger. Submit the pipeline again and connect the debugger after the Timeout for debug connection message, and before the timeout expires.

Debug and troubleshoot deployments

In some cases, you may need to interactively debug the Python code contained in your model deployment, for example, if the entry script is failing and the reason cannot be determined by additional logging. By using VS Code and debugpy, you can attach to the code running inside the Docker container.

Important

This method of debugging does not work when using Model.deploy() and LocalWebservice.deploy_configuration to deploy a model locally. Instead, you must create an image using the Model.package() method.

Local web service deployments require a working Docker installation on your local system. For more information on using Docker, see the Docker Documentation. Note that when working with compute instances, Docker is already installed.

Configure development environment

  1. To install debugpy on your local VS Code development environment, use the following command:

    python -m pip install --upgrade debugpy
    

    For more information on using debugpy with VS Code, see Remote Debugging.

  2. To configure VS Code to communicate with the Docker image, create a new debug configuration:

    1. From VS Code, select the Debug menu in the Run extension and then select Open configurations. A file named launch.json opens.

    2. In the launch.json file, find the "configurations" item (the line that contains "configurations": [), and insert the following text after it.

      {
          "name": "Azure Machine Learning Deployment: Docker Debug",
          "type": "python",
          "request": "attach",
          "connect": {
              "port": 5678,
              "host": "0.0.0.0"
          },
          "pathMappings": [
              {
                  "localRoot": "${workspaceFolder}",
                  "remoteRoot": "/var/azureml-app"
              }
          ]
      }
      

      After insertion, the launch.json file should be similar to the following:

      {
      // Use IntelliSense to learn about possible attributes.
      // Hover to view descriptions of existing attributes.
      // For more information, visit: https://go.microsoft.com/fwlink/linkid=830387
      "version": "0.2.0",
      "configurations": [
          {
              "name": "Python: Current File",
              "type": "python",
              "request": "launch",
              "program": "${file}",
              "console": "integratedTerminal"
          },
          {
              "name": "Azure Machine Learning Deployment: Docker Debug",
              "type": "python",
              "request": "attach",
              "connect": {
                  "port": 5678,
                  "host": "0.0.0.0"
                  },
              "pathMappings": [
                  {
                      "localRoot": "${workspaceFolder}",
                      "remoteRoot": "/var/azureml-app"
                  }
              ]
          }
          ]
      }
      

      Important

      If there are already other entries in the configurations section, add a comma (,) after the code that you inserted.

      This section attaches to the Docker container using port 5678.

    3. Save the launch.json file.
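
The comma requirement above comes from "configurations" being a JSON array. When the file content is handled as data instead of text, adding an entry is a list operation and the separators take care of themselves. The sketch below works on a dictionary that mirrors the launch.json structure. The add_configuration helper is hypothetical, and the sketch assumes any // comments have already been removed, because Python's standard json module does not accept them.

```python
import copy


def add_configuration(launch, new_config):
    """Return a copy of launch with new_config added, replacing any same-named entry."""
    updated = copy.deepcopy(launch)
    configs = updated.setdefault("configurations", [])
    for index, existing in enumerate(configs):
        if existing.get("name") == new_config["name"]:
            # Replace the entry in place instead of adding a duplicate
            configs[index] = new_config
            break
    else:
        configs.append(new_config)
    return updated


launch = {
    "version": "0.2.0",
    "configurations": [
        {"name": "Python: Current File", "type": "python", "request": "launch"}
    ],
}
docker_debug = {
    "name": "Azure Machine Learning Deployment: Docker Debug",
    "type": "python",
    "request": "attach",
    "connect": {"port": 5678, "host": "0.0.0.0"},
    "pathMappings": [
        {"localRoot": "${workspaceFolder}", "remoteRoot": "/var/azureml-app"}
    ],
}

merged = add_configuration(launch, docker_debug)
print([entry["name"] for entry in merged["configurations"]])
```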

Create an image that includes debugpy

  1. Modify the conda environment for your deployment so that it includes debugpy. The following example demonstrates adding it using the pip_packages parameter:

    from azureml.core.conda_dependencies import CondaDependencies 
    
    
    # Usually a good idea to choose specific version numbers
    # so training is made on same packages as scoring
    myenv = CondaDependencies.create(conda_packages=['numpy==1.15.4',
                                'scikit-learn==0.19.1', 'pandas==0.23.4'],
                                 pip_packages = ['azureml-defaults==1.0.83', 'debugpy'])
    
    with open("myenv.yml","w") as f:
        f.write(myenv.serialize_to_string())
    
  2. To start debugpy and wait for a connection when the service starts, add the following to the top of your score.py file:

    import debugpy
    # Allows other computers to attach to debugpy on this IP address and port.
    debugpy.listen(('0.0.0.0', 5678))
    # Block until a debugger attaches; wait_for_client() takes no timeout argument.
    debugpy.wait_for_client()
    print("Debugger attached...")
    
  3. Create an image based on the environment definition and pull the image to the local registry.

    Note

    This example assumes that ws points to your Azure Machine Learning workspace, and that model is the model being deployed. The myenv.yml file contains the conda dependencies created in step 1.

    from azureml.core.conda_dependencies import CondaDependencies
    from azureml.core.model import InferenceConfig, Model
    from azureml.core.environment import Environment
    
    
    myenv = Environment.from_conda_specification(name="env", file_path="myenv.yml")
    myenv.docker.base_image = None
    myenv.docker.base_dockerfile = "FROM mcr.microsoft.com/azureml/base:intelmpi2018.3-ubuntu16.04"
    inference_config = InferenceConfig(entry_script="score.py", environment=myenv)
    package = Model.package(ws, [model], inference_config)
    package.wait_for_creation(show_output=True)  # Or show_output=False to hide the Docker build logs.
    package.pull()
    

    Once the image has been created and downloaded (this process may take more than 10 minutes), the image path is displayed in a message similar to the following. The path includes the repository, name, and tag, which in this case is also the image digest:

    Status: Downloaded newer image for myregistry.azurecr.io/package@sha256:<image-digest>
    
  4. To make it easier to work with the image locally, you can use the following command to add a tag for this image. Replace myimagepath in the following command with the location value from the previous step.

    docker tag myimagepath debug:1
    

    For the rest of the steps, you can refer to the local image as debug:1 instead of the full image path value.
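
The image path in steps 3 and 4 can be extracted from the status message rather than copied by hand. The sketch below assumes the message has exactly the "Status: Downloaded newer image for ..." form shown in step 3. Both helper functions are hypothetical, and the code only builds the docker tag command as a list of arguments without running it.

```python
import shlex

# Assumed prefix of the final status line printed after the image is pulled
STATUS_PREFIX = "Status: Downloaded newer image for "


def image_path_from_status(line):
    """Extract the image path from the status line printed after the pull."""
    line = line.strip()
    if not line.startswith(STATUS_PREFIX):
        raise ValueError("Not a 'Downloaded newer image' status line")
    return line[len(STATUS_PREFIX):]


def tag_command(image_path, tag="debug:1"):
    """Build the docker tag command as an argument list."""
    return ["docker", "tag", image_path, tag]


status = ("Status: Downloaded newer image for "
          "myregistry.azurecr.io/package@sha256:<image-digest>")
command = tag_command(image_path_from_status(status))
# shlex.join quotes the arguments so the line is safe to paste into a POSIX shell
print(shlex.join(command))
```

To execute the command from Python, the list could be passed to subprocess.run(command, check=True) on a machine where Docker is installed.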

Debug the service

Tip

If you set a timeout for the debugpy connection in the score.py file, you must connect VS Code to the debug session before the timeout expires. Start VS Code, open the local copy of score.py, set a breakpoint, and have it ready to go before using the steps in this section.

For more information on debugging and setting breakpoints, see Debugging.
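
Because score.py starts debugpy as soon as it is loaded, the same file blocks on every start until a debugger attaches. One option is to guard the debugpy calls behind an environment variable so the debugger is only started when requested. The following is a sketch under stated assumptions: the SCORE_DEBUG variable name and both helper functions are invented for this example and are not part of Azure Machine Learning or debugpy.

```python
import os


def debug_requested(environ=os.environ):
    """Return True when the (hypothetical) SCORE_DEBUG variable is switched on."""
    return environ.get("SCORE_DEBUG", "").lower() in {"1", "true", "yes"}


def maybe_start_debugger(environ=os.environ, host="0.0.0.0", port=5678):
    """Start debugpy and wait for a client only when debugging was requested."""
    if not debug_requested(environ):
        return False
    # Imported lazily so debugpy is only needed when debugging is enabled
    import debugpy
    debugpy.listen((host, port))
    debugpy.wait_for_client()
    return True


# With an empty environment, the debugger is skipped and debugpy is never imported
print(maybe_start_debugger({}))
```

With this guard in place, the variable could be set when the container is started, for example with the -e option of docker run.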

  1. To start a Docker container using the image, use the following command:

    docker run -it --name debug -p 8000:5001 -p 5678:5678 -v <my_local_path_to_score.py>:/var/azureml-app/score.py debug:1 /bin/bash
    

    This mounts your local score.py over the one in the container. Therefore, any changes made in the editor are automatically reflected in the container.

  2. For a better experience, you can open the container in a new VS Code window. Select the Docker extension from the VS Code side bar and find the local container you created (in this document, debug:1). Right-click the container and select "Attach Visual Studio Code". A new VS Code window opens automatically and shows the inside of the container.

    (Screenshot: the container VS Code interface)

  3. Inside the container, run the following command in the shell:

    runsvdir /var/runit
    

    You can then see the following output in the shell inside your container:

    (Screenshot: console output from the running container)

  4. To attach VS Code to debugpy inside the container, open VS Code and use the F5 key or select Debug. When prompted, select the Azure Machine Learning Deployment: Docker Debug configuration. You can also select the Run extension icon from the side bar, select the Azure Machine Learning Deployment: Docker Debug entry from the Debug dropdown menu, and then use the green arrow to attach the debugger.

    (Screenshot: the debug icon, start debugging button, and configuration selector)

    After you select the green arrow and attach the debugger, the container VS Code interface shows some new information:

    (Screenshot: the container debugger attached information)

    In your main VS Code interface, you can see the following:

    (Screenshot: VS Code breakpoint in score.py)

The local score.py, which is mounted into the container, has now stopped at the breakpoint you set. At this point, VS Code is connected to debugpy inside the Docker container and execution is paused at that breakpoint. You can now step through the code as it runs, view variables, etc.

For more information on using VS Code to debug Python, see Debug your Python code.

Stop the container

To stop the container, use the following command:

docker stop debug
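
The docker run and docker stop commands used in this section share the container name, so it can be convenient to build both from one place. The sketch below only assembles the argument lists and does not run Docker. The helper names are hypothetical, and the score path is resolved to an absolute path because Docker bind mounts need an absolute host path.

```python
from pathlib import Path


def docker_run_command(score_path, image="debug:1", name="debug"):
    """Build the docker run command that mounts a local score.py into the container."""
    local = Path(score_path).resolve()  # bind mounts need an absolute host path
    return [
        "docker", "run", "-it", "--name", name,
        "-p", "8000:5001", "-p", "5678:5678",
        "-v", f"{local}:/var/azureml-app/score.py",
        image, "/bin/bash",
    ]


def docker_stop_command(name="debug"):
    """Build the docker stop command for the same container name."""
    return ["docker", "stop", name]


run_cmd = docker_run_command("score.py")
stop_cmd = docker_stop_command()
print(run_cmd)
print(stop_cmd)
```

Either list could be passed to subprocess.run on a machine with Docker installed. Because of the -it option, the run command is intended to be started from an interactive terminal.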

Next steps

Now that you've set up VS Code Remote, you can use a compute instance as remote compute from VS Code to interactively debug your code.

Learn more about troubleshooting: