CLI (v2) 自动化 ML 图像对象检测作业 YAML 架构

适用于:Azure CLI ml 扩展 v2(当前)

源 JSON 架构可在 https://azuremlsdk2.blob.core.chinacloudapi.cn/preview/0.0.1/autoMLImageObjectDetectionJob.schema.json 中找到。

注意

本文档中详细介绍的 YAML 语法基于最新版本的 ML CLI v2 扩展的 JSON 架构。 此语法必定仅适用于最新版本的 ML CLI v2 扩展。 可以在 https://azuremlschemasprod.azureedge.net/ 上查找早期扩展版本的架构。

YAML 语法

有关 YAML 语法中所有键的信息,请参阅图像分类任务的 YAML 语法。 在这里,我们只描述那些其值不同于为图像分类任务指定的值的键。

密钥 类型 说明 允许的值 默认值
task const 必需。 AutoML 任务的类型。 image_object_detection image_object_detection
primary_metric 字符串 将由 AutoML 针对模型选择进行优化的指标。 mean_average_precision mean_average_precision
training_parameters 对象 字典,包含作业的训练参数。 提供一个对象,该对象具有以下部分列出的键。
- yolov5 模型特定的超参数(如果使用 yolov5 进行对象检测)
- 与模型无关的超参数
- 对象检测和实例分段任务特定的超参数

有关示例,请参阅支持的模型体系结构部分。

备注

az ml job 命令可用于管理 Azure 机器学习作业。

示例

示例 GitHub 存储库中提供了示例。 下面显示了与图像对象检测作业相关的示例。

YAML:AutoML 图像对象检测作业

$schema: https://azuremlsdk2.blob.core.chinacloudapi.cn/preview/0.0.1/autoMLJob.schema.json
type: automl

# <experiment_name>
experiment_name: dpv2-cli-automl-image-object-detection-experiment
# </experiment_name>
description: An Image Object Detection job using fridge items dataset

# <compute_settings>
compute: azureml:gpu-cluster
# </compute_settings>

# <task_settings>
task: image_object_detection
log_verbosity: debug
primary_metric: mean_average_precision
# </task_settings>

# <mltable_settings>
target_column_name: label
training_data:
  # Update the path, if prepare_data.py is using data_path other than "./data"
  path: data/training-mltable-folder
  type: mltable
validation_data:
  # Update the path, if prepare_data.py is using data_path other than "./data"
  path: data/validation-mltable-folder
  type: mltable
# </mltable_settings>

# <limit_settings>
limits:
  timeout_minutes: 60
  max_trials: 10
  max_concurrent_trials: 2
# </limit_settings>

# <fixed_settings>
training_parameters:
  early_stopping: True
  evaluation_frequency: 1
# </fixed_settings>

# <sweep_settings>
sweep:
  sampling_algorithm: random
  early_termination:
    type: bandit
    evaluation_interval: 2
    slack_factor: 0.2
    delay_evaluation: 6
# </sweep_settings>

# <search_space_settings>
search_space:
  - model_name:
      type: choice
      values: [yolov5]
    learning_rate:
      type: uniform
      min_value: 0.0001
      max_value: 0.01
    model_size:
      type: choice
      values: ['small', 'medium']
  - model_name:
      type: choice
      values: [fasterrcnn_resnet50_fpn]
    learning_rate:
      type: uniform
      min_value: 0.0001
      max_value: 0.001
    optimizer:
      type: choice
      values: ['sgd', 'adam', 'adamw']
    min_size:
      type: choice
      values: [600, 800]
# </search_space_settings>

YAML:AutoML 图像对象检测管道作业

$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline

description: Pipeline using AutoML Image Object Detection task

display_name: pipeline-with-image-object-detection
experiment_name: pipeline-with-automl

settings:
  default_compute: azureml:gpu-cluster

inputs:
  image_object_detection_training_data:
    type: mltable
    # Update the path, if prepare_data.py is using data_path other than "./data"
    path: data/training-mltable-folder
  image_object_detection_validation_data:
    type: mltable
    # Update the path, if prepare_data.py is using data_path other than "./data"
    path: data/validation-mltable-folder

jobs:
  image_object_detection_node:
    type: automl
    task: image_object_detection
    log_verbosity: info
    primary_metric: mean_average_precision
    limits:
      timeout_minutes: 180
      max_trials: 10
      max_concurrent_trials: 2
    target_column_name: label
    training_data: ${{parent.inputs.image_object_detection_training_data}}
    validation_data: ${{parent.inputs.image_object_detection_validation_data}}
    training_parameters:
      early_stopping: True
      evaluation_frequency: 1
    sweep:
      sampling_algorithm: random
      early_termination:
        type: bandit
        evaluation_interval: 2
        slack_factor: 0.2
        delay_evaluation: 6
    search_space:
      - model_name:
          type: choice
          values: [yolov5]
        learning_rate:
          type: uniform
          min_value: 0.0001
          max_value: 0.001
        model_size:
          type: choice
          values: ['small', 'medium']

      - model_name:
          type: choice
          values: [fasterrcnn_resnet50_fpn]
        learning_rate:
          type: uniform
          min_value: 0.0001
          max_value: 0.001
        optimizer:
          type: choice
          values: ['sgd', 'adam', 'adamw']
        min_size:
          type: choice
          values: [600, 800]
    # currently need to specify outputs "mlflow_model" explicitly to reference it in following nodes
    outputs:
      best_model:
        type: mlflow_model
  register_model_node:
    type: command
    component: file:./components/component_register_model.yaml
    inputs:
      model_input_path: ${{parent.jobs.image_object_detection_node.outputs.best_model}}
      model_base_name: fridge_items_object_detection_model
      

后续步骤