Image Analysis cognitive skill

The Image Analysis skill extracts a rich set of visual features based on the image content. For example, you can generate a caption from an image, generate tags, or identify celebrities and landmarks. This skill uses the machine learning models provided by Computer Vision in Cognitive Services.

Note

Small volumes (under 20 transactions) can be executed for free in Azure Cognitive Search, but larger workloads require attaching a billable Cognitive Services resource. Charges accrue when calling APIs in Cognitive Services, and for image extraction as part of the document-cracking stage in Azure Cognitive Search. There are no charges for text extraction from documents.

Execution of built-in skills is charged at the existing Cognitive Services pay-in-advance price. Image extraction pricing is described on the Azure Cognitive Search pricing page.

@odata.type

Microsoft.Skills.Vision.ImageAnalysisSkill

Skill parameters

Parameters are case-sensitive.

Parameter name Description
defaultLanguageCode A string indicating the language to return. The service returns recognition results in the specified language. If this parameter is not specified, the default value is "en".

Supported languages are:
en - English (default)
es - Spanish
ja - Japanese
pt - Portuguese
zh - Simplified Chinese
visualFeatures An array of strings indicating the visual feature types to return. Valid visual feature types include:
  • adult - detects whether the image is pornographic in nature (depicts nudity or a sex act) or gory (depicts extreme violence or blood). Sexually suggestive content (also known as racy content) is also detected.
  • brands - detects various brands within an image, including the approximate location. The brands visual feature is only available in English.
  • categories - categorizes image content according to a taxonomy defined in the Cognitive Services Computer Vision documentation.
  • description - describes the image content with a complete sentence in supported languages.
  • faces - detects whether faces are present. If present, generates coordinates, gender, and age.
  • objects - detects various objects within an image, including the approximate location. The objects visual feature is only available in English.
  • tags - tags the image with a detailed list of words related to the image content.
Names of visual features are case-sensitive. Note that the color and imageType visual features have been deprecated, but this functionality can still be accessed via a custom skill.
details An array of strings indicating which domain-specific details to return. Valid detail types include:
  • celebrities - identifies celebrities if detected in the image.
  • landmarks - identifies landmarks if detected in the image.
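The parameter rules above can be checked before a skillset is submitted. The following is a minimal Python sketch, not part of any SDK: the function name build_image_analysis_skill is hypothetical, the allowed values are copied from the table above, and rejecting brands or objects for a non-English language is an assumption drawn from the "only available in English" notes rather than documented service behavior.

```python
import json

# Allowed values copied from the parameter table above.
SUPPORTED_LANGUAGES = {"en", "es", "ja", "pt", "zh"}
VISUAL_FEATURES = {"adult", "brands", "categories", "description",
                   "faces", "objects", "tags"}
DETAILS = {"celebrities", "landmarks"}
ENGLISH_ONLY_FEATURES = {"brands", "objects"}


def build_image_analysis_skill(visual_features, details=(), language="en"):
    """Return a skill definition dict; raise ValueError on invalid parameters."""
    unknown = [f for f in visual_features if f not in VISUAL_FEATURES]
    if unknown:
        # Names are case-sensitive, so "Tags" is rejected here.
        raise ValueError(f"unsupported visual features: {unknown}")
    unknown = [d for d in details if d not in DETAILS]
    if unknown:
        raise ValueError(f"unsupported details: {unknown}")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    english_only = sorted(ENGLISH_ONLY_FEATURES.intersection(visual_features))
    if language != "en" and english_only:
        # Assumption: treat English-only features as an error up front.
        raise ValueError(f"{english_only} are only available in English")

    skill = {
        "@odata.type": "#Microsoft.Skills.Vision.ImageAnalysisSkill",
        "context": "/document/normalized_images/*",
        "defaultLanguageCode": language,
        "visualFeatures": list(visual_features),
        "inputs": [{"name": "image", "source": "/document/normalized_images/*"}],
        "outputs": [{"name": name} for name in visual_features],
    }
    if details:
        skill["details"] = list(details)
    return skill


skill = build_image_analysis_skill(["tags", "faces"], details=["landmarks"])
print(json.dumps(skill, indent=4))
```

Validating locally like this only catches mistakes in the values listed on this page; the service remains the authority on what it accepts.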

Skill inputs

Input name Description
image Complex Type. Currently only works with the "/document/normalized_images" field, produced by the Azure Blob indexer when imageAction is set to a value other than none. See the sample for more information.
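Because the image input depends on the indexer producing normalized images, it can help to see where imageAction is set. The sketch below builds the relevant fragment of an indexer definition as a Python dict. The helper name indexer_image_parameters is hypothetical, and the parameters/configuration layout and the dataToExtract and generateNormalizedImages values are assumptions taken from the blob indexer configuration rather than from this page, so verify them against the indexer reference.

```python
import json


def indexer_image_parameters(image_action="generateNormalizedImages"):
    """Return the indexer "parameters" fragment that enables image extraction."""
    if image_action == "none":
        # With imageAction set to none, /document/normalized_images is not produced.
        raise ValueError("imageAction must be a value other than 'none'")
    return {
        "parameters": {
            "configuration": {
                # Assumed property names from the blob indexer configuration.
                "dataToExtract": "contentAndMetadata",
                "imageAction": image_action,
            }
        }
    }


print(json.dumps(indexer_image_parameters(), indent=4))
```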

Sample skill definition

        {
            "description": "Extract image analysis.",
            "@odata.type": "#Microsoft.Skills.Vision.ImageAnalysisSkill",
            "context": "/document/normalized_images/*",
            "defaultLanguageCode": "en",
            "visualFeatures": [
                "tags",
                "categories",
                "description",
                "faces",
                "brands"
            ],
            "inputs": [
                {
                    "name": "image",
                    "source": "/document/normalized_images/*"
                }
            ],
            "outputs": [
                {
                    "name": "categories"
                },
                {
                    "name": "tags"
                },
                {
                    "name": "description"
                },
                {
                    "name": "faces"
                },
                {
                    "name": "brands"
                }
            ]
        }

Sample index (for only the categories, description, faces and tags fields)

{
    "fields": [
        {
            "name": "id",
            "type": "Edm.String",
            "key": true,
            "searchable": true,
            "filterable": false,
            "facetable": false,
            "sortable": true
        },
        {
            "name": "blob_uri",
            "type": "Edm.String",
            "searchable": true,
            "filterable": false,
            "facetable": false,
            "sortable": true
        },
        {
            "name": "content",
            "type": "Edm.String",
            "sortable": false,
            "searchable": true,
            "filterable": false,
            "facetable": false
        },
        {
            "name": "categories",
            "type": "Collection(Edm.ComplexType)",
            "fields": [
                {
                    "name": "name",
                    "type": "Edm.String",
                    "searchable": true,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "score",
                    "type": "Edm.Double",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "detail",
                    "type": "Edm.ComplexType",
                    "fields": [
                        {
                            "name": "celebrities",
                            "type": "Collection(Edm.ComplexType)",
                            "fields": [
                                {
                                    "name": "name",
                                    "type": "Edm.String",
                                    "searchable": true,
                                    "filterable": false,
                                    "facetable": false
                                },
                                {
                                    "name": "faceBoundingBox",
                                    "type": "Collection(Edm.ComplexType)",
                                    "fields": [
                                        {
                                            "name": "x",
                                            "type": "Edm.Int32",
                                            "searchable": false,
                                            "filterable": false,
                                            "facetable": false
                                        },
                                        {
                                            "name": "y",
                                            "type": "Edm.Int32",
                                            "searchable": false,
                                            "filterable": false,
                                            "facetable": false
                                        }
                                    ]
                                },
                                {
                                    "name": "confidence",
                                    "type": "Edm.Double",
                                    "searchable": false,
                                    "filterable": false,
                                    "facetable": false
                                }
                            ]
                        },
                        {
                            "name": "landmarks",
                            "type": "Collection(Edm.ComplexType)",
                            "fields": [
                                {
                                    "name": "name",
                                    "type": "Edm.String",
                                    "searchable": true,
                                    "filterable": false,
                                    "facetable": false
                                },
                                {
                                    "name": "confidence",
                                    "type": "Edm.Double",
                                    "searchable": false,
                                    "filterable": false,
                                    "facetable": false
                                }
                            ]
                        }
                    ]
                }
            ]
        },
        {
            "name": "description",
            "type": "Collection(Edm.ComplexType)",
            "fields": [
                {
                    "name": "tags",
                    "type": "Collection(Edm.String)",
                    "searchable": true,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "captions",
                    "type": "Collection(Edm.ComplexType)",
                    "fields": [
                        {
                            "name": "text",
                            "type": "Edm.String",
                            "searchable": true,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "confidence",
                            "type": "Edm.Double",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        }
                    ]
                }
            ]
        },
        {
            "name": "faces",
            "type": "Collection(Edm.ComplexType)",
            "fields": [
                {
                    "name": "age",
                    "type": "Edm.Int32",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "gender",
                    "type": "Edm.String",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "faceBoundingBox",
                    "type": "Collection(Edm.ComplexType)",
                    "fields": [
                        {
                            "name": "x",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "y",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        }
                    ]
                }
            ]
        },
        {
            "name": "tags",
            "type": "Collection(Edm.ComplexType)",
            "fields": [
                {
                    "name": "name",
                    "type": "Edm.String",
                    "searchable": true,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "confidence",
                    "type": "Edm.Double",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                }
            ]
        }
    ]
}

Sample output field mapping (for the above index)

    "outputFieldMappings": [
        {
            "sourceFieldName": "/document/normalized_images/*/categories/*",
            "targetFieldName": "categories"
        },
        {
            "sourceFieldName": "/document/normalized_images/*/tags/*",
            "targetFieldName": "tags"
        },
        {
            "sourceFieldName": "/document/normalized_images/*/description",
            "targetFieldName": "description"
        },
        {
            "sourceFieldName": "/document/normalized_images/*/faces/*",
            "targetFieldName": "faces"
        },
        {
            "sourceFieldName": "/document/normalized_images/*/brands/*/name",
            "targetFieldName": "brands"
        }
    ]

Variation on output field mappings (nested properties)

You can define output field mappings to lower-level properties, such as just landmarks or celebrities. In this case, make sure your index schema has a dedicated field for that property, such as the celebrities field targeted in the mapping below.

    "outputFieldMappings": [
        {
            "sourceFieldName": "/document/normalized_images/*/categories/detail/celebrities/*",
            "targetFieldName": "celebrities"
        }
    ]
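Both mapping styles follow the same pattern: a path relative to each normalized image is paired with an index field name. As an illustrative sketch (the helper names output_field_mapping and image_mappings are hypothetical, not part of any SDK), the mappings could be generated like this:

```python
import json

# Root path of each normalized image in the enriched document.
IMAGE_ROOT = "/document/normalized_images/*"


def output_field_mapping(source_path, target_field):
    """Build one entry of the indexer's outputFieldMappings array."""
    return {"sourceFieldName": source_path, "targetFieldName": target_field}


def image_mappings(targets):
    """targets maps an index field name to a path relative to IMAGE_ROOT."""
    return [
        output_field_mapping(f"{IMAGE_ROOT}/{relative_path}", field)
        for field, relative_path in targets.items()
    ]


mappings = image_mappings({
    "tags": "tags/*",
    # Nested property: celebrities live under categories/detail.
    "celebrities": "categories/detail/celebrities/*",
})
print(json.dumps({"outputFieldMappings": mappings}, indent=4))
```

Each target field named here still has to exist in the index schema with a compatible type.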

Sample input

{
    "values": [
        {
            "recordId": "1",
            "data": {
                "image": {
                    "data": "BASE64 ENCODED STRING OF A JPEG IMAGE",
                    "width": 500,
                    "height": 300,
                    "originalWidth": 5000,
                    "originalHeight": 3000,
                    "rotationFromOriginal": 90,
                    "contentOffset": 500,
                    "pageNumber": 2
                }
            }
        }
    ]
}
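In an indexing pipeline this payload is produced by the indexer, but when mocking the skill or writing tests it can be useful to construct a record of the same shape by hand. The following is a rough sketch; build_skill_input is a hypothetical helper, and the placeholder bytes stand in for a real JPEG.

```python
import base64
import json


def build_skill_input(image_bytes, width, height, record_id="1", **extra):
    """Wrap raw image bytes in the record shape shown in the sample input."""
    image = {
        # The "data" property carries the image as a base64-encoded string.
        "data": base64.b64encode(image_bytes).decode("ascii"),
        "width": width,
        "height": height,
    }
    # Optional properties such as originalWidth or pageNumber pass through as-is.
    image.update(extra)
    return {"values": [{"recordId": record_id, "data": {"image": image}}]}


fake_jpeg = b"\xff\xd8\xff\xe0 placeholder bytes, not a real image"
payload = build_skill_input(fake_jpeg, 500, 300, pageNumber=2)
print(json.dumps(payload, indent=4))
```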

Sample output

{
  "values": [
    {
      "recordId": "1",
      "data": {
        "categories": [
          {
            "name": "abstract_",
            "score": 0.00390625
          },
          {
            "name": "people_",
            "score": 0.83984375,
            "detail": {
              "celebrities": [
                {
                  "name": "Satya Nadella",
                  "faceBoundingBox": [
                        {
                            "x": 273,
                            "y": 309
                        },
                        {
                            "x": 395,
                            "y": 309
                        },
                        {
                            "x": 395,
                            "y": 431
                        },
                        {
                            "x": 273,
                            "y": 431
                        }
                    ],
                  "confidence": 0.999028444
                }
              ],
              "landmarks": [
                {
                  "name": "Forbidden City",
                  "confidence": 0.9978346
                }
              ]
            }
          }
        ],
        "adult": {
          "isAdultContent": false,
          "isRacyContent": false,
          "isGoryContent": false,
          "adultScore": 0.0934349000453949,
          "racyScore": 0.068613491952419281,
          "goreScore": 0.08928389008070282
        },
        "tags": [
          {
            "name": "person",
            "confidence": 0.98979085683822632
          },
          {
            "name": "man",
            "confidence": 0.94493889808654785
          },
          {
            "name": "outdoor",
            "confidence": 0.938492476940155
          },
          {
            "name": "window",
            "confidence": 0.89513939619064331
          }
        ],
        "description": {
          "tags": [
            "person",
            "man",
            "outdoor",
            "window",
            "glasses"
          ],
          "captions": [
            {
              "text": "Satya Nadella sitting on a bench",
              "confidence": 0.48293603002174407
            }
          ]
        },
        "requestId": "0dbec5ad-a3d3-4f7e-96b4-dfd57efe967d",
        "metadata": {
          "width": 1500,
          "height": 1000,
          "format": "Jpeg"
        },
        "faces": [
          {
            "age": 44,
            "gender": "Male",
            "faceBoundingBox": [
                {
                    "x": 1601,
                    "y": 395
                },
                {
                    "x": 1653,
                    "y": 395
                },
                {
                    "x": 1653,
                    "y": 447
                },
                {
                    "x": 1601,
                    "y": 447
                }
            ]
          }
        ],
        "objects": [
          {
            "rectangle": {
              "x": 25,
              "y": 43,
              "w": 172,
              "h": 140
            },
            "object": "person",
            "confidence": 0.931
          }
        ],
        "brands": [
          {
            "name": "Microsoft",
            "confidence": 0.903,
            "rectangle": {
              "x": 20,
              "y": 97,
              "w": 62,
              "h": 52
            }
          }
        ]
      }
    }
  ]
}
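A consumer of this output often wants only the high-confidence tags and the best caption. The sketch below shows one way to read those from a record's data object. The function names and the 0.9 threshold are illustrative choices, and the sample data is trimmed from the output above.

```python
def confident_tags(record_data, min_confidence=0.9):
    """Return tag names whose confidence meets the threshold, in original order."""
    return [
        tag["name"]
        for tag in record_data.get("tags", [])
        if tag["confidence"] >= min_confidence
    ]


def best_caption(record_data):
    """Return the text of the highest-confidence caption, or None if absent."""
    captions = record_data.get("description", {}).get("captions", [])
    if not captions:
        return None
    return max(captions, key=lambda caption: caption["confidence"])["text"]


# Trimmed copy of the "data" object from the sample output above.
record_data = {
    "tags": [
        {"name": "person", "confidence": 0.98979085683822632},
        {"name": "man", "confidence": 0.94493889808654785},
        {"name": "outdoor", "confidence": 0.938492476940155},
        {"name": "window", "confidence": 0.89513939619064331},
    ],
    "description": {
        "tags": ["person", "man", "outdoor", "window", "glasses"],
        "captions": [
            {"text": "Satya Nadella sitting on a bench",
             "confidence": 0.48293603002174407}
        ],
    },
}

print(confident_tags(record_data))  # ['person', 'man', 'outdoor']
print(best_caption(record_data))    # Satya Nadella sitting on a bench
```

Both helpers tolerate missing outputs, since a property is present only when its visual feature was requested.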

Error cases

In the following error cases, no elements are extracted.

Error code Description
NotSupportedLanguage The language provided is not supported.
InvalidImageUrl Image URL is badly formatted or not accessible.
InvalidImageFormat Input data is not a valid image.
InvalidImageSize Input image is too large.
NotSupportedVisualFeature Specified feature type is not valid.
NotSupportedImage Unsupported image, for example, child pornography.
InvalidDetails Unsupported domain-specific model.

If you get an error similar to "One or more skills are invalid. Details: Error in skill #<num>: Outputs are not supported by skill: Landmarks", check the path. Both celebrities and landmarks are properties under detail.

"categories": [
      {
         "name": "building_",
         "score": 0.97265625,
         "detail": {
            "landmarks": [
               {
                  "name": "Forbidden City",
                  "confidence": 0.92013400793075562
               }
            ]
         }
      }
]
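Since celebrities and landmarks sit under each category's detail property rather than at the top level, reading them means walking the categories array. A minimal sketch follows; domain_details is a hypothetical helper, and the sample data mirrors the fragment above.

```python
def domain_details(categories, kind):
    """Collect (name, confidence) pairs for 'celebrities' or 'landmarks'.

    Categories without a "detail" property are skipped.
    """
    found = []
    for category in categories:
        detail = category.get("detail") or {}
        for item in detail.get(kind, []):
            found.append((item["name"], item["confidence"]))
    return found


categories = [
    {"name": "abstract_", "score": 0.00390625},
    {
        "name": "building_",
        "score": 0.97265625,
        "detail": {
            "landmarks": [
                {"name": "Forbidden City", "confidence": 0.92013400793075562}
            ]
        },
    },
]

for name, confidence in domain_details(categories, "landmarks"):
    print(f"{name}: {confidence:.2f}")
```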

See also