Call the Computer Vision API

This article demonstrates how to call the Computer Vision API by using the REST API. The samples are shown both as raw HTTP POST or GET calls and in C#, using the Computer Vision API client library. The article focuses on:

  • Getting tags, a description, and categories
  • Getting domain-specific information, or "celebrities"

The examples in this article demonstrate the following features:

  • Analyzing an image to return an array of tags and a description
  • Analyzing an image with a domain-specific model (specifically, the "celebrities" model) to return the corresponding result in JSON

The features offer the following options:

  • Option 1: Scoped analysis - Analyze only a specified model
  • Option 2: Enhanced analysis - Analyze to provide additional details by using the 86-category taxonomy

Prerequisites

  • An Azure subscription - Create one for trial
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code later in this article.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • An image URL or a path to a locally stored image
  • Supported input methods: a raw image binary sent as application/octet-stream, or an image URL
  • Supported image file formats: JPEG, PNG, GIF, and BMP
  • Image file size: 4 MB or less
  • Image dimensions: 50 × 50 pixels or greater
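
The file-format and file-size limits above can be checked locally before an image is uploaded. The following Python sketch is illustrative only: the helper name check_image_file is hypothetical, 4 MB is assumed to mean 4 × 1024 × 1024 bytes, and the file extension is used as a stand-in for the real image format. The 50 × 50 pixel minimum is not checked, because that requires decoding the image.

```python
import os

# Limits taken from the prerequisites list; the byte value is an assumption.
MAX_IMAGE_BYTES = 4 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}


def check_image_file(path):
    """Return a list of problems found; an empty list means the file passes."""
    problems = []
    # The extension is only a heuristic for the actual image format.
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file extension: {extension or '(none)'}")
    size = os.path.getsize(path)
    if size > MAX_IMAGE_BYTES:
        problems.append(f"file is {size} bytes; the limit is {MAX_IMAGE_BYTES}")
    return problems
```

A caller might run check_image_file on each path and skip the upload when the returned list is not empty.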

Authorize the API call

Every call to the Computer Vision API requires a subscription key. This key must be either passed through a query string parameter or specified in the request header.

You can pass the subscription key by doing any of the following:

  • Pass it through a query string, as in this Computer Vision API example:

    https://api.cognitive.azure.cn/vision/v2.1/analyze?visualFeatures=Description,Tags&subscription-key=<Your subscription key>
    
  • Specify it in the HTTP request header:

    ocp-apim-subscription-key: <Your subscription key>
    
  • When you use the client library, pass the key through the constructor of ComputerVisionClient, and specify the endpoint in the Endpoint property of the client:

    var visionClient = new ComputerVisionClient(new ApiKeyServiceClientCredentials("<Your subscription key>"))
    {
        Endpoint = "https://api.cognitive.azure.cn"
    };
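
For callers that use the REST API without the client library, the first two options can be sketched in Python as follows. This is a minimal illustration, not part of any SDK: the helper names analyze_url and key_header are hypothetical, and the endpoint is the one used in this article's examples.

```python
from urllib.parse import urlencode

ENDPOINT = "https://api.cognitive.azure.cn"  # endpoint used in this article's examples


def analyze_url(visual_features, key_in_query=None):
    """Build the analyze URL; optionally append the key as a query parameter."""
    params = {"visualFeatures": ",".join(visual_features)}
    if key_in_query is not None:
        params["subscription-key"] = key_in_query
    # safe="," keeps the comma-separated feature list unescaped in the URL.
    return f"{ENDPOINT}/vision/v2.1/analyze?{urlencode(params, safe=',')}"


def key_header(subscription_key):
    """Build the request header that carries the key instead of the URL."""
    return {"Ocp-Apim-Subscription-Key": subscription_key}
```

Sending the key in a header keeps it out of the URL, which may matter where URLs are logged; that is a general observation rather than a requirement stated by the service.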
    

Upload an image to the Computer Vision API service

The basic way to perform the Computer Vision API call is by uploading an image directly to return tags, a description, and celebrities. You do this by sending a POST request with the binary image data in the HTTP body. The upload method is the same for all Computer Vision API calls. The only difference is the query parameters that you specify.

For a specified image, get tags and a description by using either of the following options:

Option 1: Get a list of tags and a description

POST https://api.cognitive.azure.cn/vision/v2.1/analyze?visualFeatures=Description,Tags&subscription-key=<Your subscription key>

using System.IO;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

ImageAnalysis imageAnalysis;
var features = new VisualFeatureTypes[] { VisualFeatureTypes.Tags, VisualFeatureTypes.Description };

using (var fs = new FileStream(@"C:\Vision\Sample.jpg", FileMode.Open))
{
  imageAnalysis = await visionClient.AnalyzeImageInStreamAsync(fs, features);
}
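
The same upload can be expressed as a raw HTTP request. The Python sketch below builds, but does not send, a POST request with the image bytes in the body, using only the standard library. The helper name build_upload_request is hypothetical; the Content-Type value follows the application/octet-stream input method listed in the prerequisites.

```python
import urllib.request


def build_upload_request(url, image_bytes, subscription_key):
    """Create (but do not send) a POST request carrying the raw image bytes."""
    return urllib.request.Request(
        url,
        data=image_bytes,  # the binary image is the entire HTTP body
        method="POST",
        headers={
            "Content-Type": "application/octet-stream",
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
    )
```

To send it, read the image file in binary mode, pass the bytes to build_upload_request together with one of the URLs shown in this section, and hand the result to urllib.request.urlopen. The response body is the JSON document described later in this article.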

Option 2: Get a list of tags only or a description only

For tags only, run:

POST https://api.cognitive.azure.cn/vision/v2.1/tag?subscription-key=<Your subscription key>

var tagResults = await visionClient.TagImageAsync("http://contoso.com/example.jpg");

For a description only, run:

POST https://api.cognitive.azure.cn/vision/v2.1/describe?subscription-key=<Your subscription key>

ImageDescription imageDescription;

using (var fs = new FileStream(@"C:\Vision\Sample.jpg", FileMode.Open))
{
  imageDescription = await visionClient.DescribeImageInStreamAsync(fs);
}

Get domain-specific analysis (celebrities)

Option 1: Scoped analysis - Analyze only a specified model

POST https://api.cognitive.azure.cn/vision/v2.1/models/celebrities/analyze

// imageUrl is a placeholder for the URL of the image to analyze.
var celebritiesResult = await visionClient.AnalyzeImageByDomainAsync("celebrities", imageUrl);

For this option, all other query parameters {visualFeatures, details} are not valid. If you want to see all supported models, use:

GET https://api.cognitive.azure.cn/vision/v2.1/models

var models = await visionClient.ListModelsAsync();

Option 2: Enhanced analysis - Analyze to provide additional details by using the 86-category taxonomy

For applications where you want to get a generic image analysis in addition to details from one or more domain-specific models, extend the v1 API by using the details query parameter.

POST https://api.cognitive.azure.cn/vision/v2.1/analyze?details=celebrities

When you invoke this method, the service first calls the 86-category classifier. If any of the categories matches that of a known or matching model, a second pass of classifier invocations occurs. For example, if "details=all" or "details" includes "celebrities", the service calls the celebrities model after it calls the 86-category classifier. The result includes the category person. In contrast with Option 1, this method increases latency for users who are interested in celebrities.

In this case, all v1 query parameters behave in the same way. If you don't specify visualFeatures=categories, it's implicitly enabled.

Retrieve and understand the JSON output for analysis

Here's an example:

{  
  "tags":[  
    {  
      "name":"outdoor",
      "score":0.976
    },
    {  
      "name":"bird",
      "score":0.95
    }
  ],
  "description":{  
    "tags":[  
      "outdoor",
      "bird"
    ],
    "captions":[  
      {  
        "text":"partridge in a pear tree",
        "confidence":0.96
      }
    ]
  }
}

Field                              Type    Content
tags                               object  The top-level object for an array of tags.
tags[].name                        string  The keyword from the tags classifier.
tags[].score                       number  The confidence score, between 0 and 1.
description                        object  The top-level object for a description.
description.tags[]                 string  The list of tags. If there is insufficient confidence in the ability to produce a caption, the tags might be the only information available to the caller.
description.captions[].text        string  A phrase describing the image.
description.captions[].confidence  number  The confidence score for the phrase.
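
As an illustration of how a caller might read these fields, the following Python sketch picks the highest-confidence caption and the tag names from a response shaped like the example above. The function name summarize_analysis and the min_tag_score parameter are hypothetical. The caption is returned as None when the captions array is missing or empty, which covers the case where only tags are available.

```python
import json


def summarize_analysis(response_text, min_tag_score=0.0):
    """Return (best caption text or None, list of tag names) from an analyze response."""
    result = json.loads(response_text)
    # Keep only the tags whose confidence score reaches the threshold.
    tags = [t["name"] for t in result.get("tags", []) if t["score"] >= min_tag_score]
    captions = result.get("description", {}).get("captions", [])
    # The service may return no caption; fall back to None in that case.
    best = max(captions, key=lambda c: c["confidence"], default=None)
    return (best["text"] if best is not None else None), tags
```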

Retrieve and understand the JSON output of domain-specific models

Option 1: Scoped analysis - Analyze only a specified model

The output is an array of tags, as shown in the following example:

{  
  "result":[  
    {  
      "name":"golden retriever",
      "score":0.98
    },
    {  
      "name":"Labrador retriever",
      "score":0.78
    }
  ]
}

Option 2: Enhanced analysis - Analyze to provide additional details by using the 86-category taxonomy

For domain-specific models using Option 2 (enhanced analysis), the categories return type is extended, as shown in the following example:

{  
  "requestId":"87e44580-925a-49c8-b661-d1c54d1b83b5",
  "metadata":{  
    "width":640,
    "height":430,
    "format":"Jpeg"
  },
  "result":{  
    "celebrities":[  
      {  
        "name":"Richard Nixon",
        "faceRectangle":{  
          "left":107,
          "top":98,
          "width":165,
          "height":165
        },
        "confidence":0.9999827
      }
    ]
  }
}
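
A response shaped like the example above can be reduced to a list of names and face boxes. The Python sketch below is illustrative: the function name list_celebrities and the min_confidence threshold are hypothetical, and the code assumes the result.celebrities layout shown in the example.

```python
import json


def list_celebrities(response_text, min_confidence=0.5):
    """Return (name, (left, top, right, bottom)) for each confident match."""
    body = json.loads(response_text)
    found = []
    for celebrity in body.get("result", {}).get("celebrities", []):
        if celebrity["confidence"] < min_confidence:
            continue
        rect = celebrity["faceRectangle"]
        # Convert left/top/width/height into corner coordinates.
        box = (
            rect["left"],
            rect["top"],
            rect["left"] + rect["width"],
            rect["top"] + rect["height"],
        )
        found.append((celebrity["name"], box))
    return found
```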

The categories field is a list of one or more of the 86 categories in the original taxonomy. Categories that end in an underscore match that category and its children (for example, "people_" or "people_group" for the celebrities model).

Field                Type    Content
categories           object  The top-level object.
categories[].name    string  The name from the 86-category taxonomy list.
categories[].score   number  The confidence score, between 0 and 1.
categories[].detail  object  (Optional) The detail object.

If multiple categories match (for example, the 86-category classifier returns a score for both "people_" and "people_young" when the celebrities model is requested), the details are attached to the most general level match ("people_" in that example).

Error responses

These errors are identical to those in vision.analyze, with the additional NotSupportedModel error (HTTP 400), which might be returned in both the Option 1 and Option 2 scenarios. For Option 2 (enhanced analysis), if any of the models that are specified in the details isn't recognized, the API returns a NotSupportedModel, even if one or more of them are valid. To find out what models are supported, you can call listModels.
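
Because a single unrecognized model fails the whole request, a caller may want to compare the requested models against the listModels result before calling analyze. The Python sketch below shows one way to do that. The function name unsupported_models is hypothetical, and the response shape (a top-level "models" array whose entries have a "name" field) is an assumption, not something this article specifies.

```python
def unsupported_models(requested, list_models_response):
    """Return the requested model names that the service did not list."""
    # Assumed shape: {"models": [{"name": "celebrities", ...}, ...]}
    supported = {model["name"] for model in list_models_response.get("models", [])}
    return [name for name in requested if name not in supported]
```

If the returned list is not empty, dropping those names from the details parameter should avoid the NotSupportedModel error described above.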