Example: How to call the Computer Vision API

This guide demonstrates how to call the Computer Vision API using REST. The samples are written both in C#, using the Computer Vision API client library, and as HTTP POST/GET calls. The guide focuses on:

  • How to get "Tags", "Description", and "Categories".
  • How to get "domain-specific" information (celebrities).

Prerequisites

  • An image URL or a path to a locally stored image.
  • Supported input methods: raw image binary in the form of an application/octet-stream, or an image URL.
  • Supported image formats: JPEG, PNG, GIF, BMP.
  • Image file size: less than 4 MB.
  • Image dimensions: greater than 50 x 50 pixels.
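
Before uploading a local file, it can help to check it against the limits listed above. The following is a minimal sketch, not part of the client library: DetectFormat and CheckImage are hypothetical helpers, 4 MB is assumed to mean 4 x 1024 x 1024 bytes, and the 50 x 50 pixel check is left out because it requires decoding the image.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Size limit taken from the prerequisites: the file must be smaller than 4 MB.
// Assumption: 4 MB is interpreted as 4 * 1024 * 1024 bytes.
const long MaxBytes = 4 * 1024 * 1024;

// Hypothetical helper: identifies the format from the file's leading bytes.
// Returns null when the bytes match none of the four supported formats.
string? DetectFormat(byte[] data)
{
    byte[] png = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };
    if (data.Length >= 3 && data[0] == 0xFF && data[1] == 0xD8 && data[2] == 0xFF) return "JPEG";
    if (data.Length >= 8 && data.Take(8).SequenceEqual(png)) return "PNG";
    if (data.Length >= 4 && data[0] == 'G' && data[1] == 'I' && data[2] == 'F' && data[3] == '8') return "GIF";
    if (data.Length >= 2 && data[0] == 'B' && data[1] == 'M') return "BMP";
    return null;
}

// Hypothetical helper: lists the prerequisite checks that the image fails.
List<string> CheckImage(byte[] data)
{
    var problems = new List<string>();
    if (data.Length >= MaxBytes) problems.Add("File must be smaller than 4 MB.");
    if (DetectFormat(data) is null) problems.Add("Format must be JPEG, PNG, GIF, or BMP.");
    return problems;
}

// Usage: in practice, read the bytes with File.ReadAllBytes(path).
byte[] sample = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A, 0x00 };
Console.WriteLine(DetectFormat(sample));      // PNG
Console.WriteLine(CheckImage(sample).Count);  // 0
```

A check like this only catches obvious problems early; the service remains the authority on whether an image is accepted.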

The examples below demonstrate the following features:

  1. Analyzing an image and getting back an array of tags and a description.
  2. Analyzing an image with a domain-specific model (specifically, the "celebrities" model) and getting the corresponding result in the returned JSON.

The features are broken down into:

  • Option One: Scoped Analysis - Analyze only a given model
  • Option Two: Enhanced Analysis - Analyze to provide additional details with the 86-category taxonomy

Authorize the API call

Every call to the Computer Vision API requires a subscription key. Pass this key either as a query string parameter or in the request header.

To subscribe to Computer Vision and get your key, follow the instructions in Create a Cognitive Services account.

  1. To pass the subscription key through the query string, append it as the subscription-key parameter, as in this Computer Vision API example:

    https://api.cognitive.azure.cn/vision/v2.0/analyze?visualFeatures=Description,Tags&subscription-key=<Your subscription key>

  2. To pass the subscription key in the HTTP request header instead, specify:

    ocp-apim-subscription-key: <Your subscription key>

  3. When using the client library, pass the subscription key through the constructor of VisionServiceClient:

    var visionClient = new VisionServiceClient("Your subscriptionKey");
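
If you call the REST endpoint without the client library, the header approach can be sketched with HttpClient types as follows. This is an illustrative sketch rather than the library's implementation: BuildRequest is a hypothetical helper, the key and image URL are placeholders, and the JSON body of the form {"url": "..."} is the assumed way to reference an image by URL. The request is only built here, not sent.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;

// Endpoint copied from the query-string example above, without the key.
const string AnalyzeUrl =
    "https://api.cognitive.azure.cn/vision/v2.0/analyze?visualFeatures=Description,Tags";

// Hypothetical helper: builds a POST request that carries the key in the
// request header and references the image by URL in a JSON body.
HttpRequestMessage BuildRequest(string subscriptionKey, string imageUrl)
{
    var message = new HttpRequestMessage(HttpMethod.Post, AnalyzeUrl);
    message.Headers.TryAddWithoutValidation("Ocp-Apim-Subscription-Key", subscriptionKey);
    var body = JsonSerializer.Serialize(new { url = imageUrl });
    message.Content = new StringContent(body, Encoding.UTF8, "application/json");
    return message;
}

// Usage: build the request and inspect it before sending.
var request = BuildRequest("<Your subscription key>", "http://contoso.com/example.jpg");
Console.WriteLine(request.RequestUri);
Console.WriteLine(request.Headers.Contains("Ocp-Apim-Subscription-Key")); // True

// To send it:
//   using var http = new HttpClient();
//   var response = await http.SendAsync(request);
```

Keeping the key in the header rather than the query string keeps it out of URLs that may end up in logs.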

Upload an image to the Computer Vision API service and get back tags, descriptions, and celebrities

The basic way to perform a Computer Vision API call is to upload an image directly. To do this, send a "POST" request with the application/octet-stream content type, together with the data read from the image. For "Tags" and "Description", this upload method is the same for all Computer Vision API calls. The only difference is the query parameters you specify.

Here's how to get "Tags" and "Description" for a given image:

Option One: Get a list of "Tags" and one "Description"

POST https://api.cognitive.azure.cn/vision/v2.0/analyze?visualFeatures=Description,Tags&subscription-key=<Your subscription key>

using Microsoft.ProjectOxford.Vision;
using Microsoft.ProjectOxford.Vision.Contract;
using System.IO;

// visionClient is the VisionServiceClient created in the authorization step.
AnalysisResult analysisResult;
var features = new VisualFeature[] { VisualFeature.Tags, VisualFeature.Description };

// Open the local image read-only and upload it for analysis.
using (var fs = new FileStream(@"C:\Vision\Sample.jpg", FileMode.Open, FileAccess.Read))
{
  analysisResult = await visionClient.AnalyzeImageAsync(fs, features);
}
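
For comparison, the raw upload described above (a POST with the application/octet-stream content type) could look roughly like this without the client library. It is a sketch under assumptions: BuildUploadRequest and AnalyzeFileAsync are hypothetical helpers, and the key is passed in the request header. Only the request construction runs in the usage lines; the network call is left as a commented example.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

const string Endpoint =
    "https://api.cognitive.azure.cn/vision/v2.0/analyze?visualFeatures=Description,Tags";

// Hypothetical helper: wraps raw image bytes in an octet-stream POST request.
HttpRequestMessage BuildUploadRequest(string endpoint, string subscriptionKey, byte[] imageBytes)
{
    var message = new HttpRequestMessage(HttpMethod.Post, endpoint);
    message.Headers.TryAddWithoutValidation("Ocp-Apim-Subscription-Key", subscriptionKey);
    message.Content = new ByteArrayContent(imageBytes);
    message.Content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
    return message;
}

// Hypothetical helper: reads a local file, uploads it, and returns the JSON text.
async Task<string> AnalyzeFileAsync(HttpClient http, string path, string subscriptionKey)
{
    var bytes = await File.ReadAllBytesAsync(path);
    using var message = BuildUploadRequest(Endpoint, subscriptionKey, bytes);
    using var response = await http.SendAsync(message);
    response.EnsureSuccessStatusCode();
    return await response.Content.ReadAsStringAsync();
}

// Usage: build a request from placeholder bytes and inspect its content type.
using var upload = BuildUploadRequest(Endpoint, "<Your subscription key>", new byte[] { 0xFF, 0xD8, 0xFF });
Console.WriteLine(upload.Content?.Headers.ContentType); // application/octet-stream

// To call the service:
//   using var http = new HttpClient();
//   var json = await AnalyzeFileAsync(http, @"C:\Vision\Sample.jpg", "<Your subscription key>");
```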

Option Two: Get a list of "Tags" only, or a "Description" only:

Tags only:

POST https://api.cognitive.azure.cn/vision/v2.0/tag?subscription-key=<Your subscription key>

var analysisResult = await visionClient.GetTagsAsync("http://contoso.com/example.jpg");

Description only:

POST https://api.cognitive.azure.cn/vision/v2.0/describe?subscription-key=<Your subscription key>

// analysisResult is declared as in Option One.
using (var fs = new FileStream(@"C:\Vision\Sample.jpg", FileMode.Open, FileAccess.Read))
{
  analysisResult = await visionClient.DescribeAsync(fs);
}

Get domain-specific analysis (celebrities)

Option One: Scoped Analysis - Analyze only a given model

POST https://api.cognitive.azure.cn/vision/v2.0/models/celebrities/analyze

// url is the address of the image to analyze, for example "http://contoso.com/example.jpg".
var celebritiesResult = await visionClient.AnalyzeImageInDomainAsync(url, "celebrities");

For this option, all other query parameters {visualFeatures, details} are not valid. To see all supported models, use:

GET https://api.cognitive.azure.cn/vision/v2.0/models

var models = await visionClient.ListModelsAsync();

Option Two: Enhanced Analysis - Analyze to provide additional details with the 86-category taxonomy

For applications where you want generic image analysis in addition to details from one or more domain-specific models, the analyze API is extended with the details query parameter.

POST https://api.cognitive.azure.cn/vision/v2.0/analyze?details=celebrities

When this method is invoked, the 86-category classifier is called first. If any of the returned categories matches that of a known model, a second pass of classifier invocations occurs. For example, if "details=all", or "details" includes "celebrities", the celebrities model is called after the 86-category classifier, provided the result includes the person category. Compared to Option One, this increases latency for users who are interested only in celebrities.

All other query parameters of the analyze call behave the same in this case. If visualFeatures=categories is not specified, it is implicitly enabled.
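
The query string for an enhanced analysis call can be assembled as in the sketch below. BuildAnalyzeUrl is a hypothetical helper, not part of the client library; it joins the visualFeatures and details values with commas, as in the URLs shown in this guide.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: builds the analyze URL from the requested visual
// features and domain-specific models. Empty lists are simply omitted.
string BuildAnalyzeUrl(IEnumerable<string> visualFeatures, IEnumerable<string> details)
{
    const string baseUrl = "https://api.cognitive.azure.cn/vision/v2.0/analyze";
    var query = new List<string>();

    var features = visualFeatures.Select(f => Uri.EscapeDataString(f)).ToList();
    if (features.Count > 0) query.Add("visualFeatures=" + string.Join(",", features));

    var models = details.Select(d => Uri.EscapeDataString(d)).ToList();
    if (models.Count > 0) query.Add("details=" + string.Join(",", models));

    return query.Count == 0 ? baseUrl : baseUrl + "?" + string.Join("&", query);
}

// Usage: tags and description plus celebrity details.
Console.WriteLine(BuildAnalyzeUrl(new[] { "Description", "Tags" }, new[] { "celebrities" }));
// prints https://api.cognitive.azure.cn/vision/v2.0/analyze?visualFeatures=Description,Tags&details=celebrities
```

Because categories are enabled implicitly when details is present, the helper does not add visualFeatures=Categories on its own.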

Retrieve and understand the JSON output for analysis

Here's an example:

{  
  "tags":[  
    {  
      "name":"outdoor",
      "score":0.976
    },
    {  
      "name":"bird",
      "score":0.95
    }
  ],
  "description":{  
    "tags":[  
      "outdoor",
      "bird"
    ],
    "captions":[  
      {  
        "text":"partridge in a pear tree",
        "confidence":0.96
      }
    ]
  }
}

Field                               Type    Content
tags                                object  Top-level object for the array of tags.
tags[].name                         string  Keyword from the tags classifier.
tags[].score                        number  Confidence score, between 0 and 1.
description                         object  Top-level object for a description.
description.tags[]                  string  List of tags. If there is insufficient confidence to produce a caption, the tags may be the only information available to the caller.
description.captions[].text         string  A phrase describing the image.
description.captions[].confidence   number  Confidence for the phrase.
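
If you work with the raw JSON rather than the client library's AnalysisResult type, the fields above can be read with System.Text.Json. The sketch below parses the sample response; Summarize is a hypothetical helper that falls back to the description tags when no caption is returned.

```csharp
using System;
using System.Linq;
using System.Text.Json;

// The sample response from this section.
const string json = @"{
  ""tags"": [ { ""name"": ""outdoor"", ""score"": 0.976 }, { ""name"": ""bird"", ""score"": 0.95 } ],
  ""description"": {
    ""tags"": [ ""outdoor"", ""bird"" ],
    ""captions"": [ { ""text"": ""partridge in a pear tree"", ""confidence"": 0.96 } ]
  }
}";

// Hypothetical helper: returns the first caption, or the comma-separated
// description tags when the captions array is empty.
string Summarize(JsonElement root)
{
    var description = root.GetProperty("description");
    var captions = description.GetProperty("captions");
    if (captions.GetArrayLength() > 0)
    {
        return captions[0].GetProperty("text").GetString() ?? "";
    }
    var tags = description.GetProperty("tags").EnumerateArray().Select(t => t.GetString() ?? "");
    return string.Join(", ", tags);
}

using var doc = JsonDocument.Parse(json);

// List each tag with its confidence score.
foreach (var tag in doc.RootElement.GetProperty("tags").EnumerateArray())
{
    var name = tag.GetProperty("name").GetString();
    var score = tag.GetProperty("score").GetDouble();
    Console.WriteLine(name + ": " + score);
}

Console.WriteLine(Summarize(doc.RootElement)); // partridge in a pear tree
```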

Retrieve and understand the JSON output of domain-specific models

Option One: Scoped Analysis - Analyze only a given model

The output is an array of tags, as in this example:

{  
  "result":[  
    {  
      "name":"golden retriever",
      "score":0.98
    },
    {  
      "name":"Labrador retriever",
      "score":0.78
    }
  ]
}

Option Two: Enhanced Analysis - Analyze to provide additional details with the 86-category taxonomy

For domain-specific models using Option Two (Enhanced Analysis), the categories return type is extended. An example follows:

{  
  "requestId":"87e44580-925a-49c8-b661-d1c54d1b83b5",
  "metadata":{  
    "width":640,
    "height":430,
    "format":"Jpeg"
  },
  "result":{  
    "celebrities":[  
      {  
        "name":"Richard Nixon",
        "faceRectangle":{  
          "left":107,
          "top":98,
          "width":165,
          "height":165
        },
        "confidence":0.9999827
      }
    ]
  }
}
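
The celebrity entries in a response shaped like the sample above can be extracted as follows. This is a sketch, assuming the result object carries a celebrities array exactly as shown; ReadCelebrities and the confidence threshold are hypothetical and not part of the API.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: returns the recognized names whose confidence is at
// least minConfidence. Responses without a celebrities array yield an empty list.
List<(string Name, double Confidence)> ReadCelebrities(string json, double minConfidence)
{
    var found = new List<(string Name, double Confidence)>();
    using var doc = JsonDocument.Parse(json);
    if (doc.RootElement.TryGetProperty("result", out var result) &&
        result.ValueKind == JsonValueKind.Object &&
        result.TryGetProperty("celebrities", out var celebrities))
    {
        foreach (var celebrity in celebrities.EnumerateArray())
        {
            var confidence = celebrity.GetProperty("confidence").GetDouble();
            if (confidence >= minConfidence)
            {
                found.Add((celebrity.GetProperty("name").GetString() ?? "", confidence));
            }
        }
    }
    return found;
}

// Usage with a trimmed copy of the sample response.
const string sampleJson = @"{
  ""result"": {
    ""celebrities"": [
      { ""name"": ""Richard Nixon"",
        ""faceRectangle"": { ""left"": 107, ""top"": 98, ""width"": 165, ""height"": 165 },
        ""confidence"": 0.9999827 }
    ]
  }
}";

foreach (var (name, confidence) in ReadCelebrities(sampleJson, 0.5))
{
    Console.WriteLine(name); // Richard Nixon
}
```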

The categories field is a list of one or more of the 86 categories in the original taxonomy. Note also that categories ending in an underscore match that category and its children (for example, people_ as well as people_group, for the celebrities model).

Field                 Type     Content
categories            object   Top-level object
categories[].name     string   Name from the 86-category taxonomy
categories[].score    number   Confidence score, between 0 and 1
categories[].detail   object?  Optional detail object

Note that if multiple categories match (for example, the 86-category classifier returns a score for both people_ and people_young when the celebrities model is requested), the details are attached to the most general-level match (people_ in that example).
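
The matching rule can be illustrated with a short sketch. CategoryMatches and PickDetailCategory are hypothetical helpers, and "most general" is approximated here as the shortest matching category name, which is an assumption rather than the service's documented logic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A model category ending in an underscore matches itself and its children;
// any other model category must match exactly.
bool CategoryMatches(string modelCategory, string category) =>
    modelCategory.EndsWith("_", StringComparison.Ordinal)
        ? category.StartsWith(modelCategory, StringComparison.Ordinal)
        : string.Equals(category, modelCategory, StringComparison.Ordinal);

// Hypothetical helper: among the returned categories that match the model,
// picks the one expected to carry the detail object.
// Assumption: the shortest name stands in for the "most general" match.
string? PickDetailCategory(IEnumerable<string> returned, IEnumerable<string> modelCategories)
{
    var modelList = modelCategories.ToList();
    return returned
        .Where(c => modelList.Any(m => CategoryMatches(m, c)))
        .OrderBy(c => c.Length)
        .FirstOrDefault();
}

// Usage: the classifier returned both people_young and people_.
var returned = new[] { "people_young", "people_", "outdoor_" };
Console.WriteLine(PickDetailCategory(returned, new[] { "people_" })); // people_
```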

Error responses

These are identical to vision.analyze, with the additional NotSupportedModel error (HTTP 400), which may be returned in both Option One and Option Two scenarios. For Option Two (Enhanced Analysis), if any of the models specified in details is not recognized, the API returns NotSupportedModel, even if one or more of them are valid. You can call listModels to find out which models are supported.
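
Because a single unrecognized model fails the whole request, you may want to check the details list against the supported model names before calling the API. The sketch below assumes you have already obtained those names from listModels. FindUnsupportedModels is a hypothetical helper, the exact (case-sensitive) comparison is an assumption, and "unknown-model" is a made-up name for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns the requested model names that are not in the
// supported set. Assumption: names are compared exactly, including case.
List<string> FindUnsupportedModels(IEnumerable<string> requested, IEnumerable<string> supported)
{
    var known = new HashSet<string>(supported, StringComparer.Ordinal);
    return requested.Where(model => !known.Contains(model)).ToList();
}

// Usage: supportedModels would normally come from a listModels call.
var supportedModels = new[] { "celebrities" };
var requestedModels = new[] { "celebrities", "unknown-model" };

var unsupported = FindUnsupportedModels(requestedModels, supportedModels);
if (unsupported.Count > 0)
{
    // Sending this request would be expected to fail with NotSupportedModel.
    Console.WriteLine("Unsupported: " + string.Join(", ", unsupported)); // Unsupported: unknown-model
}
```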

Next steps

To use the REST API, go to the Computer Vision API Reference.