Describe images with human-readable language

Computer Vision can analyze an image and generate a human-readable sentence that describes its contents. The algorithm returns several descriptions based on different visual features, and each description is given a confidence score. The final output is a list of descriptions ordered from highest to lowest confidence.

Image description example

The following JSON response illustrates what Computer Vision returns when describing the example image based on its visual features.

A black and white photo of buildings in Manhattan

{
    "description": {
        "tags": ["outdoor", "building", "photo", "city", "white", "black", "large", "sitting", "old", "water", "skyscraper", "many", "boat", "river", "group", "street", "people", "field", "tall", "bird", "standing"],
        "captions": [
            {
                "text": "a black and white photo of a city",
                "confidence": 0.95301952483304808
            },
            {
                "text": "a black and white photo of a large city",
                "confidence": 0.94085190563213816
            },
            {
                "text": "a large white building in a city",
                "confidence": 0.93108362931954824
            }
        ]
    },
    "requestId": "b20bfc83-fb25-4b8d-a3f8-b2a1f084b159",
    "metadata": {
        "height": 300,
        "width": 239,
        "format": "Jpeg"
    }
}
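
The response above can be read with any JSON parser. The following is a minimal Python sketch, assuming the response body is available as a string; the helper name `best_caption` is hypothetical and not part of any SDK. It selects the caption with the highest confidence rather than relying on the order of the list.

```python
import json


def best_caption(response_text):
    """Return (text, confidence) for the highest-confidence caption, or None."""
    body = json.loads(response_text)
    # Fall back to empty containers so a response without captions yields None.
    captions = body.get("description", {}).get("captions", [])
    if not captions:
        return None
    top = max(captions, key=lambda caption: caption["confidence"])
    return top["text"], top["confidence"]


# A shortened copy of the sample response, with the captions deliberately
# listed out of order to show that the helper does not depend on ordering.
sample = """{"description": {"tags": ["outdoor", "building"], "captions": [
  {"text": "a black and white photo of a large city", "confidence": 0.94085190563213816},
  {"text": "a black and white photo of a city", "confidence": 0.95301952483304808}
]}}"""

print(best_caption(sample))
```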

Use the API

The image description feature is part of the Analyze Image API. You can call this API through a native SDK or through REST calls. Include Description in the visualFeatures query parameter. Then, when you get the full JSON response, parse it and read the contents of the "description" section.
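
A REST call along these lines might look like the following Python sketch, which uses only the standard library. The URL path `/vision/v3.2/analyze`, the `Ocp-Apim-Subscription-Key` header, and the `{"url": ...}` request body are assumptions not stated in this article, and the function names are hypothetical; check them against the API reference for the version you use.

```python
import json
import urllib.parse
import urllib.request


def build_describe_request(endpoint, key, image_url, api_version="v3.2"):
    """Build a POST request that asks Analyze Image for the Description feature."""
    # visualFeatures=Description is the query parameter named in the text above.
    query = urllib.parse.urlencode({"visualFeatures": "Description"})
    # Assumed path layout; confirm it for your resource and API version.
    url = f"{endpoint.rstrip('/')}/vision/{api_version}/analyze?{query}"
    payload = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # assumed authentication header
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


def describe_image(endpoint, key, image_url):
    """Send the request and return only the "description" section of the response."""
    request = build_describe_request(endpoint, key, image_url)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["description"]
```

Building the request separately from sending it keeps the URL and headers easy to inspect without network access. `describe_image` needs a real endpoint and subscription key to run.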

Next steps

Learn the related concepts of tagging images and categorizing images.