What is Computer Vision?


TLS 1.2 is now enforced for all HTTP requests to this service.

Azure's Computer Vision service gives you access to advanced algorithms that process images and return information based on the visual features you're interested in. For example, Computer Vision can determine whether an image contains adult content, find specific brands or objects, or find human faces.

You can create Computer Vision applications through a client library SDK or by calling the REST API directly. This page gives a broad overview of what you can do with Computer Vision.
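As a rough illustration of calling the REST API directly, the sketch below builds an Analyze Image request without sending it. The `/vision/v3.2/analyze` path, the `visualFeatures` query parameter, and the `Ocp-Apim-Subscription-Key` header reflect the publicly documented REST API as best understood here and should be checked against the current API reference. The endpoint, key, and image URL are placeholders, and `build_analyze_request` is a hypothetical helper name.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_analyze_request(endpoint, key, image_url, features):
    """Build (but do not send) a POST request for the Analyze Image operation.

    Assumptions: API version path 'v3.2' and the header/parameter names below.
    """
    # Keep commas unescaped so the feature list stays readable in the URL.
    query = urlencode({"visualFeatures": ",".join(features)}, safe=",")
    url = f"{endpoint.rstrip('/')}/vision/v3.2/analyze?{query}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # placeholder key supplied by caller
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


# Placeholder values; replace with your own resource endpoint and key.
request = build_analyze_request(
    "https://example.cognitiveservices.azure.com/",
    "YOUR_KEY",
    "https://example.com/photo.jpg",
    ["Tags", "Description"],
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it over HTTPS; that step is left out so the sketch runs without a subscription.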

Computer Vision for digital asset management

Computer Vision can power many digital asset management (DAM) scenarios. DAM is the business process of organizing, storing, and retrieving rich media assets and managing digital rights and permissions. For example, a company may want to group and identify images based on visible logos, faces, objects, colors, and so on. Or, you might want to automatically generate captions for images and attach keywords so they're searchable. For an all-in-one DAM solution using Cognitive Services, Azure Cognitive Search, and intelligent reporting, see the Knowledge Mining Solution Accelerator Guide on GitHub. For other DAM examples, see the Computer Vision Solution Templates repository.

Analyze images for insight

You can analyze images to provide insights about their visual features and characteristics. All of the features listed below are provided by the Analyze Image API. Follow a quickstart to get started.

Tag visual features

Identify and tag visual features in an image, from a set of thousands of recognizable objects, living things, scenery, and actions. When a tag is ambiguous or not common knowledge, the API response provides a hint to clarify the context of the tag. Tagging isn't limited to the main subject, such as a person in the foreground, but also includes the setting (indoor or outdoor), furniture, tools, plants, animals, accessories, gadgets, and so on.
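To make the role of hints concrete, here is a minimal sketch that turns a tagging result into readable labels. The response shape (a `tags` list whose entries carry `name`, `confidence`, and an optional `hint`) is an assumption based on the public REST API, and the sample data is hand-written for illustration rather than real service output. `summarize_tags` is a hypothetical helper.

```python
def summarize_tags(response, min_confidence=0.5):
    """Return readable tag labels, appending the hint when one is present."""
    labels = []
    for tag in response.get("tags", []):
        if tag["confidence"] < min_confidence:
            continue  # drop low-confidence tags
        hint = tag.get("hint")
        labels.append(f"{tag['name']} ({hint})" if hint else tag["name"])
    return labels


# Hypothetical response fragment (assumed field names).
tag_sample = {
    "tags": [
        {"name": "grass", "confidence": 0.99},
        {"name": "ascomycete", "confidence": 0.81, "hint": "fungus"},
        {"name": "indoor", "confidence": 0.12},
    ]
}
print(summarize_tags(tag_sample))
```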

Detect objects

Object detection is similar to tagging, but the API returns the bounding box coordinates for each tag applied. For example, if an image contains a dog, a cat, and a person, the Detect operation lists those objects together with their coordinates in the image. You can use this functionality to further process the relationships between the objects in an image. It also lets you know when there are multiple instances of the same tag in an image.
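The sketch below shows one way an application might consume a detection result: counting instances per label and collecting each bounding box. The field names (`objects`, `object`, `rectangle` with `x`, `y`, `w`, `h`) are assumptions about the response shape, and the sample values are invented for illustration.

```python
from collections import Counter


def summarize_objects(response):
    """Count instances per label and collect each box as (x, y, w, h)."""
    counts = Counter()
    boxes = {}
    for obj in response.get("objects", []):
        rect = obj["rectangle"]
        label = obj["object"]
        counts[label] += 1
        boxes.setdefault(label, []).append(
            (rect["x"], rect["y"], rect["w"], rect["h"])
        )
    return counts, boxes


# Hypothetical detection result (assumed field names, invented values).
object_sample = {
    "objects": [
        {"object": "dog", "confidence": 0.92,
         "rectangle": {"x": 10, "y": 20, "w": 100, "h": 80}},
        {"object": "cat", "confidence": 0.88,
         "rectangle": {"x": 150, "y": 30, "w": 60, "h": 50}},
        {"object": "dog", "confidence": 0.75,
         "rectangle": {"x": 220, "y": 40, "w": 90, "h": 70}},
    ]
}
counts, boxes = summarize_objects(object_sample)
print(counts["dog"], boxes["cat"])
```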

Detect brands

Identify commercial brands in images or videos from a database of thousands of global logos. You can use this feature, for example, to discover which brands are most popular on social media or most prevalent in media product placement.

Categorize an image

Identify and categorize an entire image, using a category taxonomy with parent/child hereditary hierarchies. Categories can be used alone, or with our new tagging models.
Currently, English is the only supported language for tagging and categorizing images.

Describe an image

Generate a description of an entire image in human-readable language, using complete sentences. Computer Vision's algorithms generate various descriptions based on the objects identified in the image. Each description is evaluated and assigned a confidence score. A list is then returned, ordered from highest confidence score to lowest.
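As a small example of working with the ranked descriptions, the sketch below picks the highest-confidence caption. It assumes the result nests captions under `description.captions`, each with `text` and `confidence`; those names and the sample captions are illustrative assumptions. The sort is defensive, since the service is described as already returning the list in descending order.

```python
def best_caption(response):
    """Return the text of the highest-confidence caption, or None if absent."""
    captions = response.get("description", {}).get("captions", [])
    ranked = sorted(captions, key=lambda c: c["confidence"], reverse=True)
    return ranked[0]["text"] if ranked else None


# Hypothetical result fragment (assumed field names, invented captions).
caption_sample = {
    "description": {
        "captions": [
            {"text": "a dog sitting on grass", "confidence": 0.91},
            {"text": "a dog in a field", "confidence": 0.64},
        ]
    }
}
print(best_caption(caption_sample))
```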

Detect faces

Detect faces in an image and provide information about each detected face. Computer Vision returns the coordinates, rectangle, gender, and age for each detected face.
Computer Vision provides a subset of the Face service functionality. You can use the Face service for more detailed analysis, such as facial identification and pose detection.

Detect image types

Detect characteristics of an image, such as whether it is a line drawing or how likely it is to be clip art.

Detect domain-specific content

Use domain models to detect and identify domain-specific content in an image, such as celebrities and landmarks. For example, if an image contains people, Computer Vision can use a domain model for celebrities to determine if the people detected in the image are known celebrities.

Detect the color scheme

Analyze color usage within an image. Computer Vision can determine whether an image is black and white or color and, for color images, identify the dominant and accent colors.

Generate a thumbnail

Analyze the contents of an image to generate an appropriate thumbnail for that image. Computer Vision first generates a high-quality thumbnail and then analyzes the objects within the image to determine the area of interest. Computer Vision then crops the image to fit the requirements of the area of interest. Depending on your needs, the generated thumbnail can be presented using an aspect ratio that differs from that of the original image.

Get the area of interest

Analyze the contents of an image to return the coordinates of the area of interest. Instead of cropping the image and generating a thumbnail, Computer Vision returns the bounding box coordinates of the region, so the calling application can modify the original image as desired.
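To show how a calling application might use the returned coordinates, this sketch crops a small in-memory pixel grid to a bounding box. The box keys `x`, `y`, `w`, and `h` are an assumed shape for the area of interest, and the grid is a stand-in for real image data; an actual application would more likely pass the same numbers to an imaging library.

```python
def crop_to_area(pixels, area):
    """Crop a row-major pixel grid to the box given by x, y, w, h."""
    x, y, w, h = area["x"], area["y"], area["w"], area["h"]
    return [row[x:x + w] for row in pixels[y:y + h]]


# A 4x4 grid whose values encode position (row * 4 + column).
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

# Hypothetical area of interest (assumed key names).
area_of_interest = {"x": 1, "y": 2, "w": 2, "h": 2}
print(crop_to_area(grid, area_of_interest))  # [[9, 10], [13, 14]]
```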

Moderate content in images

You can use Computer Vision to detect adult content in an image and return confidence scores for different classifications. The threshold for flagging content can be set on a sliding scale to accommodate your preferences.
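The sliding-scale threshold can be applied on the client side, as in the minimal sketch below. The score field names `adultScore` and `racyScore` are assumptions about the response, the default thresholds are arbitrary, and `flag_content` is a hypothetical helper.

```python
def flag_content(scores, adult_threshold=0.5, racy_threshold=0.5):
    """Return the classifications whose score meets or exceeds its threshold."""
    flags = []
    if scores["adultScore"] >= adult_threshold:
        flags.append("adult")
    if scores["racyScore"] >= racy_threshold:
        flags.append("racy")
    return flags


# Hypothetical scores (assumed field names, invented values).
moderation_sample = {"adultScore": 0.08, "racyScore": 0.62}

print(flag_content(moderation_sample))                      # default thresholds
print(flag_content(moderation_sample, racy_threshold=0.8))  # more permissive
```

Lowering a threshold flags more content and raising it flags less, so the right value depends on how strict the application needs to be.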

Image requirements

Computer Vision can analyze images that meet the following requirements:

  • The image must be presented in JPEG, PNG, GIF, or BMP format
  • The file size of the image must be less than 4 megabytes (MB)
  • The dimensions of the image must be greater than 50 x 50 pixels
    • For the Read API, the dimensions of the image must be between 50 x 50 and 10000 x 10000 pixels.
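The requirements above can be checked before an image is uploaded. The sketch below is one possible pre-flight check, with two labeled assumptions: 4 MB is taken to mean 4 * 1024 * 1024 bytes, and the Read API range is treated as inclusive. `check_image` is a hypothetical helper, not part of any SDK.

```python
MAX_BYTES = 4 * 1024 * 1024  # assumption: "4 MB" means 4 MiB
ALLOWED_FORMATS = {"JPEG", "PNG", "GIF", "BMP"}


def check_image(fmt, size_bytes, width, height, for_read_api=False):
    """Return a list of requirement violations (empty if the image qualifies)."""
    problems = []
    if fmt.upper() not in ALLOWED_FORMATS:
        problems.append("unsupported format")
    if size_bytes >= MAX_BYTES:
        problems.append("file size must be less than 4 MB")
    if for_read_api:
        # Assumption: both bounds are inclusive for the Read API.
        if not (50 <= width <= 10000 and 50 <= height <= 10000):
            problems.append("dimensions must be between 50 x 50 and 10000 x 10000")
    elif width <= 50 or height <= 50:
        problems.append("dimensions must be greater than 50 x 50")
    return problems


print(check_image("png", 1_000_000, 800, 600))   # []
print(check_image("tiff", 5_000_000, 40, 600))   # three violations
```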

Data privacy and security

As with all of the Cognitive Services, developers using the Computer Vision service should be aware of Microsoft's policies on customer data. See the Cognitive Services page on the Microsoft Trust Center to learn more.

Next steps

Get started with Computer Vision by following the quickstart guide in your preferred development language: