What's new in Azure AI Vision
Learn what's new in Azure AI Vision. Check this page to stay up to date with new features, enhancements, fixes, and documentation updates.
September 2022
Azure AI Vision 3.0/3.1 Read previews deprecation
The preview versions of the Azure AI Vision 3.0 and 3.1 Read API are scheduled to be retired on January 31, 2023. Customers are encouraged to refer to the how-to guides and quickstarts to get started with the generally available (GA) version of the Read API instead. The latest GA versions provide the following benefits:
- The latest generally available OCR model (2022)
- Significant expansion of OCR language coverage including support for handwritten text
- Improved OCR quality
June 2022
Responsible AI for Face
Face transparency note
- The transparency note provides guidance to help our customers improve the accuracy and fairness of their systems by incorporating meaningful human review to detect and resolve cases of misidentification or other failures, by providing support to people who believe their results were incorrect, and by identifying and addressing fluctuations in accuracy due to variations in operational conditions.
Retirement of sensitive attributes
- We have retired facial analysis capabilities that purport to infer emotional states and identity attributes, such as gender, age, smile, facial hair, hair, and makeup.
- Facial detection capabilities (including detecting blur, exposure, glasses, headPose, landmarks, noise, occlusion, and the facial bounding box) will remain generally available and don't require an application.
Fairlearn package and Microsoft's Fairness Dashboard
- The open-source Fairlearn package and Microsoft's Fairness Dashboard aim to help customers measure the fairness of Microsoft's facial verification algorithms on their own data, allowing them to identify and address potential fairness issues that could affect different demographic groups before they deploy their technology.
Azure AI Vision 3.2-preview deprecation
The preview versions of the 3.2 API are scheduled to be retired in December of 2022. Customers are encouraged to use the generally available (GA) version of the API instead. Mind the following changes when migrating from the 3.2-preview versions:
- The Analyze Image and Read API calls now take an optional `model-version` parameter that you can use to specify which AI model to use. By default, they use the latest model.
- The Analyze Image and Read API calls also return a `model-version` field in successful API responses. This field reports which model was used.
- Image Analysis APIs now use a different error-reporting format. See the API reference documentation to learn how to adjust any error-handling code.
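As a sketch, the optional parameter can be passed as a query string value on the GA endpoint. The resource endpoint below and the dated version string are placeholder assumptions, not values from this article:

```python
from urllib.parse import urlencode

# Build a Read analyze URL that pins a specific model version.
# The resource endpoint is a placeholder; "latest" is the documented default.
def build_read_url(endpoint: str, model_version: str = "latest") -> str:
    query = urlencode({"model-version": model_version})
    return f"{endpoint}/vision/v3.2/read/analyze?{query}"

print(build_read_url("https://<your-resource>.cognitiveservices.azure.com"))
```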
May 2022
OCR (Read) API model is generally available (GA)
The latest model of Azure AI Vision's OCR (Read) API, with 164 supported languages, is now generally available as a cloud service and container.
- OCR support for print text expands to 164 languages including Russian, Arabic, Hindi and other languages using Cyrillic, Arabic, and Devanagari scripts.
- OCR support for handwritten text expands to 9 languages: English, Chinese Simplified, French, German, Italian, Japanese, Korean, Portuguese, and Spanish.
- Enhanced support for single characters, handwritten dates, amounts, names, and other entities commonly found in receipts and invoices.
- Improved processing of digital PDF documents.
- Input file size limit increased 10x to 500 MB.
- Performance and latency improvements.
See the OCR how-to guide to learn how to use the GA model.
February 2022
OCR (Read) API Public Preview supports 164 languages
Azure AI Vision's OCR (Read) API expands supported languages to 164 with its latest preview:
- OCR support for print text expands to 42 new languages including Arabic, Hindi, and other languages using Arabic and Devanagari scripts.
- OCR support for handwritten text expands to Japanese and Korean in addition to English, Chinese Simplified, French, German, Italian, Portuguese, and Spanish.
- Enhancements including better support for extracting handwritten dates, amounts, names, and single character boxes.
- General performance and AI quality improvements
See the OCR how-to guide to learn how to use the new preview features.
September 2021
OCR (Read) API Public Preview supports 122 languages
Azure AI Vision's OCR (Read) API expands supported languages to 122 with its latest preview:
- OCR support for print text in 49 new languages, including Russian, Bulgarian, and other Cyrillic-script languages, as well as more Latin-script languages.
- OCR support for handwritten text in 6 new languages that include English, Chinese Simplified, French, German, Italian, Portuguese, and Spanish.
- Enhancements for processing digital PDFs and Machine Readable Zone (MRZ) text in identity documents.
- General performance and AI quality improvements
See the OCR how-to guide to learn how to use the new preview features.
August 2021
Image tagging language expansion
The latest version (v3.2) of the Image tagger now supports tags in 50 languages. See the language support page for more information.
July 2021
New HeadPose and Landmarks improvements for Detection_03
- The Detection_03 model has been updated to support facial landmarks.
- The landmarks feature in Detection_03 is much more precise, especially in the eyeball landmarks, which are crucial for gaze tracking.
May 2021
Spatial Analysis container update
A new version of the Spatial Analysis container has been released with a new feature set. This Docker container lets you analyze real-time streaming video to understand spatial relationships between people and their movement through physical environments.
Spatial Analysis operations can now be configured to detect the orientation that a person is facing.
- An orientation classifier can be enabled for the `personcrossingline` and `personcrossingpolygon` operations by configuring the `enable_orientation` parameter. It is set to off by default.
Spatial Analysis operations now also offer configuration to detect a person's speed while walking or running.
- Speed can be detected for the `personcrossingline` and `personcrossingpolygon` operations by turning on the `enable_speed` classifier, which is off by default. The output is reflected in the `speed`, `avgSpeed`, and `minSpeed` outputs.
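As a hedged illustration, the two classifiers are toggled in the operation's parameter JSON. The field names follow the bullets above, while the video URL and surrounding structure are placeholders, not a complete Spatial Analysis configuration:

```python
import json

# Minimal sketch of operation parameters for personcrossingline with the
# orientation and speed classifiers enabled (both are off by default).
# VIDEO_URL is a placeholder; a real deployment needs the full configuration
# described in the Spatial Analysis operations documentation.
operation_parameters = {
    "VIDEO_URL": "rtsp://<camera-address>",
    "enable_orientation": True,
    "enable_speed": True,
}
print(json.dumps(operation_parameters, indent=2))
```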
April 2021
Azure AI Vision v3.2 GA
The Azure AI Vision API v3.2 is now generally available with the following updates:
- Improved image tagging model: analyzes visual content and generates relevant tags based on objects, actions, and content displayed in the image. This model is available through the Tag Image API. See the Image Analysis how-to guide and overview to learn more.
- Updated content moderation model: detects presence of adult content and provides flags to filter images containing adult, racy, and gory visual content. This model is available through the Analyze API. See the Image Analysis how-to guide and overview to learn more.
- OCR (Read) available for 73 languages including Simplified and Traditional Chinese, Japanese, Korean, and Latin languages.
March 2021
Azure AI Vision 3.2 Public Preview update
The Azure AI Vision API v3.2 public preview has been updated. The preview release has all Azure AI Vision features along with updated Read and Analyze APIs.
February 2021
Read API v3.2 Public Preview with OCR support for 73 languages
The Azure AI Vision Read API v3.2 public preview, available as cloud service and Docker container, includes these updates:
- OCR for 73 languages including Simplified and Traditional Chinese, Japanese, Korean, and Latin languages.
- Natural reading order for the text line output (Latin languages only)
- Handwriting style classification for text lines along with a confidence score (Latin languages only).
- Extract text from only selected pages of a multi-page document.
See the Read API how-to guide to learn more.
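A sketch of how these options surface as query parameters on the preview Read call; the endpoint, API path version, and parameter values are illustrative assumptions:

```python
from urllib.parse import urlencode

# Query parameters exposing the preview capabilities listed above.
params = {
    "language": "en",           # OCR language to use
    "readingOrder": "natural",  # natural reading order (Latin languages only)
    "pages": "1,3-5",           # restrict extraction to selected pages
}
url = ("https://<your-resource>.cognitiveservices.azure.com"
       "/vision/v3.2-preview.3/read/analyze?" + urlencode(params))
print(url)
```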
January 2021
Spatial Analysis container update
A new version of the Spatial Analysis container has been released with a new feature set. This Docker container lets you analyze real-time streaming video to understand spatial relationships between people and their movement through physical environments.
- Spatial Analysis operations can now be configured to detect if a person is wearing a protective face covering such as a mask.
- A mask classifier can be enabled for the `personcount`, `personcrossingline`, and `personcrossingpolygon` operations by configuring the `ENABLE_FACE_MASK_CLASSIFIER` parameter.
- The attributes `face_mask` and `face_noMask` are returned as metadata, with a confidence score, for each person detected in the video stream.
- The `personcrossingpolygon` operation has been extended to allow the calculation of the dwell time a person spends in a zone. Set the `type` parameter in the Zone configuration for the operation to `zonedwelltime`, and a new event of type `personZoneDwellTimeEvent` will include the `durationMs` field populated with the number of milliseconds that the person spent in the zone.
- Breaking change: The `personZoneEvent` event has been renamed to `personZoneEnterExitEvent`. This event is raised by the `personcrossingpolygon` operation when a person enters or exits the zone and provides directional info with the numbered side of the zone that was crossed.
- Video URL can be provided as "Private Parameter/obfuscated" in all operations. Obfuscation is now optional and works only if `KEY` and `IV` are provided as environment variables.
- Calibration is enabled by default for all operations. Set `do_calibration: false` to disable it.
- Added support for automatic recalibration (disabled by default) via the `enable_recalibration` parameter. Refer to Spatial Analysis operations for details.
- Added camera calibration parameters to the `DETECTOR_NODE_CONFIG`. Refer to Spatial Analysis operations for details.
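As a sketch, a dwell-time zone could be declared like this; the zone name and polygon coordinates are placeholders, and only the fields named above are shown:

```python
# Minimal sketch of a zone configuration that requests dwell-time events.
# Setting type to "zonedwelltime" makes the personcrossingpolygon operation
# raise personZoneDwellTimeEvent with a durationMs field.
zone_config = {
    "zones": [{
        "name": "lobby",  # placeholder zone name
        "polygon": [[0.3, 0.3], [0.3, 0.9], [0.8, 0.9], [0.8, 0.3]],
        "events": [{"type": "zonedwelltime"}],
    }]
}
print(zone_config["zones"][0]["events"][0]["type"])
```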
December 2020
Customer configuration for Face ID storage
- While the Face Service does not store customer images, the extracted face features are stored on the server. The Face ID is an identifier of the face feature and is used in Face - Identify, Face - Verify, and Face - Find Similar. The stored face features expire and are deleted 24 hours after the original detection call. Customers can now determine the length of time these Face IDs are cached. The maximum value is still up to 24 hours, but a minimum value of 60 seconds can now be set. The new time range for caching Face IDs is any value between 60 seconds and 24 hours. More details can be found in the Face - Detect API reference (the `faceIdTimeToLive` parameter).
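A sketch of passing the parameter on a detection call; the endpoint is a placeholder, and 60 seconds is the minimum described above:

```python
from urllib.parse import urlencode

# Request the shortest allowed Face ID cache window (60 seconds; the
# maximum remains 24 hours, i.e. 86400 seconds).
params = {"returnFaceId": "true", "faceIdTimeToLive": 60}
url = ("https://<your-resource>.cognitiveservices.azure.com"
       "/face/v1.0/detect?" + urlencode(params))
print(url)
```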
October 2020
Azure AI Vision API v3.1 GA
The Azure AI Vision API in General Availability has been upgraded to v3.1.
September 2020
Spatial Analysis container preview
The Spatial Analysis container is now in preview. The Spatial Analysis feature of Azure AI Vision lets you analyze real-time streaming video to understand spatial relationships between people and their movement through physical environments. Spatial Analysis is a Docker container you can use on-premises.
Read API v3.1 Public Preview adds OCR for Japanese
The Azure AI Vision Read API v3.1 public preview adds these capabilities:
- OCR for the Japanese language
- For each text line, an indication of whether the appearance is handwriting or print style, along with a confidence score (Latin languages only).
- For a multi-page document, extract text from only selected pages or a page range.
This preview version of the Read API supports English, Dutch, French, German, Italian, Japanese, Portuguese, Simplified Chinese, and Spanish languages.
See the Read API how-to guide to learn more.
July 2020
Read API v3.1 Public Preview with OCR for Simplified Chinese
The Azure AI Vision Read API v3.1 public preview adds support for Simplified Chinese.
- This preview version of the Read API supports English, Dutch, French, German, Italian, Portuguese, Simplified Chinese, and Spanish languages.
See the Read API how-to guide to learn more.
May 2020
Azure AI Vision API v3.0 entered General Availability, with updates to the Read API:
- Support for English, Dutch, French, German, Italian, Portuguese, and Spanish
- Improved accuracy
- Confidence score for each extracted word
- New output format
See the OCR overview to learn more.
March 2020
- TLS 1.2 is now enforced for all HTTP requests to this service. For more information, see Azure AI services security.
January 2020
Read API 3.0 Public Preview
You now can use version 3.0 of the Read API to extract printed or handwritten text from images. Compared to earlier versions, 3.0 provides:
- Improved accuracy
- New output format
- Confidence score for each extracted word
- Support for both Spanish and English languages with the language parameter
Follow an Extract text quickstart to get started using the 3.0 API.
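To illustrate the new output format and per-word confidence scores, here is a sketch that walks a result payload; the sample response below is fabricated for illustration, not actual API output:

```python
# Collect (text, confidence) pairs from a v3.0 Read result.
def words_with_confidence(result: dict):
    pairs = []
    for page in result["analyzeResult"]["readResults"]:
        for line in page.get("lines", []):
            for word in line.get("words", []):
                pairs.append((word["text"], word["confidence"]))
    return pairs

# Illustrative (not real) response in the v3.0 shape.
sample = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [{
            "page": 1,
            "lines": [{
                "text": "Hello world",
                "words": [
                    {"text": "Hello", "confidence": 0.99},
                    {"text": "world", "confidence": 0.98},
                ],
            }],
        }],
    },
}
print(words_with_confidence(sample))  # → [('Hello', 0.99), ('world', 0.98)]
```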
April 2019
Improved attribute accuracy
- Improved overall accuracy of the `age` and `headPose` attributes. The `headPose` attribute is also updated, with the `pitch` value now enabled. Use these attributes by specifying them in the `returnFaceAttributes` parameter of Face - Detect.
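A sketch of requesting these attributes and reading the now-populated pitch value; the endpoint is a placeholder and the response slice is fabricated for illustration:

```python
from urllib.parse import urlencode

# Ask Face - Detect for the improved attributes.
params = {"returnFaceAttributes": "age,headPose"}
url = ("https://<your-resource>.cognitiveservices.azure.com"
       "/face/v1.0/detect?" + urlencode(params))

# Illustrative (not real) slice of a response; pitch is now populated.
face = {"faceAttributes": {"age": 31.0,
                           "headPose": {"pitch": -3.2, "roll": 1.1, "yaw": 14.7}}}
print(face["faceAttributes"]["headPose"]["pitch"])  # → -3.2
```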
Improved processing speeds
- Improved speeds of Face - Detect, FaceList - Add Face, LargeFaceList - Add Face, PersonGroup Person - Add Face and LargePersonGroup Person - Add Face operations.
January 2019
Face Snapshot feature
- The new Snapshot feature allows the service to support data migration across subscriptions.
Important
As of June 30, 2023, the Face Snapshot API is retired.
October 2018
API messages
- Refined the descriptions of `status`, `createdDateTime`, `lastActionDateTime`, and `lastSuccessfulTrainingDateTime` in PersonGroup - Get Training Status, LargePersonGroup - Get Training Status, and LargeFaceList - Get Training Status.
May 2018
Improved attribute accuracy
- Improved the `gender` attribute significantly, and also improved the `age`, `glasses`, `facialHair`, `hair`, and `makeup` attributes. Use them through the `returnFaceAttributes` parameter of Face - Detect.
Increased file size limit
- Increased input image file size limit from 4 MB to 6 MB in Face - Detect, FaceList - Add Face, LargeFaceList - Add Face, PersonGroup Person - Add Face and LargePersonGroup Person - Add Face.
May 2017
New detectable Face attributes
- Added `hair`, `makeup`, `accessory`, `occlusion`, `blur`, `exposure`, and `noise` attributes to the `returnFaceAttributes` parameter of Face - Detect.
- Supported 10K persons in a PersonGroup and Face - Identify.
- Supported pagination in PersonGroup Person - List with optional parameters: `start` and `top`.
- Supported concurrency in adding/deleting faces against different FaceLists and different persons in a PersonGroup.
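As a sketch, paging could be driven by the start and top parameters; the function name, endpoint, and group ID are illustrative assumptions:

```python
from urllib.parse import urlencode

# Build a PersonGroup Person - List URL for one page of results.
# start is the point to resume from (empty for the first page);
# top caps the number of persons returned per call.
def person_list_url(endpoint: str, group_id: str,
                    start: str = "", top: int = 100) -> str:
    query = urlencode({"start": start, "top": top})
    return f"{endpoint}/face/v1.0/persongroups/{group_id}/persons?{query}"

print(person_list_url("https://<your-resource>.cognitiveservices.azure.com",
                      "<person-group-id>"))
```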
March 2017
New detectable Face attribute
- Added the `emotion` attribute to the `returnFaceAttributes` parameter of Face - Detect.
Fixed issues
- Fixed an issue where a face could not be re-detected with the rectangle returned from Face - Detect as `targetFace` in FaceList - Add Face and PersonGroup Person - Add Face.
- The detectable face size is set to ensure it is strictly between 36x36 and 4096x4096 pixels.
November 2016
New subscription tier
- Added Face Storage Standard subscription to store additional persisted faces when using PersonGroup Person - Add Face or FaceList - Add Face for identification or similarity matching. The stored images are charged at $0.5 per 1000 faces and this rate is prorated on a daily basis. Free tier subscriptions continue to be limited to 1,000 total persons.
October 2016
API messages
- Changed the error message for more than one face in the `targetFace` from 'There are more than one face in the image' to 'There is more than one face in the image' in FaceList - Add Face and PersonGroup Person - Add Face.
July 2016
New features
- Supported Face to Person object authentication in Face - Verify.
- Added an optional `mode` parameter to Face - Find Similar, enabling selection of two working modes: `matchPerson` and `matchFace`. The default is `matchPerson`.
- Added an optional `confidenceThreshold` parameter to Face - Identify, letting the user set the threshold for whether a face belongs to a Person object.
- Added optional `start` and `top` parameters to PersonGroup - List, letting the user specify the starting point and the number of PersonGroups to list.
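Hedged sketches of request bodies using the new optional parameters; all IDs are placeholders, and the bodies are trimmed to the fields discussed here:

```python
# Face - Find Similar with the new mode parameter.
find_similar_body = {
    "faceId": "<query-face-id>",
    "faceListId": "<face-list-id>",
    "mode": "matchPerson",  # or "matchFace"; matchPerson is the default
}

# Face - Identify with a custom confidence threshold.
identify_body = {
    "faceIds": ["<face-id>"],
    "personGroupId": "<person-group-id>",
    "confidenceThreshold": 0.7,  # threshold for person membership
}
print(find_similar_body["mode"], identify_body["confidenceThreshold"])
```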
V1.0 changes from V0
- Updated the service root endpoint from `https://api.cognitive.azure.cn/face/v0/` to `https://api.cognitive.azure.cn/face/v1.0/`. Changes apply to: Face - Detect, Face - Identify, Face - Find Similar, and Face - Group.
- Updated the minimum detectable face size to 36x36 pixels. Faces smaller than 36x36 pixels will not be detected.
- Deprecated the PersonGroup and Person data in Face V0. Those data cannot be accessed with the Face V1.0 service.
- Deprecated the V0 endpoint of Face API on June 30, 2016.