Migrate from prebuilt standard voice to prebuilt neural voice
Important
We are retiring the standard voices between September 1, 2021 and August 31, 2024. Speech resources created after September 1, 2021 cannot use standard voices. Support is being phased out gradually for Speech resources created before September 1, 2021. By August 31, 2024, the standard voices will no longer be available to any customer. Choose a replacement from the supported neural voice names.
Pricing for prebuilt standard voices differs from pricing for prebuilt neural voices. See the pricing page for details.
Prebuilt neural voices provide more natural-sounding speech output and therefore a better end-user experience.
Prebuilt standard voice | Prebuilt neural voice |
---|---|
Noticeably robotic | Natural sounding, closer to human parity |
Limited capabilities in voice tuning ¹ | Advanced capabilities in voice tuning |
No new investment in future voice fonts | Ongoing investment in future voice fonts |
¹ For voice tuning, volume and pitch changes can be applied to standard voices at the word or sentence level, whereas they can be applied to neural voices only at the sentence level. Duration is supported for standard voices only. To learn more about prosody elements, see Improve synthesis with SSML.
Action required
Tip
Even without an Azure account, you can listen to voice samples at the Voice Gallery and determine the right voice for your business needs.
- Review the price structure.
- To make the change, follow the sample code and update the voice name in your speech synthesis request to one of the supported neural voice names for your chosen languages. Use neural voices for speech synthesis requests both in the cloud and on-premises. For on-premises containers, use the neural voice containers.
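The voice-name change described above can be sketched as follows. This is a minimal illustration rather than the official sample code: the `build_ssml` helper is hypothetical, and `en-US-AriaNeural` is used only as an example of a neural voice name, so confirm the name you need against the supported neural voice list.

```python
from xml.sax.saxutils import escape


def build_ssml(text: str, voice_name: str, lang: str = "en-US") -> str:
    """Return an SSML document asking `voice_name` to speak `text`.

    Hypothetical helper for illustration; it only builds the request body.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">'
        # escape() protects characters such as "<" and "&" in the spoken text.
        f'<voice name="{voice_name}">{escape(text)}</voice>'
        "</speak>"
    )


# Before migration: the request names a standard voice.
old_request = build_ssml("Hello, world.", "en-US-AriaRUS")

# After migration: only the voice name changes (example neural name, an assumption).
new_request = build_ssml("Hello, world.", "en-US-AriaNeural")
```

In this sketch the rest of the request stays the same; the migration is limited to the `name` attribute of the `voice` element.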
Standard voice details (deprecated)
Read the following sections for details on standard voice.
Language support
More than 75 prebuilt standard voices are available in over 45 languages and locales, which allow you to convert text into synthesized speech.
Note
With two exceptions, standard voices are created from samples that use a 16 kHz sample rate. The en-US-AriaRUS and en-US-GuyRUS voices are also created from samples that use a 24 kHz sample rate. All voices can upsample or downsample to other sample rates when synthesizing.
Language | Locale (BCP-47) | Gender | Voice name |
---|---|---|---|
Arabic (Egypt) | ar-EG | Female | ar-EG-Hoda |
Arabic (Saudi Arabia) | ar-SA | Male | ar-SA-Naayf |
Bulgarian (Bulgaria) | bg-BG | Male | bg-BG-Ivan |
Catalan | ca-ES | Female | ca-ES-HerenaRUS |
Chinese (Cantonese, Traditional) | zh-HK | Male | zh-HK-Danny |
Chinese (Cantonese, Traditional) | zh-HK | Female | zh-HK-TracyRUS |
Chinese (Mandarin, Simplified) | zh-cn | Female | zh-cn-HuihuiRUS |
Chinese (Mandarin, Simplified) | zh-cn | Male | zh-cn-Kangkang |
Chinese (Mandarin, Simplified) | zh-cn | Female | zh-cn-Yaoyao |
Chinese (Taiwanese Mandarin) | zh-TW | Female | zh-TW-HanHanRUS |
Chinese (Taiwanese Mandarin) | zh-TW | Female | zh-TW-Yating |
Chinese (Taiwanese Mandarin) | zh-TW | Male | zh-TW-Zhiwei |
Croatian (Croatia) | hr-HR | Male | hr-HR-Matej |
Czech (Czech Republic) | cs-CZ | Male | cs-CZ-Jakub |
Danish (Denmark) | da-DK | Female | da-DK-HelleRUS |
Dutch (Netherlands) | nl-NL | Female | nl-NL-HannaRUS |
English (Australia) | en-AU | Female | en-AU-Catherine |
English (Australia) | en-AU | Female | en-AU-HayleyRUS |
English (Canada) | en-CA | Female | en-CA-HeatherRUS |
English (Canada) | en-CA | Female | en-CA-Linda |
English (India) | en-IN | Female | en-IN-Heera |
English (India) | en-IN | Female | en-IN-PriyaRUS |
English (India) | en-IN | Male | en-IN-Ravi |
English (Ireland) | en-IE | Male | en-IE-Sean |
English (United Kingdom) | en-GB | Male | en-GB-George |
English (United Kingdom) | en-GB | Female | en-GB-HazelRUS |
English (United Kingdom) | en-GB | Female | en-GB-Susan |
English (United States) | en-US | Male | en-US-BenjaminRUS |
English (United States) | en-US | Male | en-US-GuyRUS |
English (United States) | en-US | Female | en-US-AriaRUS |
English (United States) | en-US | Female | en-US-ZiraRUS |
Finnish (Finland) | fi-FI | Female | fi-FI-HeidiRUS |
French (Canada) | fr-CA | Female | fr-CA-Caroline |
French (Canada) | fr-CA | Female | fr-CA-HarmonieRUS |
French (France) | fr-FR | Female | fr-FR-HortenseRUS |
French (France) | fr-FR | Female | fr-FR-Julie |
French (France) | fr-FR | Male | fr-FR-Paul |
French (Switzerland) | fr-CH | Male | fr-CH-Guillaume |
German (Austria) | de-AT | Male | de-AT-Michael |
German (Germany) | de-DE | Female | de-DE-HeddaRUS |
German (Germany) | de-DE | Male | de-DE-Stefan |
German (Switzerland) | de-CH | Male | de-CH-Karsten |
Greek (Greece) | el-GR | Male | el-GR-Stefanos |
Hebrew (Israel) | he-IL | Male | he-IL-Asaf |
Hindi (India) | hi-IN | Male | hi-IN-Hemant |
Hindi (India) | hi-IN | Female | hi-IN-Kalpana |
Hungarian (Hungary) | hu-HU | Male | hu-HU-Szabolcs |
Indonesian (Indonesia) | id-ID | Male | id-ID-Andika |
Italian (Italy) | it-IT | Male | it-IT-Cosimo |
Italian (Italy) | it-IT | Female | it-IT-LuciaRUS |
Japanese (Japan) | ja-JP | Female | ja-JP-Ayumi |
Japanese (Japan) | ja-JP | Female | ja-JP-HarukaRUS |
Japanese (Japan) | ja-JP | Male | ja-JP-Ichiro |
Korean (Korea) | ko-KR | Female | ko-KR-HeamiRUS |
Malay (Malaysia) | ms-MY | Male | ms-MY-Rizwan |
Norwegian (Bokmål, Norway) | nb-NO | Female | nb-NO-HuldaRUS |
Polish (Poland) | pl-PL | Female | pl-PL-PaulinaRUS |
Portuguese (Brazil) | pt-BR | Male | pt-BR-Daniel |
Portuguese (Brazil) | pt-BR | Female | pt-BR-HeloisaRUS |
Portuguese (Portugal) | pt-PT | Female | pt-PT-HeliaRUS |
Romanian (Romania) | ro-RO | Male | ro-RO-Andrei |
Russian (Russia) | ru-RU | Female | ru-RU-EkaterinaRUS |
Russian (Russia) | ru-RU | Female | ru-RU-Irina |
Russian (Russia) | ru-RU | Male | ru-RU-Pavel |
Slovak (Slovakia) | sk-SK | Male | sk-SK-Filip |
Slovenian (Slovenia) | sl-SI | Male | sl-SI-Lado |
Spanish (Mexico) | es-MX | Female | es-MX-HildaRUS |
Spanish (Mexico) | es-MX | Male | es-MX-Raul |
Spanish (Spain) | es-ES | Female | es-ES-HelenaRUS |
Spanish (Spain) | es-ES | Female | es-ES-Laura |
Spanish (Spain) | es-ES | Male | es-ES-Pablo |
Swedish (Sweden) | sv-SE | Female | sv-SE-HedvigRUS |
Tamil (India) | ta-IN | Male | ta-IN-Valluvar |
Telugu (India) | te-IN | Female | te-IN-Chitra |
Thai (Thailand) | th-TH | Male | th-TH-Pattara |
Turkish (Türkiye) | tr-TR | Female | tr-TR-SedaRUS |
Vietnamese (Vietnam) | vi-VN | Male | vi-VN-An |
Important
The en-US-Jessa voice has changed to en-US-Aria. If you were using "Jessa" before, convert over to "Aria".
You can continue to use the full service name mapping like "Microsoft Server Speech Text to Speech Voice (en-US, AriaRUS)" in your speech synthesis requests.
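As a rough illustration of the full service name mapping, the sketch below derives the long form from a short voice name. The pattern is inferred from the single example above, and the `full_service_name` helper is hypothetical, so treat it as an assumption rather than a documented rule.

```python
def full_service_name(short_name: str) -> str:
    """Expand a short voice name such as 'en-US-AriaRUS' to the long form.

    Assumes the short name is '<locale>-<voice>' and that the voice part
    contains no hyphen (true for the names listed in the table above).
    """
    locale, _, voice = short_name.rpartition("-")
    if not locale or not voice:
        raise ValueError(f"Unexpected voice name: {short_name!r}")
    return f"Microsoft Server Speech Text to Speech Voice ({locale}, {voice})"


print(full_service_name("en-US-AriaRUS"))
# prints: Microsoft Server Speech Text to Speech Voice (en-US, AriaRUS)
```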
Regional support
Use this table to determine availability of standard voices by region/endpoint:
Region | Endpoint |
---|---|
China East 2 | https://chinaeast2.tts.speech.azure.cn/cognitiveservices/v1 |
China North 2 | https://chinanorth2.tts.speech.azure.cn/cognitiveservices/v1 |
China North 3 | https://chinanorth3.tts.speech.azure.cn/cognitiveservices/v1 |
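The endpoints in the table can be used to prepare a synthesis request, sketched below without actually sending it. Only the endpoint URLs come from the table; the `build_request` helper, the region keys, and the placeholder subscription key are hypothetical, and the header names and output format value are assumptions based on the text to speech REST API, so verify them against the current reference before use.

```python
import urllib.request

# Endpoint URLs copied from the table above; the dictionary keys are
# illustrative short region identifiers.
ENDPOINTS = {
    "chinaeast2": "https://chinaeast2.tts.speech.azure.cn/cognitiveservices/v1",
    "chinanorth2": "https://chinanorth2.tts.speech.azure.cn/cognitiveservices/v1",
    "chinanorth3": "https://chinanorth3.tts.speech.azure.cn/cognitiveservices/v1",
}


def build_request(region: str, subscription_key: str, ssml: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying an SSML body."""
    if region not in ENDPOINTS:
        raise ValueError(f"No endpoint listed for region {region!r}")
    return urllib.request.Request(
        ENDPOINTS[region],
        data=ssml.encode("utf-8"),
        method="POST",
        headers={
            # Header names and values below are assumptions, not from this page.
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/ssml+xml",
            "X-Microsoft-OutputFormat": "riff-16khz-16bit-mono-pcm",
        },
    )


# Usage: pass the prepared request to urllib.request.urlopen() to send it.
request = build_request("chinaeast2", "YOUR_SUBSCRIPTION_KEY", "<speak/>")
```

Keeping the endpoints in one lookup table makes it straightforward to fail early when a region is not listed.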