What is the Speech service?

The Speech service unifies speech-to-text, text-to-speech, and speech translation under a single Azure subscription. You can speech-enable your applications, tools, and devices with the Speech CLI, Speech SDK, Speech Devices SDK, Speech Studio, or REST APIs.

The following features are part of the Speech service. Use the links in this table to learn more about common use cases for each feature, or browse the API reference.

| Service | Feature | Description | SDK | REST |
| Speech-to-Text | Real-time speech-to-text | Speech-to-text transcribes or translates audio streams or local files to text in real time, so that your applications, tools, or devices can consume or display it. Use speech-to-text with Language Understanding (LUIS) to derive user intents from transcribed speech and act on voice commands. | Yes | Yes |
| Speech-to-Text | Batch speech-to-text | Batch speech-to-text enables asynchronous transcription of large volumes of speech audio stored in Azure Blob Storage. In addition to converting speech audio to text, it also supports diarization and sentiment analysis. | No | Yes |
| Speech-to-Text | Create custom speech models | If you use speech-to-text for recognition and transcription in a unique environment, you can create and train custom acoustic, language, and pronunciation models to address ambient noise or industry-specific vocabulary. | No | Yes |
| Text-to-Speech | Text-to-speech | Text-to-speech converts input text into human-like synthesized speech using Speech Synthesis Markup Language (SSML). Choose from standard voices and neural voices (see Language support). | Yes | Yes |
| Speech Translation | Speech translation | Speech translation enables real-time, multi-language translation of speech in your applications, tools, and devices. Use this service for speech-to-speech and speech-to-text translation. | Yes | No |
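
As the Text-to-Speech row notes, input text is expressed in SSML before it is synthesized. The following is a minimal sketch, using only the Python standard library, of how such a document might be built. The helper name `build_ssml` is hypothetical, and the voice name `en-US-JennyNeural` is an assumed placeholder; check the Language support page for the voices actually available to your resource.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"  # W3C SSML namespace


def build_ssml(text, voice, lang="en-US"):
    """Return an SSML document (as a string) that speaks `text` with `voice`."""
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang},
    )
    voice_el = ET.SubElement(speak, "voice", {"name": voice})
    voice_el.text = text  # ElementTree escapes &, <, > on serialization
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    # The voice name is an assumption for illustration only.
    print(build_ssml("Tom & Jerry say hello.", "en-US-JennyNeural"))
```

Building the document with an XML library rather than string formatting means that characters such as `&` in user-supplied text are escaped automatically.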


TLS 1.2 is now enforced for all HTTP requests to this service.
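
For clients that call the REST endpoints directly, one way to match this requirement is to refuse anything older than TLS 1.2 on the client side. The sketch below uses Python's standard `ssl` module; the function name `make_tls12_context` is hypothetical, and recent Python versions may already negotiate TLS 1.2 or later by default.

```python
import ssl


def make_tls12_context():
    """Create a certificate-verifying SSL context that rejects TLS < 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


if __name__ == "__main__":
    ctx = make_tls12_context()
    print(ctx.minimum_version.name)
```

The resulting context can be passed to `urllib.request.urlopen(..., context=ctx)` or `http.client.HTTPSConnection(host, context=ctx)`.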

Try the Speech service for free

For the following steps, you need an Azure account. If you don't have one, you can sign up for a trial account.

Create the Azure resource

To add a Speech service resource (trial or paid tier) to your Azure account:

  1. Sign in to the Azure portal using your Azure account.

  2. Select "Create a resource" at the top left of the portal. If you don't see "Create a resource", you can find it by selecting the collapsed menu in the upper-left corner of the screen.

  3. In the "New" window, type "speech" in the search box and press ENTER.

  4. In the search results, select "Speech".

  5. Select "Create", then:

    • Give your new resource a unique name. The name helps you distinguish among multiple subscriptions tied to the same service.
    • Choose the Azure subscription that the new resource is associated with. This determines how the fees are billed.
    • Choose the region where the resource will be used.
    • Choose either the free (F0) or paid (S0) pricing tier. For complete information about pricing and usage quotas for each tier, select "View full pricing details".
    • Create a new resource group for this Speech subscription, or assign the subscription to an existing resource group. Resource groups help you keep your various Azure subscriptions organized.
    • Select "Create". This takes you to the deployment overview, which displays deployment progress messages.

It takes a few moments to deploy your new Speech resource. Once deployment is complete, select "Go to resource", and then select "Keys" in the left navigation pane to display your Speech service subscription keys. Each subscription has two keys; you can use either key in your application. To copy a key to your code editor or another location, select the copy button next to the key, switch windows, and paste the clipboard contents where you need them.


These subscription keys are used to access your Cognitive Services API. Do not share your keys. Store them securely, for example, in Azure Key Vault. We also recommend regenerating these keys regularly. Only one key is necessary to make an API call. While you regenerate the first key, you can use the second key for continued access to the service.
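
The guidance above can be sketched as follows: keep both keys out of source code, and fall back to the second key while the first is being regenerated. This is a minimal illustration, not a prescribed method. The environment variable names `SPEECH_KEY_1` and `SPEECH_KEY_2` and both helper names are hypothetical. `Ocp-Apim-Subscription-Key` is the header that Cognitive Services REST APIs typically use for key authentication, but treat it as an assumption and confirm it against the REST reference for your endpoint.

```python
import os


def pick_subscription_key(env=None):
    """Return the first non-empty key found, preferring key 1 over key 2."""
    env = os.environ if env is None else env
    for name in ("SPEECH_KEY_1", "SPEECH_KEY_2"):  # hypothetical variable names
        key = env.get(name, "").strip()
        if key:
            return key
    raise RuntimeError("No Speech subscription key is configured")


def auth_headers(key):
    """Build request headers carrying the subscription key."""
    # Assumed header name; verify against the REST API reference.
    return {"Ocp-Apim-Subscription-Key": key}


if __name__ == "__main__":
    # Simulate key 1 being unavailable while it is regenerated.
    demo_env = {"SPEECH_KEY_1": "", "SPEECH_KEY_2": "second-key"}
    print(auth_headers(pick_subscription_key(demo_env)))
```

In production the environment would typically be populated from a secret store such as Azure Key Vault rather than set by hand.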

Complete a quickstart

We offer quickstarts in most popular programming languages, each designed to teach you basic design patterns and have you running code in less than 10 minutes. See the following list for the quickstart for each feature.

After you've had a chance to get started with the Speech service, try our tutorials, which show you how to handle various scenarios.

Get sample code

Speech SDK version 1.11.0 or later is required.

Sample code for the Speech service is available on GitHub. These samples cover common scenarios such as reading audio from a file or stream, continuous and single-shot recognition, and working with custom models. Use these links to view SDK and REST samples:
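
As one illustration of the "reading audio from a file or stream" scenario, the sketch below reads a WAV file in fixed-size chunks with Python's standard `wave` module, the kind of loop you might use to feed audio to a streaming recognizer. It does not call the Speech SDK and is not taken from the GitHub samples; the function name `iter_wav_chunks` and the 100 ms chunk size are assumptions for illustration.

```python
import io
import wave


def iter_wav_chunks(source, chunk_ms=100):
    """Yield raw PCM byte chunks of about `chunk_ms` milliseconds each.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        frames_per_chunk = max(1, wav.getframerate() * chunk_ms // 1000)
        while True:
            data = wav.readframes(frames_per_chunk)
            if not data:
                break
            yield data


if __name__ == "__main__":
    # Build one second of 16 kHz, 16-bit mono silence in memory for the demo.
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(16000)
        out.writeframes(b"\x00\x00" * 16000)
    buffer.seek(0)
    chunks = list(iter_wav_chunks(buffer))
    print(len(chunks), len(chunks[0]))
```

Because the function accepts either a path or a file-like object, the same loop works for a file on disk and for audio that is already in memory.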

Customize your speech experience

The Speech service works well with built-in models. However, you may want to further customize and tune the experience for your product or environment. Customization options range from acoustic model tuning to unique voice fonts for your brand.

Other products offer speech models tuned for specific purposes, such as healthcare or insurance, but those models are available to everyone equally. Customization in Azure Speech becomes part of your unique competitive advantage and is unavailable to any other user or customer. In other words, your models are private and custom-tuned for your use case only.

| Speech Service | Platform | Description |
| Speech-to-Text | Custom Speech | Customize speech recognition models to your needs and available data. Overcome speech recognition barriers such as speaking style, vocabulary, and background noise. |

