Quickstart: Synthesize speech to a speaker

Important

Speech SDK version 1.11.0 or later is required.

In this quickstart, you use the Speech SDK to convert text to synthesized speech. The text-to-speech service offers many synthesized voices, listed under text-to-speech language support. After satisfying a few prerequisites, rendering synthesized speech to the default speaker takes only four steps:

  • Create a SpeechConfig object from your subscription key and region.
  • Create a SpeechSynthesizer object using the SpeechConfig object from above.
  • Use the SpeechSynthesizer object to speak the text.
  • Check the returned SpeechSynthesisResult for errors.

If you prefer to jump right in, view or download all Speech SDK C# Samples on GitHub. Otherwise, let's get started.

Choose your target environment

Prerequisites

Before you get started, make sure to:

Add sample code

  1. Open Program.cs and replace the automatically generated code with this sample:

    using System;
    using System.Threading.Tasks;
    using Microsoft.CognitiveServices.Speech;
    
    namespace helloworld
    {
        class Program
        {
            public static async Task SynthesisToSpeakerAsync()
            {
                // Creates an instance of a speech config with specified subscription key and service region.
                // Replace with your own subscription key and service region (e.g., "chinaeast2").
                // The default language is "en-us".
                var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
    
                // Creates a speech synthesizer using the default speaker as audio output.
                using (var synthesizer = new SpeechSynthesizer(config))
                {
                    // Receive a text from console input and synthesize it to speaker.
                    Console.WriteLine("Type some text that you want to speak...");
                    Console.Write("> ");
                    string text = Console.ReadLine();
    
                    using (var result = await synthesizer.SpeakTextAsync(text))
                    {
                        if (result.Reason == ResultReason.SynthesizingAudioCompleted)
                        {
                            Console.WriteLine($"Speech synthesized to speaker for text [{text}]");
                        }
                        else if (result.Reason == ResultReason.Canceled)
                        {
                            var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
                            Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");
    
                            if (cancellation.Reason == CancellationReason.Error)
                            {
                                Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                                Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
                                Console.WriteLine($"CANCELED: Did you update the subscription info?");
                            }
                        }
                    }
    
                    // This is to give some time for the speaker to finish playing back the audio
                    Console.WriteLine("Press any key to exit...");
                    Console.ReadKey();
                }
            }
    
            static void Main()
            {
                SynthesisToSpeakerAsync().Wait();
            }
        }
    }
    
  2. Find the string YourSubscriptionKey, and replace it with your Speech service subscription key.

  3. Find the string YourServiceRegion, and replace it with the region associated with your subscription. For example, if you're using the trial subscription, the region is chinaeast2.

  4. From the menu bar, choose File > Save All.

Build and run the application

  1. From the menu bar, choose Build > Build Solution to build the application. The code should now compile without errors.

  2. Choose Debug > Start Debugging (or press F5) to start the helloworld application.

  3. Enter an English phrase or sentence. The application sends your text to the Speech service, which returns synthesized speech that the application plays on your speaker.

    (Screenshot: speech synthesis user interface)

Next steps

With this base knowledge of speech synthesis, continue exploring the basics to learn about common functionality and tasks within the Speech SDK.

If you prefer to jump right in, view or download all Speech SDK C++ Samples on GitHub. Otherwise, let's get started.

Choose your target environment

Prerequisites

Before you get started, make sure to:

Add sample code

  1. Create a C++ source file named helloworld.cpp, and paste the following code into it.
  #include <clocale>  // setlocale
  #include <iostream> // cin, cout
  #include <string>   // string, getline
  #include <speechapi_cxx.h>
  
  using namespace std;
  using namespace Microsoft::CognitiveServices::Speech;
  
  void synthesizeSpeech()
  {
      // Creates an instance of a speech config with specified subscription key and service region.
      // Replace with your own subscription key and service region (e.g., "chinaeast2").
      auto config = SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion");
  
      // Creates a speech synthesizer using the default speaker as audio output. The default spoken language is "en-us".
      auto synthesizer = SpeechSynthesizer::FromConfig(config);
  
      // Receive a text from console input and synthesize it to speaker.
      cout << "Type some text that you want to speak..." << std::endl;
      cout << "> ";
      std::string text;
      getline(cin, text);
  
      auto result = synthesizer->SpeakTextAsync(text).get();
  
      // Checks result.
      if (result->Reason == ResultReason::SynthesizingAudioCompleted)
      {
          cout << "Speech synthesized to speaker for text [" << text << "]" << std::endl;
      }
      else if (result->Reason == ResultReason::Canceled)
      {
          auto cancellation = SpeechSynthesisCancellationDetails::FromResult(result);
          cout << "CANCELED: Reason=" << (int)cancellation->Reason << std::endl;
  
          if (cancellation->Reason == CancellationReason::Error)
          {
              cout << "CANCELED: ErrorCode=" << (int)cancellation->ErrorCode << std::endl;
              cout << "CANCELED: ErrorDetails=[" << cancellation->ErrorDetails << "]" << std::endl;
              cout << "CANCELED: Did you update the subscription info?" << std::endl;
          }
      }
  
      // This is to give some time for the speaker to finish playing back the audio
      cout << "Press enter to exit..." << std::endl;
      cin.get();
  }
  
  int main(int argc, char **argv) {
      setlocale(LC_ALL, "");
      synthesizeSpeech();
      return 0;
  }
  2. In this new file, replace the string YourSubscriptionKey with your Speech service subscription key.

  3. Replace the string YourServiceRegion with the region associated with your subscription (for example, chinaeast2 for the trial subscription).

Build the app

Note

Make sure to enter each of the commands below as a single command line. The easiest way to do that is to copy the command by using the Copy button next to it, and then paste it at your shell prompt.

  • On an x64 (64-bit) system, run the following command to build the application.

    g++ helloworld.cpp -o helloworld -I "$SPEECHSDK_ROOT/include/cxx_api" -I "$SPEECHSDK_ROOT/include/c_api" --std=c++14 -lpthread -lMicrosoft.CognitiveServices.Speech.core -L "$SPEECHSDK_ROOT/lib/x64" -l:libasound.so.2
    
  • On an x86 (32-bit) system, run the following command to build the application.

    g++ helloworld.cpp -o helloworld -I "$SPEECHSDK_ROOT/include/cxx_api" -I "$SPEECHSDK_ROOT/include/c_api" --std=c++14 -lpthread -lMicrosoft.CognitiveServices.Speech.core -L "$SPEECHSDK_ROOT/lib/x86" -l:libasound.so.2
    
  • On an ARM64 (64-bit) system, run the following command to build the application.

    g++ helloworld.cpp -o helloworld -I "$SPEECHSDK_ROOT/include/cxx_api" -I "$SPEECHSDK_ROOT/include/c_api" --std=c++14 -lpthread -lMicrosoft.CognitiveServices.Speech.core -L "$SPEECHSDK_ROOT/lib/arm64" -l:libasound.so.2
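The three build commands above differ only in the library directory passed to -L. As an illustration, the following Python sketch assembles the same g++ command for a chosen architecture. The helper name build_command, the LIB_DIRS table, and the example path /opt/speechsdk are hypothetical and not part of the Speech SDK; the compiler flags are copied from the commands above.

```python
import shlex

# Library subdirectory under the SDK root for each architecture
# named in the build instructions above.
LIB_DIRS = {"x64": "x64", "x86": "x86", "arm64": "arm64"}


def build_command(sdk_root, arch, source="helloworld.cpp"):
    """Return the g++ argument list for the given architecture (hypothetical helper)."""
    if arch not in LIB_DIRS:
        raise ValueError("unsupported architecture: {!r}".format(arch))
    return [
        "g++", source, "-o", "helloworld",
        "-I", "{}/include/cxx_api".format(sdk_root),
        "-I", "{}/include/c_api".format(sdk_root),
        "--std=c++14", "-lpthread",
        "-lMicrosoft.CognitiveServices.Speech.core",
        "-L", "{}/lib/{}".format(sdk_root, LIB_DIRS[arch]),
        "-l:libasound.so.2",
    ]


if __name__ == "__main__":
    # Print one shell-quoted line that can be pasted at a prompt.
    # "/opt/speechsdk" is a placeholder for your own SPEECHSDK_ROOT.
    args = build_command("/opt/speechsdk", "x64")
    print(" ".join(shlex.quote(a) for a in args))
```

Keeping the architecture in a lookup table makes an unsupported value fail early instead of producing a command that points at a nonexistent library directory.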
    

Run the app

  1. Configure the loader's library path to point to the Speech SDK library.

    • On an x64 (64-bit) system, enter the following command.

      export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$SPEECHSDK_ROOT/lib/x64"
      
    • On an x86 (32-bit) system, enter the following command.

      export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$SPEECHSDK_ROOT/lib/x86"
      
    • On an ARM64 (64-bit) system, enter the following command.

      export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$SPEECHSDK_ROOT/lib/arm64"
      
  2. Run the application.

    ./helloworld
    
  3. In the console window, a prompt asks you to type some text. Type a few words or a sentence. The text is sent to the Speech service and synthesized to speech, which plays on your speaker.

    Type some text that you want to speak...
    > hello
    Speech synthesized to speaker for text [hello]
    Press enter to exit...
    

If you prefer to jump right in, view or download all Speech SDK Java Samples on GitHub. Otherwise, let's get started.

Choose your target environment

Prerequisites

Before you get started, make sure to:

Add sample code

  1. To add a new empty class to your Java project, select File > New > Class.

  2. In the New Java Class window, enter speechsdk.quickstart into the Package field, and Main into the Name field.

    (Screenshot: New Java Class window)

  3. Replace all code in Main.java with the following snippet:

    package speechsdk.quickstart;
    
    import java.util.Scanner;
    import java.util.concurrent.Future;
    import com.microsoft.cognitiveservices.speech.*;
    
    /**
     * Quickstart: synthesize speech using the Speech SDK for Java.
     */
    public class Main {
    
        /**
         * @param args Arguments are ignored in this sample.
         */
        public static void main(String[] args) {
            try {
                // Replace below with your own subscription key.
                String speechSubscriptionKey = "YourSubscriptionKey";
                // Replace below with your own service region (e.g., "chinaeast2").
                String serviceRegion = "YourServiceRegion";
    
                int exitCode = 1;
                SpeechConfig config = SpeechConfig.fromSubscription(speechSubscriptionKey, serviceRegion);
                assert(config != null);
    
                SpeechSynthesizer synth = new SpeechSynthesizer(config);
                assert(synth != null);
    
                System.out.println("Type some text that you want to speak...");
                System.out.print("> ");
                String text = new Scanner(System.in).nextLine();
    
                Future<SpeechSynthesisResult> task = synth.SpeakTextAsync(text);
                assert(task != null);
    
                SpeechSynthesisResult result = task.get();
                assert(result != null);
    
                if (result.getReason() == ResultReason.SynthesizingAudioCompleted) {
                    System.out.println("Speech synthesized to speaker for text [" + text + "]");
                    exitCode = 0;
                }
                else if (result.getReason() == ResultReason.Canceled) {
                    SpeechSynthesisCancellationDetails cancellation = SpeechSynthesisCancellationDetails.fromResult(result);
                    System.out.println("CANCELED: Reason=" + cancellation.getReason());
    
                    if (cancellation.getReason() == CancellationReason.Error) {
                        System.out.println("CANCELED: ErrorCode=" + cancellation.getErrorCode());
                        System.out.println("CANCELED: ErrorDetails=" + cancellation.getErrorDetails());
                        System.out.println("CANCELED: Did you update the subscription info?");
                    }
                }
    
                result.close();
                synth.close();
    
                System.exit(exitCode);
            } catch (Exception ex) {
                System.out.println("Unexpected exception: " + ex.getMessage());
    
                assert(false);
                System.exit(1);
            }
        }
    }
    
  4. Replace the string YourSubscriptionKey with your subscription key.

  5. Replace the string YourServiceRegion with the region associated with your subscription (for example, chinaeast2 for the trial subscription).

  6. Save changes to the project.

Build and run the app

Press F11, or select Run > Debug. Enter some text when prompted, and you will hear the synthesized audio played from the default speaker.

If you prefer to jump right in, view or download all Speech SDK Python Samples on GitHub. Otherwise, let's get started.

Prerequisites

Before you get started, make sure to:

Support and updates

Updates to the Speech SDK Python package are distributed via PyPI and announced in the Release notes. If a new version is available, you can update to it with the command pip install --upgrade azure-cognitiveservices-speech. To check which version is currently installed, inspect the azure.cognitiveservices.speech.__version__ variable.
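Because this quickstart requires Speech SDK 1.11.0 or later, it can help to compare the installed version numerically rather than as a string. The sketch below is one way to do that; the helper names parse_version and meets_minimum are hypothetical, and the parsing assumes the version string starts with dotted integers such as "1.11.0".

```python
import re


def parse_version(version):
    """Extract the leading numeric components, e.g. "1.11.0" -> (1, 11, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(version))
    # A missing patch component is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(version, minimum=(1, 11, 0)):
    """Return True when `version` is at least `minimum` (compared as integers)."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        import azure.cognitiveservices.speech as speechsdk
    except ImportError:
        print("The Speech SDK Python package is not installed")
    else:
        installed = speechsdk.__version__
        print(installed, "OK" if meets_minimum(installed) else "needs upgrade")
```

Comparing integer tuples avoids the pitfall of plain string comparison, where "1.9.2" would sort after "1.11.0".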

If you have a problem, or you're missing a feature, see Support and help options.

Create a Python application that uses the Speech SDK

Run the sample

You can copy the sample code from this quickstart to a source file quickstart.py and run it in your IDE or in the console:

python quickstart.py

Or you can download this quickstart tutorial as a Jupyter notebook from the Speech SDK sample repository and run it as a notebook.

Sample code

import azure.cognitiveservices.speech as speechsdk

# Creates an instance of a speech config from the service host URL and subscription key.
# Replace YourServiceRegion and YourSubscriptionKey with your own values (region e.g., "chinaeast2").
speech_host, speech_key = "https://YourServiceRegion.tts.speech.azure.cn/", "YourSubscriptionKey"
speech_config = speechsdk.SpeechConfig(host=speech_host, subscription=speech_key)

# Creates a speech synthesizer using the default speaker as audio output.
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)

# Receives a text from console input.
print("Type some text that you want to speak...")
text = input()

# Synthesizes the received text to speech.
# The synthesized speech is expected to be heard on the speaker with this line executed.
result = speech_synthesizer.speak_text_async(text).get()

# Checks result.
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Speech synthesized to speaker for text [{}]".format(text))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech synthesis canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        if cancellation_details.error_details:
            print("Error details: {}".format(cancellation_details.error_details))
    print("Did you update the subscription info?")
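The sample above hard-codes the subscription key and region in the source file. As an alternative, the following sketch reads both from environment variables and builds the host URL in the same pattern the sample uses. The variable names SPEECH_KEY and SPEECH_REGION and the helper load_speech_settings are assumptions made for illustration; they are not defined by the Speech SDK.

```python
import os

# Host URL pattern taken from the sample code above.
HOST_TEMPLATE = "https://{region}.tts.speech.azure.cn/"


def load_speech_settings(env=None):
    """Return (host, key) built from SPEECH_REGION and SPEECH_KEY (hypothetical names)."""
    env = os.environ if env is None else env
    key = env.get("SPEECH_KEY", "")
    region = env.get("SPEECH_REGION", "")
    # Reject both missing values and the untouched placeholders from the sample.
    if not key or key == "YourSubscriptionKey":
        raise ValueError("SPEECH_KEY is not set")
    if not region or region == "YourServiceRegion":
        raise ValueError("SPEECH_REGION is not set")
    return HOST_TEMPLATE.format(region=region), key


if __name__ == "__main__":
    # A plain dict stands in for os.environ so the demo runs anywhere.
    host, key = load_speech_settings(
        {"SPEECH_KEY": "example-key", "SPEECH_REGION": "chinaeast2"}
    )
    print(host)
```

The returned pair can then be passed to speechsdk.SpeechConfig(host=host, subscription=key), as in the sample, which keeps the key out of source control.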

Install and use the Speech SDK with Visual Studio Code

  1. Download and install a 64-bit version of Python, 3.5 to 3.8, on your computer.

  2. Download and install Visual Studio Code.

  3. Open Visual Studio Code and install the Python extension. Select File > Preferences > Extensions from the menu. Search for Python.

    (Screenshot: install the Python extension)

  4. Create a folder to store the project in, for example by using Windows Explorer.

  5. In Visual Studio Code, select the File icon. Then open the folder you created.

    (Screenshot: open a folder)

  6. Create a new Python source file, speechsdk.py, by selecting the new file icon.

    (Screenshot: create a file)

  7. Copy the Python code, paste it into the newly created file, and save the file.

  8. Insert your Speech service subscription information.

  9. If a Python interpreter is already selected, it appears on the left side of the status bar at the bottom of the window. Otherwise, bring up a list of available Python interpreters: open the command palette (Ctrl+Shift+P), enter Python: Select Interpreter, and choose an appropriate one.

  10. If the Speech SDK Python package isn't installed yet for the Python interpreter you selected, you can install it from within Visual Studio Code. To install the package, open a terminal: bring up the command palette again (Ctrl+Shift+P) and enter Terminal: Create New Integrated Terminal. In the terminal that opens, enter the command python -m pip install azure-cognitiveservices-speech or the appropriate command for your system.

  11. To run the sample code, right-click somewhere inside the editor. Select Run Python File in Terminal. Type some text when you're prompted. The synthesized audio plays shortly afterward.

    (Screenshot: run a sample)

If you have issues following these instructions, refer to the more extensive Visual Studio Code Python tutorial.

If you prefer to jump right in, view or download all Speech SDK JavaScript Samples on GitHub. Otherwise, let's get started.

Prerequisites

Before you get started:

Create a new Website folder

Create a new, empty folder. If you want to host the sample on a web server, make sure that the web server can access the folder.

Unpack the Speech SDK for JavaScript into that folder

Download the Speech SDK as a .zip package and unpack it into the newly created folder. This unpacks two files, microsoft.cognitiveservices.speech.sdk.bundle.js and microsoft.cognitiveservices.speech.sdk.bundle.js.map. The latter file is optional, and is useful for debugging into the SDK code.

Create an index.html page

Create a new file in the folder, named index.html, and open this file with a text editor.

  1. Create the following HTML skeleton:

    <!DOCTYPE html>
    <html>
    <head>
      <title>Microsoft Cognitive Services Speech SDK JavaScript Quickstart</title>
      <meta charset="utf-8" />
    </head>
    <body style="font-family:'Helvetica Neue',Helvetica,Arial,sans-serif; font-size:13px;">
      <!-- <uidiv> -->
      <div id="warning">
        <h1 style="font-weight:500;">Speech SDK not found (microsoft.cognitiveservices.speech.sdk.bundle.js missing).</h1>
      </div>
    
      <div id="content" style="display:none">
        <table width="100%">
          <tr>
            <td></td>
            <td><h1 style="font-weight:500;">Microsoft Cognitive Services Speech SDK JavaScript Quickstart</h1></td>
          </tr>
          <tr>
            <td align="right"><a href="https://docs.azure.cn/cognitive-services/speech-service/get-started" target="_blank">Subscription</a>:</td>
            <td><input id="subscriptionKey" type="text" size="40" value="subscription"></td>
          </tr>
          <tr>
            <td align="right">Region</td>
            <td><input id="serviceRegion" type="text" size="40" value="YourServiceRegion"></td>
          </tr>
          <tr>
            <td></td>
            <td><button id="startSpeakTextAsyncButton">Start text to speech</button></td>
          </tr>
          <tr>
            <td align="right" valign="top">Input Text</td>
            <td><textarea id="phraseDiv" style="display: inline-block;width:500px;height:200px"></textarea></td>
          </tr>
          <tr>
            <td align="right" valign="top">Result</td>
            <td><textarea id="resultDiv" style="display: inline-block;width:500px;height:100px"></textarea></td>
          </tr>
        </table>
      </div>
      <!-- </uidiv> -->
    
      <!-- <speechsdkref> -->
      <!-- Speech SDK reference sdk. -->
      <script src="microsoft.cognitiveservices.speech.sdk.bundle.js"></script>
      <!-- </speechsdkref> -->
    
      <!-- <authorizationfunction> -->
      <!-- Speech SDK Authorization token -->
      <script>
      // Note: Replace the URL with a valid endpoint to retrieve
      //       authorization tokens for your subscription.
      var authorizationEndpoint = "token.php";
    
      function RequestAuthorizationToken() {
        if (authorizationEndpoint) {
          var a = new XMLHttpRequest();
          a.open("GET", authorizationEndpoint);
          a.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
          a.send("");
          a.onload = function() {
            var token = JSON.parse(atob(this.responseText.split(".")[1]));
            serviceRegion.value = token.region;
            authorizationToken = this.responseText;
            subscriptionKey.disabled = true;
            subscriptionKey.value = "using authorization token (hit F5 to refresh)";
            console.log("Got an authorization token: " + token);
          }
        }
      }
      </script>
      <!-- </authorizationfunction> -->
    
      <!-- <quickstartcode> -->
      <!-- Speech SDK USAGE -->
      <script>
    
        // status fields and start button in UI
        var phraseDiv;
        var resultDiv;
        var startSpeakTextAsyncButton;
    
        // subscription key and region for speech services.
        var subscriptionKey, serviceRegion;
        var authorizationToken;
        var SpeechSDK;
        var synthesizer;
    
        document.addEventListener("DOMContentLoaded", function () {
          startSpeakTextAsyncButton = document.getElementById("startSpeakTextAsyncButton");
          subscriptionKey = document.getElementById("subscriptionKey");
          serviceRegion = document.getElementById("serviceRegion");
          phraseDiv = document.getElementById("phraseDiv");
          resultDiv = document.getElementById("resultDiv");
    
          startSpeakTextAsyncButton.addEventListener("click", function () {
            var soundContext = undefined;
            try {
              var AudioContext = window.AudioContext || window.webkitAudioContext || false;
              if (AudioContext) {
                soundContext = new AudioContext();
              } else {
                alert("AudioContext not supported");
              }
            }
            catch(e){
              window.console.log("no sound context found, no audio output. " + e);
            }
    
            startSpeakTextAsyncButton.disabled = true;
            phraseDiv.innerHTML = "";
    
            // if we got an authorization token, use the token. Otherwise use the provided subscription key
            var speechConfig;
            if (authorizationToken) {
              speechConfig = SpeechSDK.SpeechConfig.fromAuthorizationToken(authorizationToken, serviceRegion.value);
            } else {
              if (subscriptionKey.value === "" || subscriptionKey.value === "subscription") {
                alert("Please enter your Microsoft Cognitive Services Speech subscription key!");
                startSpeakTextAsyncButton.disabled = false;
                return;
              }
              speechConfig = SpeechSDK.SpeechConfig.fromSubscription(subscriptionKey.value, serviceRegion.value);
            }
    
            synthesizer = new SpeechSDK.SpeechSynthesizer(speechConfig);
    
            var inputText = phraseDiv.value;
            synthesizer.speakTextAsync(
              inputText,      
              function (result) {
                startSpeakTextAsyncButton.disabled = false;
                resultDiv.innerHTML += "Result: ";
                resultDiv.innerHTML += result.text;
                resultDiv.innerHTML += "\n";
    
                window.console.log(result);
                if (result.audioData && soundContext) {
                  var source = soundContext.createBufferSource();
                  soundContext.decodeAudioData(result.audioData, function (newBuffer) {
                    source.buffer = newBuffer;
                    source.connect(soundContext.destination);
                    source.start(0);
                  });
                }
    
                synthesizer.close();
                synthesizer = undefined;
              },
              function (err) {
                startSpeakTextAsyncButton.disabled = false;
                resultDiv.innerHTML += "Error: ";
                resultDiv.innerHTML += err;
                resultDiv.innerHTML += "\n";
                window.console.log(err);
    
                synthesizer.close();
                synthesizer = undefined;
              });
          });
    
          if (!!window.SpeechSDK) {
            SpeechSDK = window.SpeechSDK;
            startSpeakTextAsyncButton.disabled = false;
    
            document.getElementById('content').style.display = 'block';
            document.getElementById('warning').style.display = 'none';
    
            // in case we have a function for getting an authorization token, call it.
            if (typeof RequestAuthorizationToken === "function") {
              RequestAuthorizationToken();
            }
          }
        });
    
    
      </script>
      <!-- </quickstartcode> -->
    </body>
    </html>
    

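The success callback above prints the result but does not check whether synthesis actually succeeded. The following is a minimal sketch of such a check, assuming the `ResultReason` enumeration and the `errorDetails` property of the Speech SDK for JavaScript. The helper name `describeSynthesisResult` is hypothetical and not part of the sample.

```javascript
// Hypothetical helper: turns a synthesis result into a one-line status message.
// `sdk` is the Speech SDK namespace (window.SpeechSDK in the page above).
function describeSynthesisResult(sdk, result, inputText) {
  if (result.reason === sdk.ResultReason.SynthesizingAudioCompleted) {
    return "Synthesis finished for [" + inputText + "].";
  }
  if (result.reason === sdk.ResultReason.Canceled) {
    // errorDetails carries the service's explanation, e.g. a bad key or region.
    return "Synthesis canceled: " + result.errorDetails;
  }
  return "Unexpected result reason: " + result.reason;
}
```

Inside the success callback, the message could be appended to the page with `resultDiv.innerHTML += describeSynthesisResult(SpeechSDK, result, inputText) + "\n";`.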
Create the token source (optional)

If you want to host the web page on a web server, you can optionally provide a token source for your demo application. That way, your subscription key never leaves your server, and users can use speech capabilities without entering any authorization code themselves.

Create a new file named token.php. This example assumes that your web server supports the PHP scripting language. Enter the following code:

<?php
header('Access-Control-Allow-Origin: ' . $_SERVER['SERVER_NAME']);

// Replace with your own subscription key and service region (e.g., "chinaeast2").
$subscriptionKey = 'YourSubscriptionKey';
$region = 'YourServiceRegion';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://' . $region . '.api.cognitive.azure.cn/sts/v1.0/issueToken');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, '{}');
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json', 'Ocp-Apim-Subscription-Key: ' . $subscriptionKey));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
echo curl_exec($ch);
?>
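
If your server does not run PHP, the same token request can be sketched in JavaScript. This is an illustration rather than part of the sample: the function names are hypothetical, the endpoint and headers are copied from token.php above, and `issueToken` assumes a runtime with a global `fetch` (for example Node.js 18 or later).

```javascript
// Hypothetical helper: builds the same request that token.php sends.
function buildIssueTokenRequest(region, subscriptionKey) {
  return {
    url: "https://" + region + ".api.cognitive.azure.cn/sts/v1.0/issueToken",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": subscriptionKey,
      },
      body: "{}",
    },
  };
}

// Hypothetical helper: sends the request and resolves to the token text.
// Assumes a global fetch is available in the runtime.
async function issueToken(region, subscriptionKey) {
  const { url, options } = buildIssueTokenRequest(region, subscriptionKey);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error("Token request failed with status " + response.status);
  }
  return response.text();
}
```

Keeping the request construction separate from the network call makes it possible to check the URL and headers without contacting the service.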

Note

Authorization tokens have a limited lifetime. This simplified example does not show how to refresh authorization tokens automatically. As a user, you can manually reload the page or press F5 to refresh.
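
One way to avoid the manual reload is to cache the token and fetch a new one before it expires. The sketch below assumes that a token stays valid for about ten minutes and therefore renews it after nine. The helper names, the nine-minute threshold, and the injected `fetchToken` function are illustrative assumptions, not part of the sample.

```javascript
// Assumption: renew after nine minutes, before the token expires.
const DEFAULT_MAX_AGE_MS = 9 * 60 * 1000;

// Returns true when a token fetched at fetchedAtMs is too old to reuse.
function isTokenStale(fetchedAtMs, nowMs, maxAgeMs = DEFAULT_MAX_AGE_MS) {
  return nowMs - fetchedAtMs >= maxAgeMs;
}

// Hypothetical helper: wraps fetchToken (any function returning a
// Promise<string>) so that the token is only refetched when it is stale.
// The `now` parameter exists so that the clock can be replaced in tests.
function createTokenCache(fetchToken, maxAgeMs = DEFAULT_MAX_AGE_MS, now = Date.now) {
  let token = null;
  let fetchedAtMs = 0;
  return async function getToken() {
    if (token === null || isTokenStale(fetchedAtMs, now(), maxAgeMs)) {
      token = await fetchToken();
      fetchedAtMs = now();
    }
    return token;
  };
}
```

In the page, the cache could be created once with `createTokenCache(() => fetch("token.php").then((r) => r.text()))`, and `getToken()` awaited before each synthesis request.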

Build and run the sample locally

To launch the app, double-click the index.html file or open index.html with your favorite web browser. It presents a simple GUI that lets you enter your subscription key and region and trigger synthesis of the input text.

Build and run the sample via a web server

To launch the app, open your favorite web browser and point it to the public URL where you host the folder, enter your region, and trigger synthesis of the input text. If a token source is configured, the app acquires a token from it.
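
The choice between a token and a subscription key can be sketched as a small function. It relies on `SpeechConfig.fromAuthorizationToken` and `SpeechConfig.fromSubscription` from the Speech SDK for JavaScript. The function name `createSpeechConfig` and the shape of the `credentials` object are assumptions made for this illustration.

```javascript
// Hypothetical helper: prefers an authorization token from the token source
// and falls back to the subscription key entered in the page.
// `sdk` is the Speech SDK namespace (window.SpeechSDK in the browser).
function createSpeechConfig(sdk, credentials) {
  const { authorizationToken, subscriptionKey, region } = credentials;
  if (authorizationToken) {
    return sdk.SpeechConfig.fromAuthorizationToken(authorizationToken, region);
  }
  if (!subscriptionKey || subscriptionKey === "YourSubscriptionKey") {
    throw new Error("Enter a subscription key or configure a token source.");
  }
  return sdk.SpeechConfig.fromSubscription(subscriptionKey, region);
}
```

With a token source configured, the subscription key field can stay empty because only the token and the region are passed to the SDK.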

Next steps

With this base knowledge of speech synthesis, continue exploring the basics to learn about common functionality and tasks within the Speech SDK.

In this quickstart, you use the Speech CLI from the command line to convert text to speech that plays through your computer's speaker. The text-to-speech service provides many options for synthesized voices, listed under text-to-speech language support. After a one-time configuration, the Speech CLI lets you synthesize speech from text using commands from the command line.

Prerequisites

The only prerequisite is an Azure Speech subscription. If you don't already have one, see the guide on creating a new subscription.

Download and install

Follow these steps to install the Speech CLI on Windows:

  1. Install either .NET Framework 4.7 or .NET Core 3.0.
  2. Download the Speech CLI zip archive, then extract it.
  3. Go to the root directory spx-zips that you extracted from the download, and extract the subdirectory that you need (spx-net471 for .NET Framework 4.7, or spx-netcore-win-x64 for .NET Core 3.0 on an x64 CPU).

In the command prompt, change directory to this location, and then type spx to see help for the Speech CLI.

Create subscription config

To start using the Speech CLI, you first need to enter your Speech subscription key and region information. See the region support page to find your region identifier. Once you have your subscription key and region identifier (for example, chinaeast2), run the following commands.

spx config @key --set YOUR-SUBSCRIPTION-KEY
spx config @region --set YOUR-REGION-ID

Your subscription authentication is now stored for future SPX requests. If you need to remove either of these stored values, run spx config @region --clear or spx config @key --clear.

Run the Speech CLI

Now you're ready to run the Speech CLI to synthesize speech from text.

From the command line, change to the directory that contains the Speech CLI binary file, and type:

spx synthesize --text "The speech synthesizer greets you!"

The Speech CLI plays the synthesized English speech through the computer speaker.

Next steps

Continue exploring the basics to learn about other features of the Speech CLI.

View or download all Speech SDK Samples on GitHub.

Additional language and platform support

Important

Speech SDK version 1.11.0 or later is required.

If you've clicked this tab, you probably didn't see a quickstart in your favorite programming language. Don't worry, we have additional quickstart materials and code samples available on GitHub. Use the table to find the right sample for your programming language and platform/OS combination.

Language | Additional Quickstarts | Code samples
C# | To an audio file | .NET Framework, .NET Core, UWP, Unity, Xamarin
C++ | To an audio file | Windows, Linux, macOS
Java | To an audio file | Android, JRE
JavaScript | Node.js to an audio file | Windows, Linux, macOS
Objective-C | iOS to speaker, macOS to speaker | iOS, macOS
Python | To an audio file | Windows, Linux, macOS
Swift | iOS to speaker, macOS to speaker | iOS, macOS