What is speech translation?

Important

TLS 1.2 is now enforced for all HTTP requests to this service.
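On the client side, a caller can make sure it never negotiates anything older than TLS 1.2. The following is a minimal sketch that uses only the Python standard library. The function names `make_tls12_context` and `make_opener` are illustrative and are not part of the Speech service or the Speech SDK.

```python
import ssl
import urllib.request


def make_tls12_context() -> ssl.SSLContext:
    """Return a certificate-verifying SSL context that refuses TLS versions below 1.2."""
    context = ssl.create_default_context()
    # Reject TLS 1.0 and 1.1 handshakes; TLS 1.2 and newer remain allowed.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def make_opener(context: ssl.SSLContext) -> urllib.request.OpenerDirector:
    """Build a urllib opener whose HTTPS requests use the given SSL context."""
    return urllib.request.build_opener(urllib.request.HTTPSHandler(context=context))
```

Most current runtimes already negotiate TLS 1.2 or later by default, so pinning the minimum version mainly guards against older or misconfigured environments.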

Speech translation from the Speech service enables real-time, multi-language speech-to-speech and speech-to-text translation of audio streams. With the Speech SDK, your applications, tools, and devices have access to source transcriptions and translation outputs for the audio you provide. Interim transcription and translation results are returned as speech is detected, and final results can be converted into synthesized speech.
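As a rough illustration of that flow, the sketch below translates a single utterance from the default microphone. It assumes the Python package `azure-cognitiveservices-speech` is installed (Python does not appear in the quickstart table on this page, so treat the language choice as an assumption) and that you supply your own subscription key and region. The helper `format_translations` and the function `translate_once` are hypothetical names introduced for this example.

```python
from typing import Dict


def format_translations(source_text: str, translations: Dict[str, str]) -> str:
    """Render the source transcription followed by one line per target language."""
    lines = [f"source: {source_text}"]
    for language in sorted(translations):
        lines.append(f"{language}: {translations[language]}")
    return "\n".join(lines)


def translate_once(key: str, region: str, source: str = "en-US", target: str = "de") -> str:
    """Translate one utterance from the default microphone (requires the Speech SDK)."""
    # Imported lazily so format_translations works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.translation.SpeechTranslationConfig(subscription=key, region=region)
    config.speech_recognition_language = source
    config.add_target_language(target)

    recognizer = speechsdk.translation.TranslationRecognizer(translation_config=config)

    # Interim results arrive through the `recognizing` event while speech is detected.
    recognizer.recognizing.connect(lambda evt: print("interim:", evt.result.text))

    result = recognizer.recognize_once()  # blocks until one utterance is finalized
    if result.reason != speechsdk.ResultReason.TranslatedSpeech:
        raise RuntimeError(f"No translation produced: {result.reason}")
    return format_translations(result.text, dict(result.translations))
```

Calling `translate_once("<your-key>", "<your-region>")` would print interim transcriptions as you speak and return the final transcription together with its translation. Converting the final result into synthesized speech is not shown here.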

Microsoft's translation engine is powered by two different approaches: statistical machine translation (SMT) and neural machine translation (NMT). SMT uses advanced statistical analysis to estimate the best possible translation given the context of a few words. NMT uses neural networks and the full context of each sentence to translate words, which produces more accurate, natural-sounding translations.

Today, Microsoft uses NMT for translation to most popular languages. All languages available for speech-to-speech translation are powered by NMT. Speech-to-text translation may use SMT or NMT, depending on the language pair. When the target language is supported by NMT, the full translation is NMT-powered. When the target language isn't supported by NMT, the translation is a hybrid of NMT and SMT, using English as a "pivot" between the two languages.
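The selection rule described above can be sketched as a small function. This is only an illustration of the rule as stated, not the service's implementation: the language set `NMT_LANGUAGES` is a made-up placeholder, and the function `translation_plan` is a hypothetical name. The sketch deliberately does not say which leg of the hybrid route uses which engine, because the text does not specify that.

```python
from typing import List, Set, Tuple

# Hypothetical set for illustration only; not the service's real language list.
NMT_LANGUAGES: Set[str] = {"en", "de", "fr", "zh-Hans"}
PIVOT = "en"  # English acts as the pivot between the two languages


def translation_plan(
    source: str, target: str, nmt_languages: Set[str] = NMT_LANGUAGES
) -> Tuple[str, List[str]]:
    """Return (engine description, route) for a speech-to-text language pair."""
    if target in nmt_languages:
        # Target supported by NMT: the full translation is NMT-powered.
        return ("NMT", [source, target])
    # Otherwise: a hybrid of NMT and SMT that passes through English.
    route = [source]
    if source != PIVOT:
        route.append(PIVOT)
    route.append(target)
    return ("NMT+SMT hybrid", route)
```

For example, with the placeholder set above, `translation_plan("fr", "de")` yields a direct NMT route, while a target outside the set is routed through English.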

Core features

Here are the features available via the Speech SDK and REST APIs:

Use case | SDK | REST
Speech-to-text translation with recognition results | Yes | No
Speech-to-speech translation | Yes | No
Interim recognition and translation results | Yes | No

Get started with speech translation

We offer quickstarts designed to have you running code in less than 10 minutes. This table lists the speech translation quickstarts, organized by language.

Quickstart | Platform | API reference
C#, .NET Core | Windows | Browse
C#, .NET Framework | Windows | Browse
C#, UWP | Windows | Browse
C++ | Windows | Browse
Java | Windows, Linux, macOS | Browse

Sample code

Sample code for the Speech SDK is available on GitHub. These samples cover common scenarios such as reading audio from a file or stream, continuous and single-shot recognition and translation, and working with custom models.

Reference docs

Next steps