Automatic language detection for speech to text

Automatic language detection determines the most likely match for audio passed to the Speech SDK from a list of candidate languages that you provide. The detected language is then used to select the language model for speech to text, which gives you a more accurate transcription. To see which languages are available, see Language support.

In this article, you'll learn how to use AutoDetectSourceLanguageConfig to construct a SpeechRecognizer object and retrieve the detected language.

Important

This feature is only available in the Speech SDK for C#, C++, Java, Python, JavaScript, and Objective-C.

Automatic language detection with the Speech SDK

Automatic language detection currently has a service-side limit of four languages per detection. Keep this limitation in mind when constructing your AutoDetectSourceLanguageConfig object. In the samples below, you'll create an AutoDetectSourceLanguageConfig, then use it to construct a SpeechRecognizer.
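Because the limit is enforced on the service side, it can help to check the candidate list locally before building the configuration. The following is a minimal sketch of such a check. The helper validate_candidate_languages and the constant MAX_CANDIDATE_LANGUAGES are hypothetical names introduced for illustration and are not part of the Speech SDK; the value four is taken from the limit stated above.

```python
from typing import List

# Assumption: four is the per-detection limit described in this article.
# This constant and the helper below are hypothetical, not Speech SDK APIs.
MAX_CANDIDATE_LANGUAGES = 4


def validate_candidate_languages(languages: List[str]) -> List[str]:
    """Return the candidate list without duplicates, or raise if it is unusable."""
    # dict.fromkeys drops duplicates while preserving the original order.
    unique = list(dict.fromkeys(languages))
    if not unique:
        raise ValueError("Provide at least one candidate language.")
    if len(unique) > MAX_CANDIDATE_LANGUAGES:
        raise ValueError(
            f"Got {len(unique)} candidate languages; "
            f"at most {MAX_CANDIDATE_LANGUAGES} are allowed per detection."
        )
    return unique


candidates = validate_candidate_languages(["en-US", "de-DE"])
# The validated list could then be passed as the languages argument
# when constructing AutoDetectSourceLanguageConfig.
```

Failing early in this way surfaces a configuration mistake before any audio is sent to the service.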

Tip

You can also specify a custom model to use when performing speech to text. For more information, see Use a custom model for automatic language detection.

The following snippets illustrate how to use automatic language detection in your apps:

// C#
var autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.FromLanguages(
        new string[] { "en-US", "de-DE" });

using (var recognizer = new SpeechRecognizer(
    speechConfig,
    autoDetectSourceLanguageConfig,
    audioConfig))
{
    var speechRecognitionResult = await recognizer.RecognizeOnceAsync();
    var autoDetectSourceLanguageResult =
        AutoDetectSourceLanguageResult.FromResult(speechRecognitionResult);
    var detectedLanguage = autoDetectSourceLanguageResult.Language;
}

// C++
auto autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig::FromLanguages({ "en-US", "de-DE" });

auto recognizer = SpeechRecognizer::FromConfig(
    speechConfig,
    autoDetectSourceLanguageConfig,
    audioConfig);

auto speechRecognitionResult = recognizer->RecognizeOnceAsync().get();
auto autoDetectSourceLanguageResult =
    AutoDetectSourceLanguageResult::FromResult(speechRecognitionResult);
auto detectedLanguage = autoDetectSourceLanguageResult->Language;

// Java
AutoDetectSourceLanguageConfig autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.fromLanguages(Arrays.asList("en-US", "de-DE"));

SpeechRecognizer recognizer = new SpeechRecognizer(
    speechConfig,
    autoDetectSourceLanguageConfig,
    audioConfig);

Future<SpeechRecognitionResult> future = recognizer.recognizeOnceAsync();
SpeechRecognitionResult result = future.get(30, TimeUnit.SECONDS);
AutoDetectSourceLanguageResult autoDetectSourceLanguageResult =
    AutoDetectSourceLanguageResult.fromResult(result);
String detectedLanguage = autoDetectSourceLanguageResult.getLanguage();

recognizer.close();
speechConfig.close();
autoDetectSourceLanguageConfig.close();
audioConfig.close();
result.close();

# Python
auto_detect_source_language_config = \
        speechsdk.languageconfig.AutoDetectSourceLanguageConfig(languages=["en-US", "de-DE"])
speech_recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, 
        auto_detect_source_language_config=auto_detect_source_language_config, 
        audio_config=audio_config)
result = speech_recognizer.recognize_once()
auto_detect_source_language_result = speechsdk.AutoDetectSourceLanguageResult(result)
detected_language = auto_detect_source_language_result.language

// Objective-C
NSArray *languages = @[@"zh-CN", @"de-DE"];
SPXAutoDetectSourceLanguageConfiguration* autoDetectSourceLanguageConfig = \
        [[SPXAutoDetectSourceLanguageConfiguration alloc]init:languages];
SPXSpeechRecognizer* speechRecognizer = \
        [[SPXSpeechRecognizer alloc] initWithSpeechConfiguration:speechConfig
                           autoDetectSourceLanguageConfiguration:autoDetectSourceLanguageConfig
                                              audioConfiguration:audioConfig];
SPXSpeechRecognitionResult *result = [speechRecognizer recognizeOnce];
SPXAutoDetectSourceLanguageResult *languageDetectionResult = [[SPXAutoDetectSourceLanguageResult alloc] init:result];
NSString *detectedLanguage = [languageDetectionResult language];

// JavaScript
var autoDetectConfig = SpeechSDK.AutoDetectSourceLanguageConfig.fromLanguages(["en-US", "de-DE"]);
// FromConfig expects the auto-detect configuration before the audio configuration.
var speechRecognizer = SpeechSDK.SpeechRecognizer.FromConfig(speechConfig, autoDetectConfig, audioConfig);
speechRecognizer.recognizeOnceAsync((result) => {
        var languageDetectionResult = SpeechSDK.AutoDetectSourceLanguageResult.fromResult(result);
        var detectedLanguage = languageDetectionResult.language;
},
{});

Use a custom model for automatic language detection

In addition to language detection using Speech service models, you can specify a custom model for enhanced recognition. If a custom model isn't provided, the service uses the default language model.

The snippets below illustrate how to specify a custom model in your call to the Speech service. If the detected language is en-US, the default model is used. If the detected language is fr-FR, the endpoint for the custom model is used:

// C#
var sourceLanguageConfigs = new SourceLanguageConfig[]
{
    SourceLanguageConfig.FromLanguage("en-US"),
    SourceLanguageConfig.FromLanguage("fr-FR", "The Endpoint Id for custom model of fr-FR")
};
var autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.FromSourceLanguageConfigs(
        sourceLanguageConfigs);

// C++
std::vector<std::shared_ptr<SourceLanguageConfig>> sourceLanguageConfigs;
sourceLanguageConfigs.push_back(
    SourceLanguageConfig::FromLanguage("en-US"));
sourceLanguageConfigs.push_back(
    SourceLanguageConfig::FromLanguage("fr-FR", "The Endpoint Id for custom model of fr-FR"));

auto autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig::FromSourceLanguageConfigs(
        sourceLanguageConfigs);

// Java
List<SourceLanguageConfig> sourceLanguageConfigs = new ArrayList<>();
sourceLanguageConfigs.add(
    SourceLanguageConfig.fromLanguage("en-US"));
sourceLanguageConfigs.add(
    SourceLanguageConfig.fromLanguage("fr-FR", "The Endpoint Id for custom model of fr-FR"));

AutoDetectSourceLanguageConfig autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.fromSourceLanguageConfigs(
        sourceLanguageConfigs);

# Python
en_language_config = speechsdk.languageconfig.SourceLanguageConfig("en-US")
fr_language_config = speechsdk.languageconfig.SourceLanguageConfig("fr-FR", "The Endpoint Id for custom model of fr-FR")
auto_detect_source_language_config = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(
        sourceLanguageConfigs=[en_language_config, fr_language_config])

// Objective-C
SPXSourceLanguageConfiguration* enLanguageConfig = [[SPXSourceLanguageConfiguration alloc]init:@"en-US"];
SPXSourceLanguageConfiguration* frLanguageConfig = \
        [[SPXSourceLanguageConfiguration alloc]initWithLanguage:@"fr-FR"
                                                     endpointId:@"The Endpoint Id for custom model of fr-FR"];
NSArray *languageConfigs = @[enLanguageConfig, frLanguageConfig];
SPXAutoDetectSourceLanguageConfiguration* autoDetectSourceLanguageConfig = \
        [[SPXAutoDetectSourceLanguageConfiguration alloc]initWithSourceLanguageConfigurations:languageConfigs];

// JavaScript
var enLanguageConfig = SpeechSDK.SourceLanguageConfig.fromLanguage("en-US");
var frLanguageConfig = SpeechSDK.SourceLanguageConfig.fromLanguage("fr-FR", "The Endpoint Id for custom model of fr-FR");
var autoDetectConfig = SpeechSDK.AutoDetectSourceLanguageConfig.fromSourceLanguageConfigs([enLanguageConfig, frLanguageConfig]);
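
The selection rule described above (default model for en-US, custom endpoint for fr-FR) can be pictured as a simple lookup. The sketch below only models that rule locally to make the behavior concrete; it does not call the Speech service and is not a description of how the service implements model selection. The names CANDIDATE_LANGUAGES, CUSTOM_ENDPOINTS, and select_endpoint are hypothetical, and the endpoint string is the same placeholder used in the snippets.

```python
from typing import Dict, List, Optional

# Hypothetical local model of the configuration built in the snippets above.
CANDIDATE_LANGUAGES: List[str] = ["en-US", "fr-FR"]
CUSTOM_ENDPOINTS: Dict[str, str] = {
    # Placeholder value, as in the snippets; replace with a real endpoint ID.
    "fr-FR": "The Endpoint Id for custom model of fr-FR",
}


def select_endpoint(detected_language: str) -> Optional[str]:
    """Return the custom endpoint ID for a language, or None for the default model."""
    if detected_language not in CANDIDATE_LANGUAGES:
        raise ValueError(f"{detected_language} is not a candidate language.")
    # Languages without a custom endpoint fall back to the default model.
    return CUSTOM_ENDPOINTS.get(detected_language)


for language in CANDIDATE_LANGUAGES:
    endpoint = select_endpoint(language)
    model = "default model" if endpoint is None else f"custom endpoint {endpoint!r}"
    print(f"{language}: {model}")
```

Only languages configured with an endpoint ID use a custom model; every other candidate uses the default model.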

Next steps

  • See the sample code on GitHub for automatic language detection