Display text formatting with speech to text

2025/07/18

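Speech to text offers an array of formatting features to ensure that the transcribed text is clear and legible. The following sections give an overview of how each feature improves the clarity of the final text output.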

ITN

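Inverse text normalization (ITN) is a process that converts spoken words into their corresponding symbolic written form. For example, the spoken word "four" is converted to the written form "4". The speech to text service performs this process, and it isn't configurable. Supported text formats include dates, times, decimals, currencies, addresses, emails, and phone numbers. You can speak naturally, and the service formats the text as expected. The following table shows examples of the ITN rules that are applied to the text output.

Recognized speech	Display text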
`that will cost nine hundred dollars`	`That will cost $900.`
`my phone number is one eight hundred, four five six, eight nine ten`	`My phone number is 1-800-456-8910.`
`the time is six forty five p m`	`The time is 6:45 PM.`
`I live on thirty five lexington avenue`	`I live on 35 Lexington Ave.`
`the answer is six point five`	`The answer is 6.5.`
`send it to support at help dot com`	`Send it to support@help.com.`

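Capitalization

Speech to text models recognize words that should be capitalized to improve readability, accuracy, and grammar. For example, the Speech service automatically capitalizes proper nouns and words at the beginning of a sentence. The following table shows some examples.

Recognized speech	Display text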
`i got an x l t shirt`	`I got an XL t-shirt.`
`my name is jennifer smith`	`My name is Jennifer Smith.`
`i want to visit new york city`	`I want to visit New York City.`

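Disfluency removal

When people speak, they often stutter, repeat words, and use filler words such as "uhm" or "uh". Speech to text can recognize such disfluencies and remove them from the display text. Disfluency removal is well suited to transcribing live, unscripted speech that is read back later. The following table shows some examples.

Recognized speech	Display text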
`i uh said that we can go to the uhmm movies`	`I said that we can go to the movies.`
`its its not that big of uhm a deal`	`It's not that big of a deal.`
`umm i think tomorrow should work`	`I think tomorrow should work.`

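Punctuation

Speech to text automatically punctuates your text to improve clarity. Punctuation is helpful when you read back call or conversation transcriptions. The following table shows some examples.

Recognized speech	Display text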
`how are you`	`How are you?`
`we can go to the mall park or beach`	`We can go to the mall, park, or beach.`

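When you use speech to text with continuous recognition, you can configure the Speech service to recognize explicit punctuation marks. You can then speak punctuation aloud to make your text more legible. This is especially useful when you want complex punctuation without having to add it later. The following table shows some examples.

Recognized speech	Display text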
`they entered the room dot dot dot`	`They entered the room...`
`i heart emoji you period`	`I <3 you.`
`the options are apple forward slash banana forward slash orange period`	`The options are apple/banana/orange.`
`are you sure question mark`	`Are you sure?`
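
The mapping in the table above can be imitated with a small lookup, shown here only to make the idea concrete. This is a toy sketch, not the Speech service's algorithm: `PHRASES` and `apply_spoken_punctuation` are hypothetical names, and the phrase list covers only the four rows of the table. It also converts every occurrence of a phrase, so a word such as "period" used in its ordinary sense would be rewritten too.

```python
# Toy illustration only. Each entry: (spoken tokens, symbol, how it attaches).
#   "left" - attaches to the previous word (., ?, ...)
#   "both" - attaches to the previous and the next word (/)
#   "word" - behaves like an ordinary word (<3)
PHRASES = [
    (("dot", "dot", "dot"), "...", "left"),
    (("forward", "slash"), "/", "both"),
    (("question", "mark"), "?", "left"),
    (("heart", "emoji"), "<3", "word"),
    (("period",), ".", "left"),
]


def apply_spoken_punctuation(lexical: str) -> str:
    """Replace spoken punctuation phrases with symbols and capitalize the start."""
    tokens = lexical.split()
    out = ""
    glue_next = False  # True when the next piece must follow without a space
    i = 0
    while i < len(tokens):
        # Look for a known phrase starting at position i.
        match = next(
            (p for p in PHRASES if tuple(tokens[i:i + len(p[0])]) == p[0]),
            None,
        )
        if match:
            phrase, symbol, mode = match
            i += len(phrase)
        else:
            symbol, mode = tokens[i], "word"
            i += 1
        attach_left = mode in ("left", "both") or glue_next
        out += symbol if (attach_left or not out) else " " + symbol
        glue_next = mode == "both"
    return out[:1].upper() + out[1:]


print(apply_spoken_punctuation("are you sure question mark"))
```

In the real service this behavior comes from dictation mode rather than from client-side string handling.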

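To recognize explicit punctuation during continuous recognition, use the Speech SDK to enable dictation mode. This mode causes the speech configuration instance to interpret word descriptions of sentence structures, such as punctuation.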

Language	Code
C#	`speechConfig.EnableDictation();`
C++	`speechConfig->EnableDictation();`
Go	`speechConfig.EnableDictation()`
Java	`speechConfig.enableDictation();`
JavaScript	`speechConfig.enableDictation();`
Objective-C	`[self.speechConfig enableDictation];`
Swift	`self.speechConfig!.enableDictation()`
Python	`speech_config.enable_dictation()`
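
For context, here is one way the Python call might fit into a continuous recognition session. It is a minimal sketch that assumes the `azure-cognitiveservices-speech` package is installed and that you supply your own key and region. The function names `build_dictation_recognizer` and `transcribe_until_enter` are hypothetical helpers, not part of the SDK.

```python
def build_dictation_recognizer(key: str, region: str):
    """Return a speech recognizer whose configuration has dictation mode enabled."""
    # Imported lazily so this module still loads where the Speech SDK is absent.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    # Interpret spoken descriptions such as "question mark" as punctuation.
    speech_config.enable_dictation()
    # With no audio config, the recognizer reads from the default microphone.
    return speechsdk.SpeechRecognizer(speech_config=speech_config)


def transcribe_until_enter(recognizer) -> None:
    """Print each finalized phrase until the user presses Enter."""
    recognizer.recognized.connect(lambda evt: print(evt.result.text))
    recognizer.start_continuous_recognition()
    input("Speak, then press Enter to stop.\n")
    recognizer.stop_continuous_recognition()
```

To try it, call `build_dictation_recognizer` with a real key and region and pass the result to `transcribe_until_enter`. Dictation mode is enabled on the configuration before the recognizer is created.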

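Profanity filter

You can specify whether to mask, remove, or show profanity in the final transcribed text. Masking replaces profane words with asterisk (*) characters, which keeps the original sentiment of the text while making it more appropriate for certain situations.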

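Note

Microsoft also reserves the right to mask or remove any word that is deemed inappropriate. The Speech service doesn't return such words, whether or not you enable the profanity filter.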

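The profanity filter options are:

Masked: Replaces letters in profane words with asterisk (*) characters. Masked is the default option.
Raw: Includes the profane words verbatim.
Removed: Removes profane words.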
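
The effect of the three options can be pictured with a small local imitation. This is a sketch of the described behavior, not the service's implementation: `ProfanityMode`, `BLOCKLIST`, and `filter_profanity` are hypothetical names, and the two-word blocklist is a mild placeholder for whatever vocabulary the service actually uses.

```python
import re
from enum import Enum


class ProfanityMode(Enum):
    """Hypothetical stand-in for the SDK's profanity options."""
    MASKED = "masked"
    RAW = "raw"
    REMOVED = "removed"


# Placeholder vocabulary for the sketch; the service's list is not public here.
BLOCKLIST = {"darn", "heck"}


def filter_profanity(text: str, mode: ProfanityMode = ProfanityMode.MASKED) -> str:
    """Mask, keep, or remove blocklisted words according to the mode."""
    if mode is ProfanityMode.RAW:
        return text

    def replace(match: re.Match) -> str:
        word = match.group(0)
        if word.lower() not in BLOCKLIST:
            return word
        # Masked keeps the word length; Removed drops the word entirely.
        return "*" * len(word) if mode is ProfanityMode.MASKED else ""

    filtered = re.sub(r"[A-Za-z']+", replace, text)
    if mode is ProfanityMode.REMOVED:
        # Tidy the gaps that removal leaves behind.
        filtered = re.sub(r"\s+", " ", filtered).strip()
        filtered = re.sub(r"\s+([.,!?])", r"\1", filtered)
    return filtered


print(filter_profanity("well darn it"))
print(filter_profanity("well darn it", ProfanityMode.REMOVED))
```

Masked is the default in this sketch, mirroring the default option in the list above.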

For example, to remove profane words from the speech recognition result, set the profanity filter to Removed as shown here:

Language	Code
C#	`speechConfig.SetProfanity(ProfanityOption.Removed);`
C++	`speechConfig->SetProfanity(ProfanityOption::Removed);`
Go	`speechConfig.SetProfanity(common.Removed)`
Java	`speechConfig.setProfanity(ProfanityOption.Removed);`
JavaScript	`speechConfig.setProfanity(sdk.ProfanityOption.Removed);`
Objective-C	`[self.speechConfig setProfanityOptionTo:SPXSpeechConfigProfanityOption_ProfanityRemoved];`
Swift	`self.speechConfig!.setProfanityOptionTo(SPXSpeechConfigProfanityOption_ProfanityRemoved)`
Python	`speech_config.set_profanity(speechsdk.ProfanityOption.Removed)`
Speech CLI	`spx recognize --file caption.this.mp4 --format any --profanity masked --output vtt file - --output srt file -`

The Speech CLI example uses the masked option; the SDK examples use Removed.

The profanity filter is applied to the result Text and MaskedNormalizedForm properties. It isn't applied to the result LexicalForm and NormalizedForm properties, and it isn't applied to word-level results.
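
To see the four forms side by side, you can inspect a detailed recognition result. The sketch below rests on several assumptions: that the detailed JSON carries an `NBest` list whose entries use the field names `Lexical`, `ITN`, `MaskedITN`, and `Display`, and that these correspond to LexicalForm, NormalizedForm, MaskedNormalizedForm, and Text. The payload is a hand-written illustration built from an example earlier in this article, not captured service output, and `FIELD_FOR_PROPERTY`, `FILTERED_PROPERTIES`, and `best_forms` are hypothetical names.

```python
import json

# Hand-written illustration, not real service output.
SAMPLE_DETAILED_JSON = """
{
  "NBest": [
    {
      "Lexical": "that will cost nine hundred dollars",
      "ITN": "that will cost $900",
      "MaskedITN": "that will cost $900",
      "Display": "That will cost $900."
    }
  ]
}
"""

# Assumed mapping from result property names to JSON field names.
FIELD_FOR_PROPERTY = {
    "LexicalForm": "Lexical",
    "NormalizedForm": "ITN",
    "MaskedNormalizedForm": "MaskedITN",
    "Text": "Display",
}

# Properties the article says the profanity filter applies to.
FILTERED_PROPERTIES = {"Text", "MaskedNormalizedForm"}


def best_forms(payload: str) -> dict:
    """Return the four text forms of the top hypothesis, keyed by property name."""
    best = json.loads(payload)["NBest"][0]
    return {prop: best[field] for prop, field in FIELD_FOR_PROPERTY.items()}


for prop, text in best_forms(SAMPLE_DETAILED_JSON).items():
    note = "profanity filter applies" if prop in FILTERED_PROPERTIES else "unfiltered"
    print(f"{prop}: {text} ({note})")
```

If your application must never surface unfiltered words, read only the two filtered forms.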
