Test a model using an audio file in Speech Studio

In this how-to, you use Speech Studio to convert speech from an audio file to text. Speech Studio lets you test, compare, improve, and deploy speech recognition models using related text, audio with human-labeled transcripts, and pronunciation guidance that you provide.

Prerequisites

Before you use Speech Studio, follow these instructions to create an Azure account and subscribe to the Speech service. This unified subscription gives you access to speech-to-text, text-to-speech, speech translation, and the Custom Speech portal.

Download an audio file

Follow these steps to download an audio file that contains speech and package it into a zip file.

  1. Download the sample wav file from this link by right-clicking the link and selecting Save link as. Click Save to download the whatstheweatherlike.wav file.
  2. Using a file explorer or a terminal window with a zip tool, create a zip file named whatstheweatherlike.zip that contains the whatstheweatherlike.wav file you downloaded. In Windows, you can open Windows Explorer, navigate to the Downloads folder, right-click whatstheweatherlike.wav, click Send to, click Compressed (zipped) folder, and press Enter to accept the default filename.
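If you prefer a script to a file explorer, the packaging step can be sketched in Python with the standard zipfile module. This is a minimal sketch, not part of the Speech Studio workflow itself: the helper name make_zip and the Downloads location are assumptions chosen for illustration.

```python
from __future__ import annotations

import zipfile
from pathlib import Path


def make_zip(wav_path, zip_path) -> list[str]:
    """Package one audio file into a zip archive and return the archive's entry names."""
    wav = Path(wav_path)
    if not wav.is_file():
        raise FileNotFoundError(f"audio file not found: {wav}")
    # Store the file at the archive root, without any parent folders.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.write(wav, arcname=wav.name)
    # Re-open the archive to confirm what it contains.
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the browser saved the file.
    source = Path.home() / "Downloads" / "whatstheweatherlike.wav"
    if source.is_file():
        print(make_zip(source, source.with_suffix(".zip")))
    else:
        print(f"Download the sample first: {source} not found")
```

Writing the entry with arcname set to the bare file name keeps the wav file at the top level of the archive, which matches what the Send to option in Windows produces.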

Create a project in the Custom Speech portal

Follow these steps to create a project that contains the zip file with your single audio file.

  1. Open Speech Studio, and click New project. Type a name for this project, and click Create. Your project appears in the Custom Speech list.
  2. Click the name of your project. In the Data tab, click Upload data.
  3. The speech data type defaults to Audio only, so click Next.
  4. Name your new speech dataset MyZipOfAudio, and click Next.
  5. Click Browse files..., navigate to your whatstheweatherlike.zip file, and click Open.
  6. Click the Upload button. The browser uploads your zip file to Speech Studio, and Speech Studio processes the contents.

Test a model

After Speech Studio processes the contents of your zip file, you can play the source audio while examining the transcription to look for errors or omissions. Follow these steps to examine transcription quality in the browser.

  1. Click the Testing tab, and click Add test.
  2. This test inspects the quality of audio-only data, so click Next to accept this test type.
  3. Name this test MyModelTest, and click Next.
  4. Click the radio button to the left of MyZipOfAudio, and click Next.
  5. The Model 1 dropdown defaults to the latest recognition model, so click Create. After Speech Studio processes the contents of your audio dataset, the test status changes to Succeeded.
  6. Click MyModelTest. The results of speech recognition appear. Click the right-pointing triangle within the circle to hear the audio, and compare what you hear to the text next to the circle.

Download detailed results

You can download files that describe the transcriptions in much greater detail. The files include the lexical form of the speech in your audio files, and JSON files that contain offset, duration, and transcription confidence details about each word. Follow these steps to see these files.

  1. Click Download.
  2. In the Download dialog, deselect Audio, and click Download.
  3. Unzip the downloaded zip file, and examine the extracted files.
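The unzip-and-examine step can also be scripted. The following is a minimal sketch using only the Python standard library. It makes no assumption about the layout of the downloaded JSON files: it lists the extracted files and reports only the top-level keys of each JSON file so that you can see where the offset, duration, and confidence details live. The function names and the results.zip file name are hypothetical.

```python
from __future__ import annotations

import json
import zipfile
from pathlib import Path


def extract_results(zip_path, dest_dir) -> list[Path]:
    """Extract the downloaded archive and return the extracted file paths, sorted."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(p for p in dest.rglob("*") if p.is_file())


def summarize_json(path) -> list[str]:
    """Return the sorted top-level keys of a JSON file, or [] if the top level is not an object."""
    # utf-8-sig reads UTF-8 files whether or not they start with a byte order mark.
    with open(path, encoding="utf-8-sig") as handle:
        data = json.load(handle)
    return sorted(data) if isinstance(data, dict) else []


if __name__ == "__main__":
    archive = Path("results.zip")  # hypothetical name; use the file your browser saved
    if archive.is_file():
        for extracted in extract_results(archive, "results"):
            if extracted.suffix.lower() == ".json":
                print(extracted.name, summarize_json(extracted))
            else:
                print(extracted.name)
    else:
        print(f"{archive} not found")
```

Printing only the top-level keys first is a deliberate choice: it lets you confirm the actual structure of the files before writing code that depends on specific field names.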

Next steps

Learn how to improve the accuracy of speech recognition by training a custom model.