Quickstart: Recognize speech in Swift on macOS using the Speech SDK

Quickstarts are also available for speech synthesis.

In this article, you learn how to create a macOS app in Swift that uses the Cognitive Services Speech SDK to transcribe speech recorded from a microphone to text.

Prerequisites

Before you get started, here's a list of prerequisites:

Get the Speech SDK for macOS

Important

By downloading any of the Azure Cognitive Services Speech SDKs, you acknowledge its license. For more information, see:

The Cognitive Services Speech SDK for macOS is distributed as a framework bundle. It can be used in Xcode projects as a CocoaPod, or downloaded from https://aka.ms/csspeech/macosbinary and linked manually. This guide uses a CocoaPod.

Create an Xcode project

Start Xcode, and start a new project by clicking File > New > Project. In the template selection dialog, choose the "Cocoa App" template.

In the dialogs that follow, make the following selections:

  1. Project Options dialog
    1. Enter a name for the quickstart app, for example helloworld.
    2. Enter an appropriate organization name and an organization identifier, if you already have an Apple developer account. For testing purposes, you can pick any name, like testorg. To sign the app, you need a proper provisioning profile. Refer to the Apple developer site for details.
    3. Make sure Swift is chosen as the language for the project.
    4. Disable the checkboxes to use storyboards and to create a document-based application. The simple UI for the sample app will be created programmatically.
    5. Disable all checkboxes for tests and core data.
  2. Select a project directory
    1. Choose a directory to put the project in. This creates a helloworld directory in the chosen directory that contains all the files for the Xcode project.
    2. Disable the creation of a Git repo for this example project.
  3. Set the entitlements for network and microphone access. Click the app name in the first line in the overview on the left to get to the app configuration, and then choose the "Capabilities" tab.
    1. Enable the "App Sandbox" setting for the app.
    2. Enable the checkboxes for "Outgoing Connections" and "Microphone" access.
  4. The app also needs to declare use of the microphone in the Info.plist file. Click on the file in the overview, and add the "Privacy - Microphone Usage Description" key with a value like "Microphone is needed for speech recognition".
  5. Close the Xcode project. You will use a different instance of it later, after setting up the CocoaPods.
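For reference, the "Privacy - Microphone Usage Description" entry from step 4 corresponds to the raw key NSMicrophoneUsageDescription. If you open Info.plist as source code, the added entry looks like the following sketch (the description string is just an example; use wording appropriate for your app):

```xml
<key>NSMicrophoneUsageDescription</key>
<string>Microphone is needed for speech recognition</string>
```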

Add the sample code

  1. Place a new header file with the name MicrosoftCognitiveServicesSpeech-Bridging-Header.h into the helloworld directory inside the helloworld project, and paste the following code into it:

    #ifndef MicrosoftCognitiveServicesSpeech_Bridging_Header_h
    #define MicrosoftCognitiveServicesSpeech_Bridging_Header_h
    
    #import <MicrosoftCognitiveServicesSpeech/SPXSpeechAPI.h>
    
    #endif /* MicrosoftCognitiveServicesSpeech_Bridging_Header_h */
    
  2. In the Swift project settings for the helloworld target, set the "Objective-C Bridging Header" field to the relative path of the bridging header: helloworld/MicrosoftCognitiveServicesSpeech-Bridging-Header.h.

  3. Replace the contents of the autogenerated AppDelegate.swift file with the following:

    import Cocoa
    
    @NSApplicationMain
    class AppDelegate: NSObject, NSApplicationDelegate {
        var label: NSTextField!
        var fromMicButton: NSButton!
    
        var host: String!
        var sub: String!
    
        @IBOutlet weak var window: NSWindow!
    
        func applicationDidFinishLaunching(_ aNotification: Notification) {
            print("loading")
            // load subscription information
    
            host = "wss://YourServiceRegion.stt.speech.azure.cn/"
            sub = "YourSubscriptionKey"
    
            label = NSTextField(frame: NSRect(x: 100, y: 50, width: 200, height: 200))
            label.textColor = NSColor.black
            label.lineBreakMode = .byWordWrapping
    
            label.stringValue = "Recognition Result"
            label.isEditable = false
    
            self.window.contentView?.addSubview(label)
    
            fromMicButton = NSButton(frame: NSRect(x: 100, y: 300, width: 200, height: 30))
            fromMicButton.title = "Recognize"
            fromMicButton.target = self
            fromMicButton.action = #selector(fromMicButtonClicked)
            self.window.contentView?.addSubview(fromMicButton)
        }
    
        @objc func fromMicButtonClicked() {
            DispatchQueue.global(qos: .userInitiated).async {
                self.recognizeFromMic()
            }
        }
    
        func recognizeFromMic() {
            var speechConfig: SPXSpeechConfiguration?
            do {
                try speechConfig = SPXSpeechConfiguration(host: host, subscription: sub)
            } catch {
                print("error \(error) happened")
                speechConfig = nil
            }
            speechConfig?.speechRecognitionLanguage = "en-US"
            let audioConfig = SPXAudioConfiguration()
    
            let reco = try! SPXSpeechRecognizer(speechConfiguration: speechConfig!, audioConfiguration: audioConfig)
    
            reco.addRecognizingEventHandler() {reco, evt in
                print("intermediate recognition result: \(evt.result.text ?? "(no result)")")
                self.updateLabel(text: evt.result.text, color: .gray)
            }
    
            updateLabel(text: "Listening ...", color: .gray)
            print("Listening...")
    
            let result = try! reco.recognizeOnce()
            print("recognition result: \(result.text ?? "(no result)"), reason: \(result.reason.rawValue)")
            updateLabel(text: result.text, color: .black)
    
            if result.reason != SPXResultReason.recognizedSpeech {
                let cancellationDetails = try! SPXCancellationDetails(fromCanceledRecognitionResult: result)
                print("cancelled: \(result.reason), \(cancellationDetails.errorDetails)")
                updateLabel(text: "Error: \(cancellationDetails.errorDetails)", color: .red)
            }
        }
    
        func updateLabel(text: String?, color: NSColor) {
            DispatchQueue.main.async {
                // Avoid force-unwrapping: the result text can be nil, for example on a canceled recognition.
                self.label.stringValue = text ?? ""
                self.label.textColor = color
            }
        }
    }
    
  4. In AppDelegate.swift, replace the string YourSubscriptionKey with your subscription key.

  5. Replace the string YourServiceRegion with the region associated with your subscription (for example, chinaeast2 for a trial subscription).

Install the SDK as a CocoaPod

  1. Install the CocoaPod dependency manager as described in its installation instructions.

  2. Navigate to the directory of your sample app (helloworld). Place a text file with the name Podfile and the following content in that directory:

    target 'helloworld' do
        platform :osx, 10.14
        pod 'MicrosoftCognitiveServicesSpeech-macOS', '~> 1.6'
        use_frameworks!
    end
    
  3. Navigate to the helloworld directory in a terminal and run the command pod install. This generates a helloworld.xcworkspace Xcode workspace that contains both the sample app and the Speech SDK as a dependency. You'll use this workspace in the following steps.

Build and run the sample

  1. Open the helloworld.xcworkspace workspace in Xcode.
  2. Make the debug output visible (View > Debug Area > Activate Console).
  3. Build and run the example code by selecting Product > Run from the menu or clicking the Play button.
  4. After you click the "Recognize" button in the app and say a few words, you should see the text you spoke in the lower part of the app window.

Next steps