Quickstart: Use the Read client library or REST API

Get started with the Read REST API or client libraries. The Read service provides you with AI algorithms for extracting visible text from images and returning it as structured strings. Follow these steps to install a package to your application and try out the sample code for basic tasks.

Use the OCR client library to read printed and handwritten text from an image.

Reference documentation | Library source code | Package (NuGet) | Samples

Prerequisites

  • An Azure subscription - Create one for trial
  • The Visual Studio IDE or current version of .NET Core.
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Create a new C# application

Using Visual Studio, create a new .NET Core application.

Install the client library

Once you've created a new project, install the client library by right-clicking on the project solution in the Solution Explorer and selecting Manage NuGet Packages. In the package manager that opens, select Browse, check Include prerelease, and search for Microsoft.Azure.CognitiveServices.Vision.ComputerVision. Select version 6.0.0-preview.1, and then Install.

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

From the project directory, open the Program.cs file in your preferred editor or IDE.

Find the subscription key and endpoint

Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your subscription key and endpoint in the resource's key and endpoint page, under resource management.

In the application's Program class, create variables for your Computer Vision subscription key and endpoint. Paste your subscription key and endpoint into the following code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.cn/.

using System;
using System.Collections.Generic;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;
using System.Threading.Tasks;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using System.Threading;
using System.Linq;

namespace ComputerVisionQuickstart
{
    class Program
    {
        // Add your Computer Vision subscription key and endpoint
        static string subscriptionKey = "PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE";
        static string endpoint = "PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE";

Important

Remember to remove the subscription key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.

In the application's Main method, add calls for the methods used in this quickstart. You will create these later.

// Create a client
ComputerVisionClient client = Authenticate(endpoint, subscriptionKey);
// Extract text (OCR) from a URL image using the Read API
ReadFileUrl(client, READ_TEXT_URL_IMAGE).Wait();

Object model

The following classes and interfaces handle some of the major features of the OCR .NET SDK.

Name | Description
ComputerVisionClient | This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to do most image operations.
ComputerVisionClientExtensions | This class contains additional methods for the ComputerVisionClient.

Code examples

These code snippets show you how to do the following tasks with the OCR client library for .NET:

Authenticate the client

In a new method in the Program class, instantiate a client with your endpoint and subscription key. Create an ApiKeyServiceClientCredentials object with your subscription key, and use it with your endpoint to create a ComputerVisionClient object.

/*
 * AUTHENTICATE
 * Creates a Computer Vision client used by each example.
 */
public static ComputerVisionClient Authenticate(string endpoint, string key)
{
    ComputerVisionClient client =
      new ComputerVisionClient(new ApiKeyServiceClientCredentials(key))
      { Endpoint = endpoint };
    return client;
}

Read printed and handwritten text

The OCR service can read visible text in an image and convert it to a character stream. The code in this section uses the latest Computer Vision SDK release for Read 3.0 and defines a method, ReadFileUrl, which uses the client object to detect and extract text in the image.

Tip

You can also extract text from a local image. See the ComputerVisionClient methods, such as ReadInStreamAsync. Or, see the sample code on GitHub for scenarios involving local images.

Set up test image

In your Program class, save a reference to the URL of the image you want to extract text from. This snippet points to a sample image hosted online.

private const string READ_TEXT_URL_IMAGE = "https://intelligentkioskstore.blob.core.windows.net/visionapi/suggestedphotos/3.png";

Call the Read API

Define the new method for reading text. Add the code below, which calls the ReadAsync method for the given image. This returns an operation ID and starts an asynchronous process to read the content of the image.

/*
 * READ FILE - URL 
 * Extracts text. 
 */
public static async Task ReadFileUrl(ComputerVisionClient client, string urlFile)
{
    Console.WriteLine("----------------------------------------------------------");
    Console.WriteLine("READ FILE FROM URL");
    Console.WriteLine();

    // Read text from URL
    var textHeaders = await client.ReadAsync(urlFile, language: "en");
    // After the request, get the operation location (operation ID)
    string operationLocation = textHeaders.OperationLocation;
    // Give the service a moment to start; Task.Delay waits without blocking the thread
    await Task.Delay(2000);

Get Read results

Next, get the operation ID returned from the ReadAsync call, and use it to query the service for operation results. The following code checks the operation until the results are returned. It then prints the extracted text data to the console.

// Retrieve the URI where the extracted text will be stored from the Operation-Location header.
// We only need the ID and not the full URL
const int numberOfCharsInOperationId = 36;
string operationId = operationLocation.Substring(operationLocation.Length - numberOfCharsInOperationId);

// Extract the text
ReadOperationResult results;
Console.WriteLine($"Extracting text from URL file {Path.GetFileName(urlFile)}...");
Console.WriteLine();
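do
{
    // Wait briefly between polls so the loop doesn't send requests back to back
    await Task.Delay(1000);
    results = await client.GetReadResultAsync(Guid.Parse(operationId));
}
while ((results.Status == OperationStatusCodes.Running ||
    results.Status == OperationStatusCodes.NotStarted));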

Display Read results

Add the following code to parse and display the retrieved text data, and finish the method definition.

// Display the found text.
    Console.WriteLine();
    var textUrlFileResults = results.AnalyzeResult.ReadResults;
    foreach (ReadResult page in textUrlFileResults)
    {
        foreach (Line line in page.Lines)
        {
            Console.WriteLine(line.Text);
        }
    }
    Console.WriteLine();
}

Run the application

Run the application by clicking the Debug button at the top of the IDE window.

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

  • The source code for this sample can be found on GitHub.

Use the Optical character recognition client library to read printed and handwritten text with the Read API.

Reference documentation | Library source code | Package (PyPI) | Samples

Prerequisites

  • An Azure subscription - Create one for trial
  • Python 3.x
    • Your Python installation should include pip. You can check if you have pip installed by running pip --version on the command line. Get pip by installing the latest version of Python.
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Install the client library

You can install the client library with:

pip install --upgrade azure-cognitiveservices-vision-computervision

Also install the Pillow library.

pip install pillow

Create a new Python application

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Create a new Python file, for example quickstart-file.py. Then open it in your preferred editor or IDE.

Find the subscription key and endpoint

Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your subscription key and endpoint in the resource's key and endpoint page, under resource management.

Create variables for your Computer Vision subscription key and endpoint. Paste your subscription key and endpoint into the following code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import OperationStatusCodes
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials

from array import array
import os
from PIL import Image
import sys
import time

'''
Authenticate
Authenticates your credentials and creates a client.
'''
subscription_key = "PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE"
endpoint = "PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE"

Important

Remember to remove the subscription key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.
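One way to keep the key out of the source file is to read it from environment variables. The following is a minimal sketch, not part of the official sample: the variable names COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT and the helper load_credentials are assumptions chosen for illustration.

```python
import os

# Hypothetical variable names; use whatever naming your deployment prefers.
KEY_VAR = "COMPUTER_VISION_SUBSCRIPTION_KEY"
ENDPOINT_VAR = "COMPUTER_VISION_ENDPOINT"


def load_credentials(environ=os.environ):
    """Return (subscription_key, endpoint) read from environment variables."""
    # Collect every missing name so the error message lists them all at once.
    missing = [name for name in (KEY_VAR, ENDPOINT_VAR) if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    endpoint = environ[ENDPOINT_VAR]
    # The quickstart describes the endpoint as an https:// URL.
    if not endpoint.startswith("https://"):
        raise ValueError("Endpoint should be an https:// URL, got: " + endpoint)
    return environ[KEY_VAR], endpoint
```

With both variables set in your shell, `subscription_key, endpoint = load_credentials()` can stand in for the two hard-coded assignments.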

Object model

The following classes and interfaces handle some of the major features of the OCR Python SDK.

Name | Description
ComputerVisionClientOperationsMixin | This class directly handles all of the image operations, such as image analysis, text detection, and thumbnail generation.
ComputerVisionClient | This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to produce instances of other classes. It implements ComputerVisionClientOperationsMixin.
VisualFeatureTypes | This enum defines the different types of image analysis that can be done in a standard Analyze operation. You specify a set of VisualFeatureTypes values depending on your needs.

Code examples

These code snippets show you how to do the following tasks with the OCR client library for Python:

Authenticate the client

Instantiate a client with your endpoint and key. Create a CognitiveServicesCredentials object with your key, and use it with your endpoint to create a ComputerVisionClient object.

computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))

Read printed and handwritten text

The OCR service can read visible text in an image and convert it to a character stream. You do this in two parts.

Call the Read API

First, use the following code to call the read method for the given image. This returns an operation ID and starts an asynchronous process to read the content of the image.

'''
Batch Read File - remote
This example extracts the text in a remote image, then prints the results line by line.
The Read API handles both printed and handwritten text.
'''
print("===== Batch Read File - remote =====")
# Get an image with handwritten text
remote_image_handw_text_url = "https://raw.githubusercontent.com/MicrosoftDocs/azure-docs/master/articles/cognitive-services/Computer-vision/Images/readsample.jpg"

# Call API with URL and raw response (allows you to get the operation location)
recognize_handw_results = computervision_client.read(remote_image_handw_text_url,  raw=True)

Tip

You can also read text from a local image. See the ComputerVisionClientOperationsMixin methods, such as read_in_stream. Or, see the sample code on GitHub for scenarios involving local images.

Get Read results

Next, get the operation ID returned from the read call, and use it to query the service for operation results. The following code checks the operation at one-second intervals until the results are returned. It then prints the extracted text data to the console.

# Get the operation location (URL with an ID at the end) from the response
operation_location_remote = recognize_handw_results.headers["Operation-Location"]
# Grab the ID from the URL
operation_id = operation_location_remote.split("/")[-1]

# Call the "GET" API and wait for it to retrieve the results 
while True:
    get_handw_text_results = computervision_client.get_read_result(operation_id)
    if get_handw_text_results.status not in ['notStarted', 'running']:
        break
    time.sleep(1)

# Print the detected text, line by line
if get_handw_text_results.status == OperationStatusCodes.succeeded:
    for text_result in get_handw_text_results.analyze_result.read_results:
        for line in text_result.lines:
            print(line.text)
            print(line.bounding_box)
print()
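The loop above runs until the service reports a final status, so it would spin forever if the operation never left the running state. The sketch below shows one way to bound the wait. The helper names operation_id_from_location and poll_read_result, and the interval and timeout defaults, are assumptions for illustration and are not part of the SDK.

```python
import time


def operation_id_from_location(operation_location):
    """Return the last path segment of an Operation-Location URL."""
    return operation_location.rstrip("/").split("/")[-1]


def poll_read_result(get_result, operation_id, interval=1.0, timeout=60.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call get_result(operation_id) until the status is final or time runs out.

    get_result is any callable returning an object with a `status` attribute,
    which may be a plain string or an enum whose value is that string.
    """
    deadline = clock() + timeout
    while True:
        result = get_result(operation_id)
        # Normalise enum members and plain strings to one lowercase string.
        status = str(getattr(result.status, "value", result.status)).lower()
        if status not in ("notstarted", "running"):
            return result
        if clock() >= deadline:
            raise TimeoutError("Read operation did not finish within the timeout")
        sleep(interval)
```

In the quickstart script this could be called as `poll_read_result(computervision_client.get_read_result, operation_id)`. The sleep and clock parameters exist only so the helper can be exercised without real waiting.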

Run the application

Run the application with the python command on your quickstart file.

python quickstart-file.py
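When the script runs, each recognized line is followed by its bounding box. Assuming the box is a flat list of eight numbers (the x and y coordinates of the four corners of the text's quadrilateral), a small helper can reduce it to an axis-aligned rectangle. The function name bounding_rectangle is hypothetical and the coordinates in the usage note are made up.

```python
def bounding_rectangle(bounding_box):
    """Return (left, top, right, bottom) enclosing a flat [x1, y1, ..., x4, y4] list."""
    if len(bounding_box) != 8:
        raise ValueError("expected 8 numbers (four x, y corner pairs)")
    xs = bounding_box[0::2]  # every even index is an x coordinate
    ys = bounding_box[1::2]  # every odd index is a y coordinate
    return min(xs), min(ys), max(xs), max(ys)
```

For example, `bounding_rectangle([10, 20, 110, 25, 108, 60, 8, 55])` returns `(8, 20, 110, 60)`.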

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to use the OCR library for Python to do basic tasks. Next, explore the reference documentation to learn more about the library.

  • The source code for this sample can be found on GitHub.

Use the Optical character recognition client library to read printed and handwritten text in images.

Reference documentation | Library source code | Artifact (Maven) | Samples

Prerequisites

  • An Azure subscription - Create one for trial
  • The current version of the Java Development Kit (JDK)
  • The Gradle build tool, or another dependency manager.
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Create a new Gradle project

In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

mkdir myapp && cd myapp

Run the gradle init command from your working directory. This command will create essential build files for Gradle, including build.gradle.kts, which Gradle uses to build and configure your application.

gradle init --type basic

When prompted to choose a DSL, select Kotlin.

Install the client library

This quickstart uses the Gradle dependency manager. You can find the client library and information for other dependency managers on the Maven Central Repository.

Locate build.gradle.kts and open it with your preferred IDE or text editor. Then copy in the following build configuration. This configuration defines the project as a Java application whose entry point is the class ComputerVisionQuickstarts. It imports the Computer Vision library.

plugins {
    java
    application
}
application { 
    mainClassName = "ComputerVisionQuickstarts"
}
repositories {
    mavenCentral()
}
dependencies {
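    // "implementation" replaces the "compile" configuration, which newer Gradle versions no longer accept
    implementation(group = "com.microsoft.azure.cognitiveservices", name = "azure-cognitiveservices-computervision", version = "1.0.4-beta")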
}

Create a Java file

From your working directory, run the following command to create a project source folder:

mkdir -p src/main/java

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Navigate to the new folder and create a file called ComputerVisionQuickstarts.java. Open it in your preferred editor or IDE.

Find the subscription key and endpoint

Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your subscription key and endpoint in the resource's key and endpoint page, under resource management.

Define the class ComputerVisionQuickstarts. Create variables for your Computer Vision subscription key and endpoint. Paste your subscription key and endpoint into the following code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

import com.microsoft.azure.cognitiveservices.vision.computervision.*;
import com.microsoft.azure.cognitiveservices.vision.computervision.implementation.ComputerVisionImpl;
import com.microsoft.azure.cognitiveservices.vision.computervision.models.*;

import java.io.*;
import java.nio.file.Files;

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class ComputerVisionQuickstarts {

    static String subscriptionKey = "PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE";
    static String endpoint = "PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE";

Important

Remember to remove the subscription key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.

In the application's main method, add calls for the methods used in this quickstart. You'll define these later.

public static void main(String[] args) {

    System.out.println("\nAzure Cognitive Services Computer Vision - Java Quickstart Sample");

    // Create an authenticated Computer Vision client.
    ComputerVisionClient compVisClient = Authenticate(subscriptionKey, endpoint);

    // Read from local file
    ReadFromFile(compVisClient);
}

Object model

The following classes and interfaces handle some of the major features of the OCR Java SDK.

Name | Description
ComputerVisionClient | This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to produce instances of other classes.

Code examples

These code snippets show you how to do the following tasks with the OCR client library for Java:

Authenticate the client

In a new method, instantiate a ComputerVisionClient object with your endpoint and key.

public static ComputerVisionClient Authenticate(String subscriptionKey, String endpoint){
    return ComputerVisionManager.authenticate(subscriptionKey).withEndpoint(endpoint);
}

Read printed and handwritten text

The OCR service can read visible text in an image and convert it to a character stream. This section defines a method, ReadFromFile, that reads an image from a local file path and prints the image's text to the console.

Tip

You can also read text in a remote image referenced by URL. See the ComputerVision methods, such as read. Or, see the sample code on GitHub for scenarios involving remote images.

Set up test image

Create a resources/ folder in the src/main/ folder of your project, and add an image you'd like to read text from. You can download a sample image to use here.

Then add the following method definition to your ComputerVisionQuickstarts class. Change the value of the localFilePath to match your image file.

/**
 * READ : Performs a Read Operation on a local image.
 * The image path is set in the localFilePath variable inside the method.
 * @param client instantiated vision client
 */
private static void ReadFromFile(ComputerVisionClient client) {
    System.out.println("-----------------------------------------------");
    
    String localFilePath = "src\\main\\resources\\myImage.png";
    System.out.println("Read with local file: " + localFilePath);

Call the Read API

Then, add the following code to call the readInStreamWithServiceResponseAsync method for the given image.

try {
    File rawImage = new File(localFilePath);
    byte[] localImageBytes = Files.readAllBytes(rawImage.toPath());

    // Cast Computer Vision to its implementation to expose the required methods
    ComputerVisionImpl vision = (ComputerVisionImpl) client.computerVision();

    // Send the local image bytes and capture the response headers
    ReadInStreamHeaders responseHeader =
            vision.readInStreamWithServiceResponseAsync(localImageBytes, OcrDetectionLanguage.FR)
                .toBlocking()
                .single()
                .headers();

The following block of code extracts the operation ID from the response of the Read call. It uses this ID with a helper method to print the text read results to the console.

// Extract the operationLocation from the response header
String operationLocation = responseHeader.operationLocation();
System.out.println("Operation Location:" + operationLocation);

getAndPrintReadResult(vision, operationLocation);

Close out the try/catch block and the method definition.

} catch (Exception e) {
        System.out.println(e.getMessage());
        e.printStackTrace();
    }
}

Get Read results

Then, add a definition for the helper method. This method uses the operation ID from the previous step to query the read operation and get OCR results when they're available.

/**
 * Polls for Read result and prints results to console
 * @param vision Computer Vision instance
 * @param operationLocation URL returned in the Operation-Location header of the POST Read response
 */
private static void getAndPrintReadResult(ComputerVision vision, String operationLocation) throws InterruptedException {
    System.out.println("Polling for Read results ...");

    // Extract OperationId from Operation Location
    String operationId = extractOperationIdFromOpLocation(operationLocation);

    boolean pollForResult = true;
    ReadOperationResult readResults = null;

    while (pollForResult) {
        // Poll for result every second
        Thread.sleep(1000);
        readResults = vision.getReadResult(UUID.fromString(operationId));

        // The results will no longer be null when the service has finished processing the request.
        if (readResults != null) {
            // Get request status
            OperationStatusCodes status = readResults.status();

            if (status == OperationStatusCodes.FAILED || status == OperationStatusCodes.SUCCEEDED) {
                pollForResult = false;
            }
        }
    }

The rest of the method parses the OCR results and prints them to the console.

// Print read results, page per page
    for (ReadResult pageResult : readResults.analyzeResult().readResults()) {
        System.out.println("");
        System.out.println("Printing Read results for page " + pageResult.page());
        StringBuilder builder = new StringBuilder();

        for (Line line : pageResult.lines()) {
            builder.append(line.text());
            builder.append("\n");
        }

        System.out.println(builder.toString());
    }
}

Finally, add the other helper method used above, which extracts the operation ID from the initial response.

/**
 * Extracts the OperationId from a Operation-Location returned by the POST Read operation
 * @param operationLocation
 * @return operationId
 */
private static String extractOperationIdFromOpLocation(String operationLocation) {
    if (operationLocation != null && !operationLocation.isEmpty()) {
        String[] splits = operationLocation.split("/");

        if (splits != null && splits.length > 0) {
            return splits[splits.length - 1];
        }
    }
    throw new IllegalStateException("Something went wrong: Couldn't extract the operation id from the operation location");
}

Run the application

You can build the app with:

gradle build

Run the application with the gradle run command:

gradle run

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to use the OCR Java library to do basic tasks. Next, explore the reference documentation to learn more about the library.

  • The source code for this sample can be found on GitHub.

Use the Optical character recognition client library to read printed and handwritten text with the Read API.

Reference documentation | Library source code | Package (npm) | Samples

Prerequisites

  • An Azure subscription - Create one for trial
  • The current version of Node.js
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Create a new Node.js application

In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

mkdir myapp && cd myapp

Run the npm init command to create a Node.js application with a package.json file.

npm init

Install the client library

Install the @azure/cognitiveservices-computervision npm package, along with @azure/ms-rest-js, which provides the ApiKeyCredentials class imported in the code below:

npm install @azure/cognitiveservices-computervision @azure/ms-rest-js

Also install the async module:

npm install async

Your app's package.json file will be updated with the dependencies.

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Create a new file, index.js, and open it in a text editor.

Find the subscription key and endpoint

Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your subscription key and endpoint on the resource's key and endpoint page, under resource management.

Create variables for your Computer Vision subscription key and endpoint. Paste your subscription key and endpoint into the following code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

'use strict';

const async = require('async');
const fs = require('fs');
const https = require('https');
const path = require("path");
const createReadStream = require('fs').createReadStream
const sleep = require('util').promisify(setTimeout);
const ComputerVisionClient = require('@azure/cognitiveservices-computervision').ComputerVisionClient;
const ApiKeyCredentials = require('@azure/ms-rest-js').ApiKeyCredentials;

/**
 * AUTHENTICATE
 * This single client is used for all examples.
 */
const key = 'PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE';
const endpoint = 'PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE';

Important

Remember to remove the subscription key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.
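One common way to keep the key out of the source file is to read it from environment variables. The following is a minimal sketch, not part of the official sample: the helper name loadCredentials and the variable names COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT are assumptions, so substitute whatever names you have set.

```javascript
'use strict';

// Hypothetical helper: reads the key and endpoint from an environment-like
// object (process.env by default) instead of hardcoding them in the script.
// The variable names are assumptions; use the names you have actually set.
function loadCredentials(env = process.env) {
  const key = env.COMPUTER_VISION_SUBSCRIPTION_KEY;
  const endpoint = env.COMPUTER_VISION_ENDPOINT;
  if (!key || !endpoint) {
    throw new Error('Set COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT first.');
  }
  return { key, endpoint };
}

// Example with an explicit object, so the sketch runs without real credentials.
const sample = loadCredentials({
  COMPUTER_VISION_SUBSCRIPTION_KEY: 'example-key',
  COMPUTER_VISION_ENDPOINT: 'https://example.cognitiveservices.azure.com/',
});
console.log(sample.endpoint);
```

In the quickstart script, the returned key and endpoint would take the place of the pasted string literals.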

Object model

The following classes and interfaces handle some of the major features of the OCR Node.js SDK.

Name | Description
ComputerVisionClient | This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to do most image operations.

Code examples

These code snippets show you how to do the following tasks with the OCR client library for Node.js:

  • Authenticate the client
  • Read printed and handwritten text

Authenticate the client

Instantiate a client with your endpoint and key. Create an ApiKeyCredentials object with your key, and use it together with your endpoint to create a ComputerVisionClient object.

const computerVisionClient = new ComputerVisionClient(
  new ApiKeyCredentials({ inHeader: { 'Ocp-Apim-Subscription-Key': key } }), endpoint);

Then, define a function computerVision and declare an async series with a primary function and a callback function. You will add your quickstart code into the primary function, and call computerVision at the bottom of the script. The rest of the code in this quickstart goes inside the computerVision function.

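function computerVision() {
  async.series([
    async function () {
      // The quickstart code goes here.
    },
    function () {
      return new Promise((resolve) => {
        resolve();
      })
    }
  ], (err) => {
    // Rethrow only when the series actually reported an error.
    if (err) { throw err; }
  });
}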

computerVision();
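As a design alternative, the same structure can be written with a plain async function instead of the async module. This is a minimal sketch under that assumption, not the layout the official sample uses; main and its placeholder return value are hypothetical.

```javascript
'use strict';

// Hypothetical alternative to the async.series wrapper: a plain async function.
async function main() {
  // The quickstart steps would go here; this placeholder only returns a marker.
  return 'done';
}

const finished = main().catch((err) => {
  // Report the failure and set a non-zero exit code instead of rethrowing.
  console.error(err);
  process.exitCode = 1;
});
finished.then((value) => console.log(value));
```

With this shape, await can be used directly inside main, and a single catch handles any error raised by the steps.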

Read printed and handwritten text

The OCR service can extract the visible text in an image and convert it to a character stream. This sample uses the Read operations.

Set up test images

Save references to the URLs of the images you want to extract text from.

// URL images containing printed and/or handwritten text. 
// The URL can point to image files (.jpg/.png/.bmp) or multi-page files (.pdf, .tiff).
const printedTextSampleURL = 'https://moderatorsampleimages.blob.core.windows.net/samples/sample2.jpg';
const multiLingualTextURL = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/MultiLingual.png';
const mixedMultiPagePDFURL = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/MultiPageHandwrittenForm.pdf';

Note

You can also read text from a local image. See the ComputerVisionClient methods, such as readInStream. Or, see the sample code on GitHub for scenarios involving local images.

Call the Read API

Define the following fields in your function to denote the Read call status values.

// Status strings returned from Read API. NOTE: CASING IS SIGNIFICANT.
// Before Read 3.0, these are "Succeeded" and "Failed"
const STATUS_SUCCEEDED = "succeeded";
const STATUS_FAILED = "failed"

Add the code below, which calls the readTextFromURL function for the given images.

// Recognize text in printed image from a URL
console.log('Read printed text from URL...', printedTextSampleURL.split('/').pop());
const printedResult = await readTextFromURL(computerVisionClient, printedTextSampleURL);
printRecText(printedResult);

// Recognize multi-lingual text in a PNG from a URL
console.log('\nRead printed multi-lingual text in a PNG from URL...', multiLingualTextURL.split('/').pop());
const multiLingualResult = await readTextFromURL(computerVisionClient, multiLingualTextURL);
printRecText(multiLingualResult);

// Recognize printed text and handwritten text in a PDF from a URL
console.log('\nRead printed and handwritten text from a PDF from URL...', mixedMultiPagePDFURL.split('/').pop());
const mixedPdfResult = await readTextFromURL(computerVisionClient, mixedMultiPagePDFURL);
printRecText(mixedPdfResult);

Define the readTextFromURL function. This calls the read method on the client object, which returns an operation ID and starts an asynchronous process to read the content of the image. Then it uses the operation ID to check the operation status until the results are returned. Then it returns the extracted results.

// Perform read and await the result from URL
async function readTextFromURL(client, url) {
  // To recognize text in a local image, call client.readInStream() instead of client.read().
  let result = await client.read(url);
  // Operation ID is last path segment of operationLocation (a URL)
  const operation = result.operationLocation.split('/').slice(-1)[0];

  // Wait for read recognition to complete
  // result.status is initially undefined, since it's the result of read
  while (result.status !== STATUS_SUCCEEDED) {
    await sleep(1000);
    result = await client.getReadResult(operation);
    // Stop polling if the service reports a failure; otherwise the loop would never end.
    if (result.status === STATUS_FAILED) { throw new Error('Read operation failed.'); }
  }
  // Return the results for all pages. A multi-page file such as .pdf or .tiff yields one entry per page.
  return result.analyzeResult.readResults;
}
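The two steps inside readTextFromURL, extracting the operation ID and polling for the result, can be tried without calling the service. The following is a minimal sketch under stated assumptions: operationIdFromLocation, pollUntilDone, and makeStubGetResult are hypothetical helpers, the stub stands in for client.getReadResult, the example URL is made up, and the cap on polling attempts is an addition that the quickstart code does not have.

```javascript
'use strict';

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: returns the last non-empty path segment of the
// Operation-Location URL, which is the operation ID.
function operationIdFromLocation(operationLocation) {
  return new URL(operationLocation).pathname.split('/').filter(Boolean).pop();
}

// Hypothetical helper: polls getResult() until the status is "succeeded" or
// "failed", or until maxAttempts polls have been made.
async function pollUntilDone(getResult, { intervalMs = 1000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await getResult();
    if (result.status === 'succeeded') { return result; }
    if (result.status === 'failed') { throw new Error('Read operation failed.'); }
    await sleep(intervalMs);
  }
  throw new Error(`Read operation did not finish after ${maxAttempts} attempts.`);
}

// Stub standing in for client.getReadResult(operationId): it reports
// "running" twice and then "succeeded". It makes no network calls.
function makeStubGetResult() {
  let calls = 0;
  return async () => {
    calls += 1;
    return calls < 3
      ? { status: 'running' }
      : { status: 'succeeded', analyzeResult: { readResults: [] } };
  };
}

// Made-up Operation-Location value, used only to exercise the helper.
const exampleLocation =
  'https://example.cognitiveservices.azure.com/vision/v3.1/read/analyzeResults/49a36324-fc4b-4387-aa06-090cfbf0064f';
const operationId = operationIdFromLocation(exampleLocation);
console.log(operationId);

const polled = pollUntilDone(makeStubGetResult(), { intervalMs: 10 });
polled.then((result) => console.log(result.status));
```

Bounding the number of attempts keeps the script from hanging if the operation stays in a non-terminal state.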

Then, define the helper function printRecText, which prints the results of the Read operations to the console.

// Prints all text from Read result
function printRecText(readResults) {
  console.log('Recognized text:');
  for (const page in readResults) {
    if (readResults.length > 1) {
      console.log(`==== Page: ${page}`);
    }
    const result = readResults[page];
    if (result.lines.length) {
      for (const line of result.lines) {
        console.log(line.words.map(w => w.text).join(' '));
      }
    }
    else { console.log('No recognized text.'); }
  }
}
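To see the shape of data that printRecText walks through, it can help to run the same traversal over a hand-written object. The following is a minimal sketch: the sample values are made up, only the fields the quickstart code reads (lines, words, text) are included, and collectText is a hypothetical variant that returns the text instead of logging it.

```javascript
'use strict';

// Hand-written sample in the shape printRecText expects: one entry per page,
// each page with lines, each line with words. The values are made up.
const sampleReadResults = [
  { lines: [{ words: [{ text: 'Hello' }, { text: 'world' }] }] },
  { lines: [] },
];

// Hypothetical variant of printRecText: returns one array of line strings per
// page instead of writing to the console.
function collectText(readResults) {
  return readResults.map((page) =>
    page.lines.map((line) => line.words.map((w) => w.text).join(' ')));
}

const pages = collectText(sampleReadResults);
console.log(pages);
```

Returning the text rather than printing it makes the result easier to reuse, for example to write it to a file.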

Run the application

Run the application with the node command on your quickstart file.

node index.js

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

  • The source code for this sample can be found on GitHub.

Use the OCR client library to read printed and handwritten text from images.

Reference documentation | Library source code | Package

Prerequisites

  • An Azure subscription - Create one for trial
  • The latest version of Go
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Create a Go project directory

In a console window (cmd, PowerShell, Terminal, Bash), create a new workspace for your Go project, named my-app, and navigate to it.

mkdir -p my-app/{src,bin,pkg}
cd my-app

Your workspace will contain three folders:

  • src - This directory will contain source code and packages. Any packages installed with the go get command will go in this directory.
  • pkg - This directory will contain the compiled Go package objects. These files all have an .a extension.
  • bin - This directory will contain the binary executable files that are created when you run go install.

Tip

To learn more about the structure of a Go workspace, see the Go language documentation. This guide includes information for setting $GOPATH and $GOROOT.

Install the client library for Go

Next, install the client library for Go:

go get -u github.com/Azure/azure-sdk-for-go/services/cognitiveservices/v2.1/computervision

Or, if you use dep, run the following within your repo:

dep ensure -add github.com/Azure/azure-sdk-for-go/services/cognitiveservices/v2.1/computervision

Create a Go application

Next, create a file in the src directory named sample-app.go:

cd src
touch sample-app.go

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Open sample-app.go in your preferred IDE or text editor.

Declare a context at the root of your script. You'll need this object to execute most Computer Vision function calls.

Find the subscription key and endpoint

Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your subscription key and endpoint on the resource's key and endpoint page, under resource management.

Create variables for your Computer Vision subscription key and endpoint. Paste your subscription key and endpoint into the following code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

package main

// Only the packages used by the snippets in this quickstart are imported;
// Go refuses to compile a file that has unused imports.
import (
    "context"
    "fmt"
    "log"
    "time"

    "github.com/Azure/azure-sdk-for-go/services/cognitiveservices/v2.0/computervision"
    "github.com/Azure/go-autorest/autorest"
)

// Declare global so don't have to pass it to all of the tasks.
var computerVisionContext context.Context

func main() {
    computerVisionKey := "PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE"
    endpointURL := "PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE"

Important

Remember to remove the subscription key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.

Next, you'll begin adding code to carry out different OCR operations.

Object model

The following classes and interfaces handle some of the major features of the OCR Go SDK.

Name | Description
BaseClient | This class is needed for all Computer Vision functionality, such as image analysis and text reading. You instantiate it with your subscription information, and you use it to do most image operations.
ReadOperationResult | This type contains the results of a Batch Read operation.

Code examples

These code snippets show you how to do the following tasks with the OCR client library for Go:

  • Authenticate the client
  • Read printed and handwritten text

Authenticate the client

Note

This step assumes you've created environment variables for your Computer Vision key and endpoint, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT respectively.

In your main function, add the following code to instantiate a client with your endpoint and key.

/*  
 * Configure the Computer Vision client
 */
computerVisionClient := computervision.New(endpointURL)
computerVisionClient.Authorizer = autorest.NewCognitiveServicesAuthorizer(computerVisionKey)

computerVisionContext = context.Background()
/*
 * END - Configure the Computer Vision client
 */

Read printed and handwritten text

The OCR service can read visible text in an image and convert it to a character stream. The code in this section defines a function, BatchReadFileRemoteImage, which uses the client object to detect and extract printed or handwritten text in the image.

Add the sample image reference and function call in your main function.

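// Analyze text in an image, remote
// Example image URL (an assumption; replace it with the image you want to read).
printedImageURL := "https://moderatorsampleimages.blob.core.windows.net/samples/sample2.jpg"
BatchReadFileRemoteImage(computerVisionClient, printedImageURL)
} // End of main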

Tip

You can also extract text from a local image. See the BaseClient methods, such as BatchReadFileInStream. Or, see the sample code on GitHub for scenarios involving local images.

Call the Read API

Define the new function for reading text, BatchReadFileRemoteImage. Add the code below, which calls the BatchReadFile method for the given image. This method returns an operation ID and starts an asynchronous process to read the content of the image.

func BatchReadFileRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("BATCH READ FILE - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    // The response contains a field called "Operation-Location", 
    // which is a URL with an ID that you'll use for GetReadOperationResult to access OCR results.
    textHeaders, err := client.BatchReadFile(computerVisionContext, remoteImage)
    if err != nil { log.Fatal(err) }

    // Use ExtractHeaderValue from the autorest library to get the Operation-Location URL
    operationLocation := autorest.ExtractHeaderValue("Operation-Location", textHeaders.Response)

    numberOfCharsInOperationId := 36
    operationId := string(operationLocation[len(operationLocation)-numberOfCharsInOperationId : len(operationLocation)])

Get Read results

Next, get the operation ID returned from the BatchReadFile call, and use it with the GetReadOperationResult method to query the service for operation results. The following code checks the operation at one-second intervals, up to a maximum of 10 retries, until the results are returned.

readOperationResult, err := client.GetReadOperationResult(computerVisionContext, operationId)
if err != nil { log.Fatal(err) }

// Wait for the operation to complete.
i := 0
maxRetries := 10

fmt.Println("Recognizing text in a remote image with the batch Read API ...")
for readOperationResult.Status != computervision.Failed &&
        readOperationResult.Status != computervision.Succeeded {
    if i >= maxRetries {
        break
    }
    i++

    fmt.Printf("Server status: %v, waiting %v seconds...\n", readOperationResult.Status, i)
    time.Sleep(1 * time.Second)

    readOperationResult, err = client.GetReadOperationResult(computerVisionContext, operationId)
    if err != nil { log.Fatal(err) }
}

Display Read results

Add the following code to parse and display the retrieved text data, and finish the function definition.

// Display the results.
fmt.Println()
for _, recResult := range *(readOperationResult.RecognitionResults) {
    for _, line := range *recResult.Lines {
        fmt.Println(*line.Text)
    }
}
} // End of BatchReadFileRemoteImage

Run the application

Run the application from your application directory with the go run command.

go run sample-app.go

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

  • The source code for this sample can be found on GitHub.

Use the Optical character recognition REST API to read printed and handwritten text.

Note

This quickstart uses cURL commands to call the REST API. You can also call the REST API using a programming language. See the GitHub samples for examples in C#, Python, Java, JavaScript, and Go.

Prerequisites

  • An Azure subscription - Create one for trial
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • cURL installed

Read printed and handwritten text

The OCR service can read visible text in an image and convert it to a character stream.

To create and run the sample, do the following steps:

  1. Copy the following command into a text editor.
  2. Make the following changes in the command where needed:
    1. Replace the value of <subscriptionKey> with your subscription key.
    2. Replace the first part of the request URL (chinaeast2) with the text in your own endpoint URL.

      Note

      New resources created after July 1, 2019, will use custom subdomain names. More information and a complete list of regional endpoints are available in the Cognitive Services documentation.

    3. Optionally, change the image URL in the request body (http://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg) to the URL of a different image to be analyzed.
  3. Open a command prompt window.
  4. Paste the command from the text editor into the command prompt window, and then run the command.
curl -H "Ocp-Apim-Subscription-Key: <subscriptionKey>" -H "Content-Type: application/json" "https://chinaeast2.api.cognitive.azure.cn/vision/v3.1/analyze?visualFeatures=Categories,Description&details=Landmarks" -d "{\"url\":\"http://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg\"}"
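
The same request can also be assembled in a programming language. The following is a minimal JavaScript sketch that only builds the URL, headers, and body used by the cURL command above and does not send anything. The helper name buildAnalyzeRequest and the placeholder key are assumptions for illustration.

```javascript
'use strict';

// Hypothetical helper: assembles the URL, headers, and body that the cURL
// command sends. Nothing is transmitted; the caller decides how to send it.
function buildAnalyzeRequest(host, subscriptionKey, imageUrl) {
  const url = new URL(`https://${host}/vision/v3.1/analyze`);
  url.searchParams.set('visualFeatures', 'Categories,Description');
  url.searchParams.set('details', 'Landmarks');
  return {
    url: url.toString(),
    method: 'POST', // cURL's -d option sends the body with POST
    headers: {
      'Ocp-Apim-Subscription-Key': subscriptionKey,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ url: imageUrl }),
  };
}

const request = buildAnalyzeRequest(
  'chinaeast2.api.cognitive.azure.cn',
  '<subscriptionKey>', // placeholder, replace with your own key
  'http://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg');
console.log(request.url);
```

The returned object can be handed to whichever HTTP client you prefer.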

Examine the response

A successful response is returned in JSON and displayed in the command prompt window, similar to the following example:

{
  "categories": [
    {
      "name": "outdoor_water",
      "score": 0.9921875,
      "detail": {
        "landmarks": []
      }
    }
  ],
  "description": {
    "tags": [
      "nature",
      "water",
      "waterfall",
      "outdoor",
      "rock",
      "mountain",
      "rocky",
      "grass",
      "hill",
      "covered",
      "hillside",
      "standing",
      "side",
      "group",
      "walking",
      "white",
      "man",
      "large",
      "snow",
      "grazing",
      "forest",
      "slope",
      "herd",
      "river",
      "giraffe",
      "field"
    ],
    "captions": [
      {
        "text": "a large waterfall over a rocky cliff",
        "confidence": 0.916458423253597
      }
    ]
  },
  "requestId": "b6e33879-abb2-43a0-a96e-02cb5ae0b795",
  "metadata": {
    "height": 959,
    "width": 1280,
    "format": "Jpeg"
  }
}

Next steps

Explore the OCR API in more depth. To rapidly experiment with the API, try the Open API testing console.