Quickstart: Extract printed text (OCR) using the Computer Vision REST API and PHP

In this quickstart, you use Computer Vision's REST API to extract printed text from an image with optical character recognition (OCR). The OCR method detects printed text in an image and extracts the recognized characters into a machine-usable character stream.

If you don't have an Azure subscription, create a trial account before you begin.

Prerequisites

  • You must have PHP installed.
  • You must have Pear installed.
  • You must have a subscription key for Computer Vision. To subscribe to Computer Vision and get your key, follow the instructions in Create a Cognitive Services account.

Create and run the sample

To create and run the sample, do the following steps:

  1. Install the PHP5 HTTP_Request2 package.

    1. Open a command prompt window as an administrator.

    2. Run the following command:

      pear install HTTP_Request2
      
    3. After the package is successfully installed, close the command prompt window.

  2. Copy the following code into a text editor.

  3. Make the following changes in the code where needed:

    1. Replace the value of $ocpApimSubscriptionKey with your subscription key.
    2. If necessary, replace the value of $uriBase with the endpoint URL for the OCR method from the Azure region where you obtained your subscription keys.
    3. Optionally, replace the value of $imageUrl with the URL of a different image from which you want to extract printed text.
  4. Save the code as a file with a .php extension. For example, get-printed-text.php.

  5. Open a browser window with PHP support.

  6. Drag and drop the file into the browser window.

<html>
<head>
    <title>OCR Sample</title>
</head>
<body>
<?php
// Replace <Subscription Key> with a valid subscription key.
$ocpApimSubscriptionKey = '<Subscription Key>';

$uriBase = 'https://api.cognitive.azure.cn/vision/v2.0/';

$imageUrl = 'https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/' .
    'Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png';

require_once 'HTTP/Request2.php';

$request = new HTTP_Request2($uriBase . 'ocr');
$url = $request->getUrl();

$headers = array(
    // Request headers
    'Content-Type' => 'application/json',
    'Ocp-Apim-Subscription-Key' => $ocpApimSubscriptionKey
);
$request->setHeader($headers);

$parameters = array(
    // Request parameters
    'language' => 'unk',
    'detectOrientation' => 'true'
);
$url->setQueryVariables($parameters);

$request->setMethod(HTTP_Request2::METHOD_POST);

// Request body parameters
$body = json_encode(array('url' => $imageUrl));

// Request body
$request->setBody($body);

try
{
    $response = $request->send();
    echo "<pre>" .
        json_encode(json_decode($response->getBody()), JSON_PRETTY_PRINT) . "</pre>";
}
catch (HTTP_Request2_Exception $ex)
{
    echo "<pre>" . $ex . "</pre>";
}
?>
</body>
</html>
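
As a rough illustration of what the sample sends, the request URL is the value of $uriBase followed by the method name ocr and the two query parameters. The sketch below rebuilds that URL with plain PHP so you can check it before you substitute another region's endpoint. The helper name buildOcrUrl is hypothetical and is not part of the sample or of HTTP_Request2.

```php
<?php
// Hypothetical helper (an assumption, not part of the sample above):
// composes the OCR request URL from a base URI and query parameters.
function buildOcrUrl(string $uriBase, array $parameters): string
{
    // Trim any trailing slash so exactly one slash precedes the method name.
    $base = rtrim($uriBase, '/');

    // Pass '&' explicitly so the result does not depend on php.ini settings.
    return $base . '/ocr?' . http_build_query($parameters, '', '&');
}

$url = buildOcrUrl('https://api.cognitive.azure.cn/vision/v2.0/', array(
    'language' => 'unk',
    'detectOrientation' => 'true',
));

echo $url, PHP_EOL;
```

Because the helper trims a trailing slash from the base URI, it gives the same result whether or not the endpoint you paste in ends with one.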

Examine the response

A successful response is returned in JSON. The sample website parses and displays a successful response in the browser window, similar to the following example:

{
    "language": "en",
    "orientation": "Up",
    "textAngle": 0,
    "regions": [
        {
            "boundingBox": "21,16,304,451",
            "lines": [
                {
                    "boundingBox": "28,16,288,41",
                    "words": [
                        {
                            "boundingBox": "28,16,288,41",
                            "text": "NOTHING"
                        }
                    ]
                },
                {
                    "boundingBox": "27,66,283,52",
                    "words": [
                        {
                            "boundingBox": "27,66,283,52",
                            "text": "EXISTS"
                        }
                    ]
                },
                {
                    "boundingBox": "27,128,292,49",
                    "words": [
                        {
                            "boundingBox": "27,128,292,49",
                            "text": "EXCEPT"
                        }
                    ]
                },
                {
                    "boundingBox": "24,188,292,54",
                    "words": [
                        {
                            "boundingBox": "24,188,292,54",
                            "text": "ATOMS"
                        }
                    ]
                },
                {
                    "boundingBox": "22,253,297,32",
                    "words": [
                        {
                            "boundingBox": "22,253,105,32",
                            "text": "AND"
                        },
                        {
                            "boundingBox": "144,253,175,32",
                            "text": "EMPTY"
                        }
                    ]
                },
                {
                    "boundingBox": "21,298,304,60",
                    "words": [
                        {
                            "boundingBox": "21,298,304,60",
                            "text": "SPACE."
                        }
                    ]
                },
                {
                    "boundingBox": "26,387,294,37",
                    "words": [
                        {
                            "boundingBox": "26,387,210,37",
                            "text": "Everything"
                        },
                        {
                            "boundingBox": "249,389,71,27",
                            "text": "else"
                        }
                    ]
                },
                {
                    "boundingBox": "127,431,198,36",
                    "words": [
                        {
                            "boundingBox": "127,431,31,29",
                            "text": "is"
                        },
                        {
                            "boundingBox": "172,431,153,36",
                            "text": "opinion."
                        }
                    ]
                }
            ]
        }
    ]
}
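
The response nests words inside lines and lines inside regions, so recovering readable text means walking that hierarchy. The following is a minimal sketch of one way to do that, not part of the original sample. The function names extractLines and parseBoundingBox are hypothetical, and the reading of boundingBox as left, top, width, height is an assumption based on the OCR API's documented format. The embedded JSON is an abbreviated copy of two lines from the response above.

```php
<?php
// Hypothetical helper: returns one string per recognized line by
// joining the "text" of each word with a space.
function extractLines(array $ocrResult): array
{
    $lines = array();
    foreach ($ocrResult['regions'] ?? array() as $region) {
        foreach ($region['lines'] ?? array() as $line) {
            $words = array_column($line['words'] ?? array(), 'text');
            $lines[] = implode(' ', $words);
        }
    }
    return $lines;
}

// Hypothetical helper: splits a "left,top,width,height" string into
// named integers (assumed ordering, see the paragraph above).
function parseBoundingBox(string $box): array
{
    $values = array_map('intval', explode(',', $box));
    return array_combine(array('left', 'top', 'width', 'height'), $values);
}

// Abbreviated copy of the sample response (two of its eight lines).
$json = '{"language":"en","regions":[{"boundingBox":"21,16,304,451","lines":[' .
    '{"boundingBox":"22,253,297,32","words":[' .
    '{"boundingBox":"22,253,105,32","text":"AND"},' .
    '{"boundingBox":"144,253,175,32","text":"EMPTY"}]},' .
    '{"boundingBox":"21,298,304,60","words":[' .
    '{"boundingBox":"21,298,304,60","text":"SPACE."}]}]}]}';

$result = json_decode($json, true);
$lines = extractLines($result);
$box = parseBoundingBox('22,253,105,32');

print_r($lines);
print_r($box);
```

In the quickstart sample itself, you would pass json_decode($response->getBody(), true) to extractLines instead of the hard-coded string.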

Clean up resources

When you no longer need the file, delete it, and then uninstall the PHP5 HTTP_Request2 package. To uninstall the package, do the following steps:

  1. Open a command prompt window as an administrator.

  2. Run the following command:

    pear uninstall HTTP_Request2
    
  3. After the package is successfully uninstalled, close the command prompt window.

Next steps

Explore the Computer Vision API, which you can use to analyze an image, detect celebrities and landmarks, create a thumbnail, and extract printed and handwritten text. To experiment quickly with the Computer Vision API, try the Open API testing console.