快速入门:使用计算机视觉 REST API 和 JavaScript 提取印刷体文本 (OCR)Quickstart: Extract printed text (OCR) using the Computer Vision REST API and JavaScript

本快速入门将使用计算机视觉 REST API 通过光学字符识别 (OCR) 从图像中提取印刷体文本。In this quickstart, you'll extract printed text with optical character recognition (OCR) from an image using the Computer Vision REST API. 借助 OCR 方法,可检测图像中的印刷体文本,并将识别的字符提取到计算机可用的字符流中。With the OCR method, you can detect printed text in an image and extract recognized characters into a machine-usable character stream.

先决条件Prerequisites

  • Azure 订阅 - 创建试用订阅An Azure subscription - Create one for trial
  • 拥有 Azure 订阅后,在 Azure 门户中创建计算机视觉资源 ,获取密钥和终结点。Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. 部署后,单击“转到资源”。After it deploys, click Go to resource.
    • 需要从创建的资源获取密钥和终结点,以便将应用程序连接到计算机视觉服务。You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. 你稍后会在快速入门中将密钥和终结点粘贴到下方的代码中。You'll paste your key and endpoint into the code below later in the quickstart.
    • 可以使用免费定价层 (F0) 试用该服务,然后再升级到付费层进行生产。You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

创建并运行示例Create and run the sample

要创建和运行示例,请执行以下步骤:To create and run the sample, do the following steps:

  1. 创建一个名为 get-printed-text.html 的文件,在文本编辑器中打开它,然后将以下代码复制到其中。Create a file called get-printed-text.html, open it in a text editor, and copy the following code into it.
  2. (可选)将 inputImage 控件的 value 属性的值替换为要分析的其他图像的 URL。Optionally, replace the value of the value attribute for the inputImage control with the URL of a different image that you want to analyze.
  3. 打开一个浏览器窗口。Open a browser window.
  4. 在浏览器中,将文件拖放到浏览器窗口。In the browser, drag and drop the file into the browser window.
  5. 当网页显示在浏览器中时,请将订阅密钥和终结点 URL 粘贴到相应的输入框中。When the webpage is displayed in the browser, paste your subscription key and endpoint URL into the appropriate input boxes.
  6. 选择“读取图像”按钮。Select the Read image button.
<!DOCTYPE html>
<html>
<head>
    <title>OCR Sample</title>
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
</head>
<body>

<script type="text/javascript">
    function processImage() {
        // **********************************************
        // *** Update or verify the following values. ***
        // **********************************************

        var subscriptionKey = document.getElementById("subscriptionKey").value;
        var endpoint = document.getElementById("endpointUrl").value;
        
        var uriBase = endpoint + "vision/v3.0/ocr";

        // Request parameters.
        var params = {
            "language": "unk",
            "detectOrientation": "true",
        };

        // Display the image.
        var sourceImageUrl = document.getElementById("inputImage").value;
        document.querySelector("#sourceImage").src = sourceImageUrl;

        // Perform the REST API call.
        $.ajax({
            url: uriBase + "?" + $.param(params),

            // Request headers.
            beforeSend: function(jqXHR){
                jqXHR.setRequestHeader("Content-Type","application/json");
                jqXHR.setRequestHeader("Ocp-Apim-Subscription-Key", subscriptionKey);
            },

            type: "POST",

            // Request body.
            data: '{"url": ' + '"' + sourceImageUrl + '"}',
        })

        .done(function(data) {
            // Show formatted JSON on webpage.
            $("#responseTextArea").val(JSON.stringify(data, null, 2));
        })

        .fail(function(jqXHR, textStatus, errorThrown) {
            // Display error message.
            var errorString = (errorThrown === "") ?
                "Error. " : errorThrown + " (" + jqXHR.status + "): ";
            errorString += (jqXHR.responseText === "") ? "" :
                (jQuery.parseJSON(jqXHR.responseText).message) ?
                    jQuery.parseJSON(jqXHR.responseText).message :
                    jQuery.parseJSON(jqXHR.responseText).error.message;
            alert(errorString);
        });
    };
</script>

<h1>Optical Character Recognition (OCR):</h1>
Enter the URL to an image of printed text, then
click the <strong>Read image</strong> button.
<br><br>
Subscription key: 
<input type="text" name="subscriptionKey" id="subscriptionKey"
    value="" /> 
Endpoint URL:
<input type="text" name="endpointUrl" id="endpointUrl"
    value="" />
<br><br>
Image to read:
<input type="text" name="inputImage" id="inputImage" 
    value="https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png" />
<button onclick="processImage()">Read image</button>
<br><br>
<div id="wrapper" style="width:1020px; display:table;">
    <div id="jsonOutput" style="width:600px; display:table-cell;">
        Response:
        <br><br>
        <textarea id="responseTextArea" class="UIInput"
                  style="width:580px; height:400px;"></textarea>
    </div>
    <div id="imageDiv" style="width:420px; display:table-cell;">
        Source image:
        <br><br>
        <img id="sourceImage" width="400" />
    </div>
</div>
</body>
</html>

检查响应Examine the response

成功的响应以 JSON 格式返回。A successful response is returned in JSON. 示例网页会在浏览器窗口中分析和显示成功响应,如下例所示:The sample webpage parses and displays a successful response in the browser window, similar to the following example:

{
  "language": "en",
  "orientation": "Up",
  "textAngle": 0,
  "regions": [
    {
      "boundingBox": "21,16,304,451",
      "lines": [
        {
          "boundingBox": "28,16,288,41",
          "words": [
            {
              "boundingBox": "28,16,288,41",
              "text": "NOTHING"
            }
          ]
        },
        {
          "boundingBox": "27,66,283,52",
          "words": [
            {
              "boundingBox": "27,66,283,52",
              "text": "EXISTS"
            }
          ]
        },
        {
          "boundingBox": "27,128,292,49",
          "words": [
            {
              "boundingBox": "27,128,292,49",
              "text": "EXCEPT"
            }
          ]
        },
        {
          "boundingBox": "24,188,292,54",
          "words": [
            {
              "boundingBox": "24,188,292,54",
              "text": "ATOMS"
            }
          ]
        },
        {
          "boundingBox": "22,253,297,32",
          "words": [
            {
              "boundingBox": "22,253,105,32",
              "text": "AND"
            },
            {
              "boundingBox": "144,253,175,32",
              "text": "EMPTY"
            }
          ]
        },
        {
          "boundingBox": "21,298,304,60",
          "words": [
            {
              "boundingBox": "21,298,304,60",
              "text": "SPACE."
            }
          ]
        },
        {
          "boundingBox": "26,387,294,37",
          "words": [
            {
              "boundingBox": "26,387,210,37",
              "text": "Everything"
            },
            {
              "boundingBox": "249,389,71,27",
              "text": "else"
            }
          ]
        },
        {
          "boundingBox": "127,431,198,36",
          "words": [
            {
              "boundingBox": "127,431,31,29",
              "text": "is"
            },
            {
              "boundingBox": "172,431,153,36",
              "text": "opinion."
            }
          ]
        }
      ]
    }
  ]
}

后续步骤Next steps

接下来,了解一款使用计算机视觉执行光学字符识别 (OCR) 功能的 JavaScript 应用程序;创建智能裁剪缩略图;对图像中的视觉特征进行检测、分类、标记和说明。Next, explore a JavaScript application that uses Computer Vision to perform optical character recognition (OCR); create smart-cropped thumbnails; and detect, categorize, tag, and describe visual features in images.