Quickstart: Extract printed text (OCR) using the Computer Vision REST API and PHP

In this quickstart, you use optical character recognition (OCR) to extract printed text from an image by calling the Computer Vision REST API. The OCR method detects printed text in an image and returns the recognized characters as a machine-usable character stream.

If you don't have an Azure subscription, create a trial account before you begin.

Prerequisites

  • You must have PHP installed.
  • You must have PEAR installed.
  • You must have a subscription key for Computer Vision. You can follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.

Create and run the sample

To create and run the sample, complete the following steps:

  1. Install the PHP5 HTTP_Request2 package.

    1. Open a command prompt window as an administrator.

    2. Run the following command:

      pear install HTTP_Request2
      
    3. After the package is successfully installed, close the command prompt window.

  2. Copy the following code into a text editor.

  3. Make the following changes to the code where needed:

    1. Replace the value of $ocpApimSubscriptionKey with your subscription key.
    2. Replace the value of $uriBase with the base endpoint URL for the Azure region where you obtained your subscription key, if necessary. The sample appends ocr to this value to address the OCR method.
    3. Optionally, replace the value of $imageUrl with the URL of a different image from which you want to extract printed text.
  4. Save the code as a file with a .php extension. For example, get-printed-text.php.

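  5. Place the file in a folder served by a PHP-enabled web server. For example, run php -S localhost:8000 in the folder that contains the file to start PHP's built-in web server.

  6. Open the file's URL in a browser window. With the built-in server from the previous step, that is http://localhost:8000/get-printed-text.php.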

<html>
<head>
    <title>OCR Sample</title>
</head>
<body>
<?php
// Replace <Subscription Key> with a valid subscription key.
$ocpApimSubscriptionKey = '<Subscription Key>';

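// Base URI of the Computer Vision API. Change the host if your subscription
// key was issued for a different region.
$uriBase = 'https://api.cognitive.azure.cn/vision/v2.0/';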

$imageUrl = 'https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/' .
    'Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png';

require_once 'HTTP/Request2.php';

$request = new HTTP_Request2($uriBase . 'ocr');
$url = $request->getUrl();

$headers = array(
    // Request headers
    'Content-Type' => 'application/json',
    'Ocp-Apim-Subscription-Key' => $ocpApimSubscriptionKey
);
$request->setHeader($headers);

$parameters = array(
    // Request parameters
    'language' => 'unk',
    'detectOrientation' => 'true'
);
$url->setQueryVariables($parameters);

$request->setMethod(HTTP_Request2::METHOD_POST);

// Request body parameters
$body = json_encode(array('url' => $imageUrl));

// Request body
$request->setBody($body);

try
{
    $response = $request->send();
    echo "<pre>" .
        json_encode(json_decode($response->getBody()), JSON_PRETTY_PRINT) . "</pre>";
}
catch (HTTP_Request2_Exception $ex)
{
    // HTTP_Request2 signals request failures with HTTP_Request2_Exception.
    echo "<pre>" . $ex . "</pre>";
}
?>
</body>
</html>
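
To make the role of $uriBase and the request parameters concrete, the following is a minimal sketch of how the request URL is composed, written without the PEAR package. The helper name buildOcrUrl is hypothetical and not part of the sample. The sketch assumes PHP 7 or later and approximates what the sample produces by calling setQueryVariables on the request URL.

```php
<?php
// Hypothetical helper (not part of the sample): composes the OCR request URL
// from a base URI and the query parameters used in the sample above.
function buildOcrUrl(string $uriBase, array $parameters): string
{
    // rtrim() tolerates a base URI with or without a trailing slash.
    return rtrim($uriBase, '/') . '/ocr?' . http_build_query($parameters, '', '&');
}

$url = buildOcrUrl(
    'https://api.cognitive.azure.cn/vision/v2.0/',
    array('language' => 'unk', 'detectOrientation' => 'true')
);

// Prints: https://api.cognitive.azure.cn/vision/v2.0/ocr?language=unk&detectOrientation=true
echo $url, "\n";
```

If your key belongs to a different region, only the host part of the base URI changes. The path and query string stay the same.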

Examine the response

A successful response is returned in JSON. The sample page decodes the response and displays it, pretty-printed, in the browser window, similar to the following example:

{
    "language": "en",
    "orientation": "Up",
    "textAngle": 0,
    "regions": [
        {
            "boundingBox": "21,16,304,451",
            "lines": [
                {
                    "boundingBox": "28,16,288,41",
                    "words": [
                        {
                            "boundingBox": "28,16,288,41",
                            "text": "NOTHING"
                        }
                    ]
                },
                {
                    "boundingBox": "27,66,283,52",
                    "words": [
                        {
                            "boundingBox": "27,66,283,52",
                            "text": "EXISTS"
                        }
                    ]
                },
                {
                    "boundingBox": "27,128,292,49",
                    "words": [
                        {
                            "boundingBox": "27,128,292,49",
                            "text": "EXCEPT"
                        }
                    ]
                },
                {
                    "boundingBox": "24,188,292,54",
                    "words": [
                        {
                            "boundingBox": "24,188,292,54",
                            "text": "ATOMS"
                        }
                    ]
                },
                {
                    "boundingBox": "22,253,297,32",
                    "words": [
                        {
                            "boundingBox": "22,253,105,32",
                            "text": "AND"
                        },
                        {
                            "boundingBox": "144,253,175,32",
                            "text": "EMPTY"
                        }
                    ]
                },
                {
                    "boundingBox": "21,298,304,60",
                    "words": [
                        {
                            "boundingBox": "21,298,304,60",
                            "text": "SPACE."
                        }
                    ]
                },
                {
                    "boundingBox": "26,387,294,37",
                    "words": [
                        {
                            "boundingBox": "26,387,210,37",
                            "text": "Everything"
                        },
                        {
                            "boundingBox": "249,389,71,27",
                            "text": "else"
                        }
                    ]
                },
                {
                    "boundingBox": "127,431,198,36",
                    "words": [
                        {
                            "boundingBox": "127,431,31,29",
                            "text": "is"
                        },
                        {
                            "boundingBox": "172,431,153,36",
                            "text": "opinion."
                        }
                    ]
                }
            ]
        }
    ]
}
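
The response nests words inside lines and lines inside regions. Each boundingBox string holds four comma-separated integers, which the OCR method defines as the left edge, top edge, width, and height. The following is a minimal sketch of how a script might turn that structure into plain text. The helper names ocrLinesToText and parseBoundingBox are hypothetical, the inline JSON is a shortened copy of the response above, and the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: joins the recognized words of every line into one
// string per line, walking regions -> lines -> words.
function ocrLinesToText(array $result): array
{
    $lines = array();
    foreach ($result['regions'] ?? array() as $region) {
        foreach ($region['lines'] ?? array() as $line) {
            $words = array_column($line['words'] ?? array(), 'text');
            $lines[] = implode(' ', $words);
        }
    }
    return $lines;
}

// Hypothetical helper: splits a "left,top,width,height" string into integers.
function parseBoundingBox(string $box): array
{
    list($left, $top, $width, $height) = array_map('intval', explode(',', $box));
    return array('left' => $left, 'top' => $top, 'width' => $width, 'height' => $height);
}

// Shortened copy of the sample response: one region that contains one line.
$json = '{"language":"en","regions":[{"boundingBox":"21,16,304,451","lines":['
      . '{"boundingBox":"22,253,297,32","words":['
      . '{"boundingBox":"22,253,105,32","text":"AND"},'
      . '{"boundingBox":"144,253,175,32","text":"EMPTY"}]}]}]}';

$result = json_decode($json, true);

// Prints: AND EMPTY
echo implode("\n", ocrLinesToText($result)), "\n";

$box = parseBoundingBox('144,253,175,32');
// Prints: 319 (the right edge of the word "EMPTY", left + width)
echo $box['left'] + $box['width'], "\n";
```

Passing true as the second argument to json_decode returns associative arrays, which keeps the traversal to plain array lookups.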

Clean up resources

When you no longer need the sample, delete the file, and then uninstall the PHP5 HTTP_Request2 package. To uninstall the package, complete the following steps:

  1. Open a command prompt window as an administrator.

  2. Run the following command:

    pear uninstall HTTP_Request2
    
  3. After the package is successfully uninstalled, close the command prompt window.

Next steps

Explore the Computer Vision API to analyze an image, detect celebrities and landmarks, create a thumbnail, and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API testing console.