Quickstart: Extract printed text (OCR) using the Computer Vision REST API and Ruby

09/27/2024

In this quickstart, you use the Computer Vision REST API to extract printed text from an image with optical character recognition (OCR). The OCR method detects printed text in an image and returns the recognized characters as a machine-usable character stream.

If you don't have an Azure subscription, create a trial account before you begin.

Prerequisites

You must have Ruby 2.4.x or later installed.
You must have a subscription key for Computer Vision. You can follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.

Create and run the sample

To create and run the sample, follow these steps:

1. Copy the following code into a text editor.
2. Make the following changes in the code where needed:
   - Replace <Subscription Key> with your subscription key.
   - If necessary, replace https://api.cognitive.azure.cn/vision/v2.0/ocr with the endpoint URL for the OCR method in the Azure region where you obtained your subscription keys.
   - Optionally, replace https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png with the URL of a different image from which you want to extract printed text.
3. Save the code as a file with an .rb extension. For example, get-printed-text.rb.
4. Open a command prompt window.
5. At the prompt, use the ruby command to run the sample. For example, ruby get-printed-text.rb.

require 'json'
require 'net/http'
require 'uri'

# Endpoint for the OCR method. Change it if your key belongs to another region.
uri = URI('https://api.cognitive.azure.cn/vision/v2.0/ocr')
uri.query = URI.encode_www_form({
  # Request parameters
  'language' => 'unk',            # 'unk' asks the service to detect the language
  'detectOrientation' => 'true'
})

request = Net::HTTP::Post.new(uri.request_uri)

# Request headers
# Replace <Subscription Key> with your valid subscription key.
request['Ocp-Apim-Subscription-Key'] = '<Subscription Key>'
request['Content-Type'] = 'application/json'

# Request body: the URL of the image to analyze, sent as JSON.
image_url = 'https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/' \
            'Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png'
request.body = { 'url' => image_url }.to_json

# Send the request over HTTPS and print the raw JSON response.
response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
  http.request(request)
end

puts response.body
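
The sample prints whatever the service returns, so an invalid key or an unreachable image URL shows up only as an error payload on the screen. A minimal sketch of a status check follows. The helper name ocr_body_or_raise is hypothetical and not part of the original sample; it relies only on the fact that Net::HTTPResponse#code returns the status as a string such as "200".

```ruby
# Hypothetical helper (not part of the original sample): returns the body
# for a 2xx status code and raises for anything else.
def ocr_body_or_raise(code, body)
  return body if code.to_s.start_with?('2')

  raise "OCR request failed with HTTP #{code}: #{body}"
end

# Demonstration with literal values, so no network call is needed.
puts ocr_body_or_raise('200', '{"language":"en"}')
```

In the sample, the final line could then become puts ocr_body_or_raise(response.code, response.body), so that a failed request stops the script with a readable message instead of printing the error body as if it were a result.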

Examine the response

A successful response is returned in JSON. The sample prints the response in the command prompt window, similar to the following example:

{
  "language": "en",
  "textAngle": -2.0000000000000338,
  "orientation": "Up",
  "regions": [
    {
      "boundingBox": "462,379,497,258",
      "lines": [
        {
          "boundingBox": "462,379,497,74",
          "words": [
            {
              "boundingBox": "462,379,41,73",
              "text": "A"
            },
            {
              "boundingBox": "523,379,153,73",
              "text": "GOAL"
            },
            {
              "boundingBox": "694,379,265,74",
              "text": "WITHOUT"
            }
          ]
        },
        {
          "boundingBox": "565,471,289,74",
          "words": [
            {
              "boundingBox": "565,471,41,73",
              "text": "A"
            },
            {
              "boundingBox": "626,471,150,73",
              "text": "PLAN"
            },
            {
              "boundingBox": "801,472,53,73",
              "text": "IS"
            }
          ]
        },
        {
          "boundingBox": "519,563,375,74",
          "words": [
            {
              "boundingBox": "519,563,149,74",
              "text": "JUST"
            },
            {
              "boundingBox": "683,564,41,72",
              "text": "A"
            },
            {
              "boundingBox": "741,564,153,73",
              "text": "WISH"
            }
          ]
        }
      ]
    }
  ]
}
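
The response nests words inside lines, and lines inside regions. As a rough sketch of how that structure might be consumed, the following code joins the words of each line back into text and splits a boundingBox string into numbers. The helper names extract_lines and parse_bounding_box are hypothetical, the embedded JSON is a trimmed copy of the example response, and the reading of the four boundingBox values as left, top, width, and height is an assumption to verify against the OCR API reference.

```ruby
require 'json'

# Hypothetical helper: returns one string per recognized line of text.
def extract_lines(ocr_json)
  result = JSON.parse(ocr_json)
  result.fetch('regions', []).flat_map do |region|
    region.fetch('lines', []).map do |line|
      line.fetch('words', []).map { |word| word['text'] }.join(' ')
    end
  end
end

# Hypothetical helper: splits "x,y,w,h" into integers.
# Assumes the order is left, top, width, height.
def parse_bounding_box(box)
  left, top, width, height = box.split(',').map(&:to_i)
  { left: left, top: top, width: width, height: height }
end

# Trimmed copy of the example response (word-level bounding boxes omitted).
sample = <<~SAMPLE
  {
    "language": "en",
    "regions": [
      {
        "boundingBox": "462,379,497,258",
        "lines": [
          { "words": [ { "text": "A" }, { "text": "GOAL" }, { "text": "WITHOUT" } ] },
          { "words": [ { "text": "A" }, { "text": "PLAN" }, { "text": "IS" } ] },
          { "words": [ { "text": "JUST" }, { "text": "A" }, { "text": "WISH" } ] }
        ]
      }
    ]
  }
SAMPLE

puts extract_lines(sample)
p parse_bounding_box(JSON.parse(sample)['regions'][0]['boundingBox'])
```

Using fetch with a default keeps the helper from failing when a response contains no regions, which can happen for an image without recognizable text.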

Next steps

Explore the Computer Vision API to analyze an image, detect celebrities and landmarks, create a thumbnail, and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API testing console.

