University of Texas Libraries

Introduction to Optical Character Recognition Workshop

This guide was created to support the Introduction to Optical Character Recognition DH workshop in Spring 2024: This workshop introduces the basics of optical character recognition (OCR), which allows for full-text searching and other types of text man

Entry-level: Google Docs

Tutorial

This is the quickest way to run optical character recognition on an image of text. You are asking Google Docs to convert your image file into a document or text file, and the OCR built into Google Drive/Google Workspace generally handles this well.

Dr. Chris Rose has a great tutorial on how to do this (his example is Arabic, but you should be able to do this with many languages).

Using Google Drive and Google Docs to OCR an Image

  1. Upload your image file to Google Drive.
  2. In Google Drive, right-click the image file OR click the three dots at the right side of the image file's row. Select "Open With" from either menu, and then select "Google Docs".
  3. Let the magic happen. This may take a minute.
  4. Your image will open in a Google Doc with the image on the first page and the OCR'd text below it. Voilà!
  5. Depending on your project, you may need to correct and format the text to fit your needs.
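If you have many images, the steps above can also be scripted. The following is a minimal sketch, not part of the workshop materials: it assumes the google-api-python-client package, a Drive API v3 service that you have already authorized, and that uploading an image with a Google Docs target type triggers the same conversion as the "Open With" menu. The function names (ocr_image_with_drive, run_example) are hypothetical, and obtaining the creds object is left to you.

```python
def ocr_image_with_drive(service, media, title, language_hint=None):
    """Upload an image as a Google Doc (requesting conversion) and return its text.

    service: an authorized Drive API v3 client (assumption: built by the caller).
    media: an upload object such as googleapiclient.http.MediaFileUpload.
    language_hint: optional ISO 639-1 code passed as the ocrLanguage hint.
    """
    metadata = {
        "name": title,
        # Requesting a Google Docs target type asks Drive to convert the upload.
        "mimeType": "application/vnd.google-apps.document",
    }
    params = {"body": metadata, "media_body": media, "fields": "id"}
    if language_hint:
        params["ocrLanguage"] = language_hint

    # Steps 1-3: upload and let Drive convert the image.
    created = service.files().create(**params).execute()

    # Step 4: read the converted document back as plain text.
    raw = service.files().export_media(
        fileId=created["id"], mimeType="text/plain"
    ).execute()

    # Decode bytes; "utf-8-sig" also drops a leading byte-order mark if present.
    return raw.decode("utf-8-sig") if isinstance(raw, bytes) else raw


def run_example(creds, image_path, language_hint=None):
    """Hypothetical wiring: build the client and run OCR on one image file."""
    # Imported here so the function above can be read and tested on its own.
    from googleapiclient.discovery import build
    from googleapiclient.http import MediaFileUpload

    service = build("drive", "v3", credentials=creds)
    media = MediaFileUpload(image_path)  # MIME type is guessed from the file name
    return ocr_image_with_drive(service, media, image_path, language_hint)
```

As in step 5, expect to correct and format the returned text by hand. The converted Google Doc also stays in your Drive unless you delete it.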

Note: If you are working with a multi-columned text, take a screenshot or snippet of each column in order to simplify the zoning/reading order for Google Docs. Dr. Rose's blog post discusses this process.
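If taking a screenshot of every column by hand gets tedious, the cropping can be roughed out in code. This is only a sketch under strong assumptions: it uses the Pillow library, treats the columns as equal in width and spanning the full page, and uses hypothetical function names (column_boxes, save_columns). Real pages usually need the boxes adjusted by eye. The right_to_left option lists the rightmost column first, which matches the reading order of languages such as Arabic.

```python
from pathlib import Path


def column_boxes(width, height, n_columns, right_to_left=False):
    """Return one (left, upper, right, lower) crop box per column, in reading order.

    Assumes equal-width columns that span the full height of the page.
    """
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    # Column edges, rounded to whole pixels.
    edges = [round(i * width / n_columns) for i in range(n_columns + 1)]
    boxes = [(edges[i], 0, edges[i + 1], height) for i in range(n_columns)]
    # For right-to-left scripts, the rightmost column is read first.
    return boxes[::-1] if right_to_left else boxes


def save_columns(image_path, n_columns, right_to_left=False):
    """Crop each column into its own PNG file and return the new file paths."""
    from PIL import Image  # Pillow; imported here so column_boxes works without it

    source = Path(image_path)
    saved = []
    with Image.open(source) as img:
        boxes = column_boxes(img.width, img.height, n_columns, right_to_left)
        for number, box in enumerate(boxes, start=1):
            # Files are numbered in reading order: _col1 is read first.
            target = source.with_name(f"{source.stem}_col{number}.png")
            img.crop(box).save(target)
            saved.append(target)
    return saved
```

Each saved file can then be uploaded and opened with Google Docs on its own, and the resulting text pasted together in the numbered order.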

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.