University of Texas Libraries

Digital Humanities Workshops @PCL

Schedule and course content from Digital Humanities Workshops @PCL series

OCR and Machine Translation: Get Your Mechanical Turk On

Workshop Description

Location: PCL Data Lab (in Scholars Commons)

Automated translation is an opportunity to expand the limits of scholarship. Global access to files has not necessarily translated into global scholarship, because scholars can only work in the languages they read. This workshop will teach you a workaround. You will learn how to use Google Tesseract, an open-source optical character recognition (OCR) engine, to convert images of documents into text files and then translate those documents using a translation API. In the process, you will learn the limits of automation and how computers can assist, but not replace, human translation. No prior programming experience is required.
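The two steps described above (image to text, then text to translation) can be sketched in Python. This is a minimal illustration, not the workshop's own material: it assumes the `pytesseract` and `Pillow` packages and a local Tesseract installation for the OCR step. The helper names (`ocr_image`, `chunk_text`, `ocr_and_translate`) and the 4000-character chunk size are hypothetical. The translation step is passed in as a plain function because the description does not say which translation API is used.

```python
from typing import Callable, List


def ocr_image(image_path: str, lang: str = "eng") -> str:
    """Run Tesseract on one image file and return the recognized text."""
    # Imported here so the rest of the sketch works without Tesseract installed.
    # Assumes the pytesseract and Pillow packages plus a Tesseract binary.
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(image_path), lang=lang)


def chunk_text(text: str, max_chars: int = 4000) -> List[str]:
    """Group paragraphs into chunks of at most max_chars characters.

    Translation services often cap request size; 4000 is an assumed limit.
    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def ocr_and_translate(
    image_path: str,
    translate: Callable[[str], str],
    ocr: Callable[[str, str], str] = ocr_image,
    lang: str = "eng",
) -> str:
    """OCR an image, then translate the text chunk by chunk.

    `translate` is a placeholder for whatever translation API client you use:
    any function that takes source text and returns translated text.
    """
    text = ocr(image_path, lang)
    return "\n\n".join(translate(chunk) for chunk in chunk_text(text))
```

To use it, call `ocr_and_translate("page.png", translate=my_translate, lang="rus")`, where `page.png` and `my_translate` stand in for your own scan and translation function. The `lang` value is a Tesseract language code, and the matching language data must be installed. OCR errors carry straight through into the translation, so the output still needs a human reader.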

Instructor: Andrew Akhlaghi 

Workshop Materials

This work is licensed under a Creative Commons Attribution-NonCommercial 2.0 Generic License.