OCR stands for optical character recognition, an automated method of creating machine-readable text. An OCR program interprets the text in a digital image and attempts to render it in a known alphabet. Often, you need to tell the OCR software which language(s) appear in the image, and sometimes you also need to help the software understand the layout of the page (for example, the columns of text, photographs, and advertisements in a newspaper). There are many notable OCR-focused projects, including Mapping Texts, Eighteenth Century Collections Online, and the Nusus Corpus.
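To make the two settings described above more concrete (the languages and the page layout), here is a minimal sketch that assumes the open-source Tesseract OCR engine, which this page does not itself mention. The helper name `build_tesseract_command`, the file names, and the chosen language codes are hypothetical. The sketch only assembles a command line using Tesseract's `-l` (language) and `--psm` (page segmentation mode) options; it does not run OCR itself.

```python
import shlex
from typing import List, Sequence


def build_tesseract_command(
    image_path: str,
    output_base: str,
    languages: Sequence[str],
    page_seg_mode: int = 3,
) -> List[str]:
    """Assemble (but do not run) a Tesseract command line.

    Hypothetical helper for illustration only.
    - languages: Tesseract language codes such as "eng" or "fra";
      several languages are joined with "+".
    - page_seg_mode: Tesseract's --psm value. For example, 3 is fully
      automatic page segmentation and 4 assumes a single column of text.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    lang_arg = "+".join(languages)
    return [
        "tesseract",
        image_path,
        output_base,
        "-l",
        lang_arg,
        "--psm",
        str(page_seg_mode),
    ]


if __name__ == "__main__":
    # Assumed example: a newspaper page containing English and French text.
    cmd = build_tesseract_command(
        "newspaper_page.png", "newspaper_page", ["eng", "fra"], page_seg_mode=3
    )
    # Print the command as it would be typed in a shell.
    print(shlex.join(cmd))
```

Running the printed command requires Tesseract and the relevant language data to be installed. Other OCR tools expose similar settings under different names.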
You can access the slides for this presentation at this link.
You can access the recording of this workshop at this link.