When considering a digital scholarship project, it is essential to identify your research question first. Some researchers try to approach digital scholarship by first choosing a method or tool and then deciding on a research question. However, this approach is not sustainable in the long term.
Be curious and ask why things are the way they are. Identify a general topic that interests you, then carefully examine the existing literature on it to learn what others have already done or are doing. A research question should be clear, focused, complex, and arguable:
Is your research question clear?
It should provide enough specifics that your audience can easily understand its purpose without needing additional explanation.
Is your research question focused?
It should be narrow enough that it can be answered thoroughly in the space that the assignment allows.
Is your research question complex?
It should not be answerable with a simple “yes” or “no,” but should instead require synthesis and analysis of ideas and sources before any answer can be produced.
Is your research question arguable?
Its potential answers should be open to debate rather than exist as accepted facts.
Adapted from: “How to Write a Research Question.” n.d. The Writing Center. https://writingcenter.gmu.edu/writing-resources/research-based-writing/how-to-write-a-research-question.
Before digitizing your materials, consider how the digital copy will be used and whether creating it is legal. As of 1 January 2024, books published in the US before 1929 and sound recordings published before 1924 are in the public domain. If your material is still under copyright, or its copyright status is unknown, you may still be able to create a digital reproduction under the Fair Use Doctrine. In short, this doctrine allows the use and reproduction of some copyrighted material under specific circumstances, such as academic research and educational purposes, and it is applied on a case-by-case basis. Cornell University Library has a good checklist that can help determine fair use status when using copyrighted materials.
When seeking to digitize materials owned by a library, archive, or other institution, there may be other restrictions that apply. If in doubt, refer back to the owning institution’s policies and contact their staff for specific questions.
For further general information about fair use and copyright in libraries and archives, see the American Library Association’s resource guide and this easy-to-follow chart from Cornell University Library on what is considered public domain in the United States.
For professional legal advice, contact an intellectual property attorney.
When scanning material, consider the following:
Text recognition, also known as Optical Character Recognition (OCR), is the conversion of images of printed or handwritten text into machine-encoded text. In other words, it is the process that turns your physical item (book, newspaper, pamphlet, etc.) into something that a computer can understand and manipulate.
The following tools allow you to perform OCR on a variety of textual materials, such as newspapers, handwritten documents, and computer-generated texts. You can find even more tool recommendations here.
To learn more about text recognition, see the OCR LibGuide.
Once your text is transcribed, you might want to use various text analysis methods. These methods will assist you in analyzing and visualizing the data extracted from your texts.
Analysis of a text can also be based on various linguistic features, such as word frequencies, sentence lengths, and other peculiarities of an author’s style. Text analysis can be performed using a variety of tools like
Additionally, exploring the programming language Python can be a great place to start. There are many existing Python packages and tutorials focused on text analysis that can help you get started.
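As a starting point, the two features mentioned above, word frequencies and sentence lengths, can be computed with Python's standard library alone. The sketch below is illustrative only: the function names (tokenize, word_frequencies, sentence_lengths) and the sample text are made up for this example, and the sentence splitting is a deliberately naive assumption that every period, exclamation mark, or question mark ends a sentence.

```python
import re
from collections import Counter

# Letters (including accented ones), optionally joined by a straight apostrophe.
WORD_PATTERN = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")


def tokenize(text):
    """Lowercase the text and return its words as a list."""
    return WORD_PATTERN.findall(text.lower())


def word_frequencies(text, top_n=5):
    """Return the top_n most frequent words as (word, count) pairs."""
    return Counter(tokenize(text)).most_common(top_n)


def sentence_lengths(text):
    """Return the number of words in each sentence.

    Naive assumption: '.', '!' and '?' always end a sentence, so
    abbreviations such as 'Dr.' will be split incorrectly.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(tokenize(s)) for s in sentences]


# Hypothetical sample text standing in for a transcribed document.
sample = (
    "The whale surfaced. The crew watched the whale dive again! "
    "Was it the same whale?"
)

print(word_frequencies(sample, top_n=2))
print(sentence_lengths(sample))
```

For real projects, common words such as "the" are usually filtered out before counting, and dedicated text analysis packages handle tokenization and sentence splitting far more robustly than these regular expressions.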
It is important to plan for the long term from the outset of your project so that you can set aside enough time and resources to ensure that your data will remain accessible long after your project is over.
Publication in a digital repository can provide persistent URLs or a digital object identifier (DOI), full-text indexing and long-term preservation.
Publishing your work outside of a journal’s paywall will help your work become more discoverable to a wider audience. Need more reasons?
This guide page on Archiving and Sharing Your Work provides more information on increasing access to your work.
When looking to store your work in a repository, consider using one provided by UT. Some benefits of using a UT repository for your work are:
Discipline Specific Repositories:
Please contact us for assistance with your project using this form.
Humanities Data Curation Checklist
Created by Adriana Cásarez in spring 2020, this checklist guides humanities researchers and humanities liaison librarians on key considerations for making their data findable, accessible and clear to interested scholars and institutions.
Data Curation in the Texas Data Repository
Spring 2020 capstone report from Brenna Wheeler. Brenna created a data curation workflow based on the Data Curation Network’s CURATE(D) Model to improve the findability and reusability of datasets. The workflow is localized to the Texas Data Repository using needs identified in interviews with academic librarians and assessment of datasets currently in the repository. The final product is a specialized Data Curation workflow and a list of recommendations that may be used by a team of liaison librarians to curate newly deposited datasets in the future.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.