University of Texas Libraries

Finding Humanities Data

A brief compendium of free datasets, with a focus on materials of relevance to European Studies.

Working with Humanities Data

  • Text Analysis Tools: These digital tools help scholars examine and interpret textual data. They offer methods of analysis that include, but are not limited to, identifying patterns, themes, and structures. Additionally, they are useful for visualizing data and tracing relationships or trends that might not be immediately noticeable through traditional close reading.
    • Voyant Tools for visualizing textual data through word clouds, frequency graphs, and keyword analysis
    • Python for scripted text processing, data analysis, machine learning, and automation
  • Image and Audio Analysis Tools: These tools help analyze visual and auditory materials, aiding research focused on studying elements such as visual composition, patterns, metadata, and sound features. This approach is particularly useful for identifying features that might be difficult to detect manually. Additionally, these tools excel at visualizing data and enhancing accessibility.
    • Tropy for organizing and annotating images
    • OpenCV for image processing
    • Audacity for audio analysis
  • Data Cleaning and Preparation: Many humanities datasets require cleaning (for example, correcting OCR errors in text or handling null values in structured data) before they are usable in analysis.
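
As a rough illustration of the cleaning and text analysis steps described above, here is a minimal Python sketch that uses only the standard library. The function names (clean_ocr_text, fill_missing, word_frequencies), the sample text, and the sample records are all invented for this example; real OCR output usually needs repairs tailored to the specific source.

```python
import re
from collections import Counter


def clean_ocr_text(text):
    """Apply a few common OCR repairs (illustrative, not exhaustive)."""
    # Rejoin words hyphenated across a line break: "manu-\nscript" -> "manuscript"
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Replace the long s, common in early modern print, with a regular s
    text = text.replace("ſ", "s")
    # Collapse runs of whitespace (including newlines) into single spaces
    return re.sub(r"\s+", " ", text).strip()


def fill_missing(records, field, placeholder="unknown"):
    """Return copies of the records with empty or missing values replaced."""
    cleaned = []
    for record in records:
        value = record.get(field)
        if value is None or not str(value).strip():
            value = placeholder
        cleaned.append({**record, field: value})
    return cleaned


def word_frequencies(text, top_n=5):
    """Count the most frequent words, ignoring case, digits, and punctuation."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(words).most_common(top_n)


# Invented sample data, for demonstration only
raw = "The manu-\nscript was   copied by a\nſcribe. The scribe signed the manuscript."
text = clean_ocr_text(raw)
print(text)
print(word_frequencies(text, top_n=3))

records = [
    {"title": "Letter A", "year": "1848"},
    {"title": "Letter B", "year": ""},
    {"title": "Letter C", "year": None},
]
print(fill_missing(records, "year"))
```

Replacing missing values with a visible placeholder, rather than dropping the records, keeps gaps in the archive explicit so they can be discussed in the analysis.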

Ethical Considerations

  • Privacy and Access: Be aware of ethical concerns related to privacy, especially if you are working with sensitive or personally identifiable information, and ensure that data access aligns with copyright restrictions. 

  • Representation and Bias: Acknowledge the potential biases inherent in humanities data and the need for critical engagement with sources, especially with archives that might exclude marginalized voices.

  • Citing Data: Consult our research guides for resources on how to cite data properly in humanities research; data citation differs from traditional citation practices.

Case Studies

We’ve provided a few examples of how humanities data has been used in projects:

  • Analyzing language trends in literature (using Google Books Ngram Viewer).

  • Mapping historical events or migrations using GIS data.

  • Conducting sentiment analysis on speeches or letters.
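
To make the sentiment analysis case more concrete, the following is a minimal sketch of one simple approach, lexicon-based scoring, in Python. The word lists and the sample letters are invented for illustration and are far too small for real research; an actual project would more likely rely on an established sentiment lexicon or a trained model, and would need to account for historical shifts in word meaning.

```python
import re

# Hypothetical toy lexicon, for illustration only
POSITIVE = {"hope", "joy", "love", "peace", "glad"}
NEGATIVE = {"fear", "grief", "war", "loss", "sorrow"}


def sentiment_score(text):
    """Return (positive words - negative words) / total words, between -1 and 1."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return 0.0
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return (positive - negative) / len(words)


# Invented sample sentences standing in for dated letters
letters = {
    "1914-08": "We live in fear of war and loss.",
    "1918-11": "Peace at last, and such joy and hope.",
}

for date, body in letters.items():
    print(date, round(sentiment_score(body), 3))
```

Dividing by the total word count makes scores comparable across letters of different lengths, though it does not handle negation ("no hope") or irony.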

Additional Resources and Tutorials

  • Learning Platforms: Resources like The Programming Historian, Data Carpentry, or Library Carpentry offer tutorials specifically for humanities researchers.

  • Reading List: Reference books like Digital_Humanities by Anne Burdick et al., or Debates in the Digital Humanities edited by Matthew K. Gold discuss the field’s evolving relationship with data.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.