University of Texas Libraries

Scan Tech Studio (STS)

This guide provides orienting information and tutorials for the Digitization and Text Recognition Hub in the PCL Scholars Lab.

Preserve Your Work

Preserve

Plan for long-term preservation from the outset of your project so that you can set aside enough time and resources to keep your data accessible long after the project is over.

What should I keep?

Be selective about what data you plan to retain, as every file carries some long-term overhead for storage and maintenance. It’s a good idea to:

  • keep anything irreproducible, such as observations specific to a particular time and place,
  • retain results that are tied to a specific publication or presentation,
  • discard intermediate tests or failed experiments at the end of a project.

How long should I keep it?

Check with your funding agency to find out if there is a specific policy that spells out a data retention period. For publicly funded research in the US, this is often a minimum of three years. It is better to aim for even longer, if possible, in case you or someone else needs the data later on. Five to ten years is a good rule of thumb.

Keep files readable

Making sure your data remain accessible for the long term is a big challenge, especially since technology changes so quickly. Choosing the right file formats can help avoid obsolescence. Use formats that are:

  • Non-proprietary, open, documented standards (e.g., .tif, .txt, .csv, .pdf)
  • Used commonly in your research community
  • Encoded with standard characters (e.g., ASCII, UTF-8)

The Library of Congress maintains a guide to file formats that are likely to have long-term support: https://www.loc.gov/preservation/resources/rfs/index.html
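
The format criteria above can be checked mechanically before you deposit a project folder. The following is a minimal sketch, not an official tool: the allow-list contains only the example extensions named in this guide (plus `.tiff`, an assumed spelling variant of `.tif`), and the function names `is_utf8` and `check_formats` are hypothetical. Adjust the list to the formats your research community actually uses.

```python
from pathlib import Path

# Assumption: an illustrative allow-list built from the examples in this guide.
OPEN_EXTENSIONS = {".tif", ".tiff", ".txt", ".csv", ".pdf"}
# Plain-text formats whose character encoding is worth checking.
TEXT_EXTENSIONS = {".txt", ".csv"}


def is_utf8(path: Path) -> bool:
    """Return True if the file's bytes decode as UTF-8 (ASCII is a subset)."""
    try:
        path.read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def check_formats(folder: Path) -> list[str]:
    """Return one warning per file that may be hard to read in the future."""
    warnings = []
    for path in sorted(folder.rglob("*")):
        if not path.is_file():
            continue
        ext = path.suffix.lower()
        if ext not in OPEN_EXTENSIONS:
            warnings.append(
                f"{path.name}: format {ext or '(none)'} is not on the open-format list"
            )
        elif ext in TEXT_EXTENSIONS and not is_utf8(path):
            warnings.append(f"{path.name}: text is not valid UTF-8")
    return warnings
```

Calling `check_formats(Path("my_project"))` returns a list of warnings, and an empty list means every file passed both checks. A warning is a prompt to review the file, since a format missing from this short list may still be a sound choice in your field.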

Where should I store my data (in which repository)?

Tools and Resources

  • The Texas Data Repository (TDR) is hosted by the Texas Digital Library and is based on Harvard University’s Dataverse platform. It is a long-term solution for the preservation and dissemination of UT’s research data. Affiliates of UT-Austin may deposit and publish datasets of up to 4GB each in TDR free of charge. Published datasets in TDR are assigned Digital Object Identifiers (DOIs) and are publicly accessible and free to download.
  • Archive of the Indigenous Languages of Latin America (AILLA) - AILLA's primary mission is to preserve materials in and about the indigenous languages of Latin America.
  • Inter-university Consortium for Political and Social Research (ICPSR) - ICPSR maintains a data archive of more than 250,000 files of research in the social and behavioral sciences. It hosts data collections in education, aging, criminal justice, substance abuse, terrorism, and other fields. Free with UT institutional membership.
  • Qualitative Data Repository (QDR) - QDR curates, stores, preserves, publishes, and enables the download of digital data generated through qualitative and multi-method research in the social sciences.
  • Re3data.org is a global registry of data repositories organized by academic discipline. A rating system and faceted browsing can help you find the best place to deposit your data.
  • Scientific Data Recommended Repositories - A list of disciplinary and open repositories that meet the data access, preservation and stability requirements of Nature's Scientific Data journal.
  • NIH Data Repositories - National Institutes of Health-supported data repositories that make data accessible for reuse. Most accept submissions of appropriate data from NIH-funded investigators (and others), but some restrict data submission to only those researchers involved in a specific research network.
  • Open Access Directory - List of data repositories worldwide
  • Texas ScholarWorks (TSW) is UT’s web-accessible DSpace repository, managed by UT Libraries. A free and secure place for archiving and sharing faculty research output, it provides persistent URLs, searchable metadata, full-text indexing and long-term preservation.
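
Because TDR accepts datasets of up to 4GB each, it can help to total the size of a dataset folder before you start a deposit. This is a rough sketch under stated assumptions: "4GB" is read here as 4 × 1000³ bytes, which may not match how TDR measures the limit, and the names `TDR_LIMIT_BYTES`, `dataset_size`, and `fits_in_tdr` are hypothetical. Confirm the exact limit with TDR before relying on the result.

```python
from pathlib import Path

# Assumption: "4GB" interpreted as 4 * 1000**3 bytes; verify the real limit with TDR.
TDR_LIMIT_BYTES = 4 * 1000**3


def dataset_size(folder: Path) -> int:
    """Return the total size in bytes of all files under the folder."""
    return sum(p.stat().st_size for p in folder.rglob("*") if p.is_file())


def fits_in_tdr(folder: Path, limit: int = TDR_LIMIT_BYTES) -> bool:
    """Return True if the folder's total size does not exceed the limit."""
    return dataset_size(folder) <= limit
```

If `fits_in_tdr(Path("my_dataset"))` returns False, consider splitting the material into several datasets or choosing one of the other repositories listed above.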

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.