Managing the mountains of digital data that accumulate over the course of a research project can seem daunting, especially if your work is collaborative or stretches over several years. Adopting a few good habits early on can save you a great deal of time, money, and frustration spent searching for things and recovering lost files. This workshop provides a general introduction to core data management concepts, practical tips for things like backups and file formats, and a wealth of information about tools and resources available to UT faculty, staff, and students.
Choosing file formats carefully helps avoid obsolescence. Use formats that are:
Non-proprietary, open, documented standards (e.g., .tif, .txt, .csv, .pdf)
Encoded with standard characters (e.g., ASCII, UTF-8)
Used commonly in your research community
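As a rough illustration of the format guidance above, the following Python sketch saves tabular data as a plain-text .csv file encoded in UTF-8. The function name `save_as_csv` and the sample column names are hypothetical, chosen only for this example; adapt them to your own data.

```python
import csv
from pathlib import Path


def save_as_csv(rows, header, path):
    """Write rows to a plain-text CSV file encoded as UTF-8.

    `rows`, `header`, and `path` are illustrative parameters,
    not part of any prescribed workflow.
    """
    path = Path(path)
    # newline="" lets the csv module manage line endings itself
    with path.open("w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)   # column names go on the first line
        writer.writerows(rows)    # one line per record
    return path
```

A call such as `save_as_csv([["A1", 3.5]], ["sample_id", "ph"], "readings.csv")` would produce a file that any text editor or spreadsheet program can open.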
Adopt a naming convention and use it throughout a project (or throughout your career).
Describe the contents of the file without being overly long. Avoid generic names (like draft.doc or final2.xls), which are hard to decipher and easy to overwrite.
Include dates. Don’t rely on system dates, which can be misleading. Recommended formats are YYYYMMDD or YYYY-MM-DD, both of which sort chronologically.
Reserve 3-letter file extensions for application-specific codes (e.g., .jpg, .mov, .tif).
Don't use special characters like / \ : * ? " < > [ ] & $. These have special meanings in software and operating systems and can cause trouble.
Try to avoid spaces too. These are problematic for some operating systems. Use underscores (file_name), dashes (file-name), or camel case (FileName) instead.
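The naming tips above can be sketched as a small Python helper. This is only one possible convention, assuming a date prefix in YYYYMMDD form and underscores in place of spaces; the function name `make_filename` and its parameters are hypothetical.

```python
import re
from datetime import date

# The special characters listed above: / \ : * ? " < > [ ] & $
_UNSAFE = re.compile(r'[/\\:*?"<>\[\]&$]')


def make_filename(description, extension, when=None):
    """Build a name of the form YYYYMMDD_description.ext."""
    when = when or date.today()                      # default to today's date
    cleaned = _UNSAFE.sub("", description)           # drop special characters
    cleaned = re.sub(r"\s+", "_", cleaned.strip())   # replace spaces with underscores
    return f"{when:%Y%m%d}_{cleaned}.{extension.lstrip('.')}"
```

For example, `make_filename("soil pH: readings?", ".csv", date(2024, 3, 15))` returns `20240315_soil_pH_readings.csv`. Passing the date explicitly, rather than relying on the file system's timestamp, keeps the name accurate even if the file is later copied or modified.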
Tidying your data can be the biggest time sink of all. Budget time for it, and use reproducible steps so the cleanup can be repeated or checked later.
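One way to keep tidying reproducible is to script each cleanup step instead of editing files by hand. The Python sketch below is a minimal example under assumed cleanup rules (trim whitespace, normalize column names, drop empty rows); the function name `tidy_rows` and the rules themselves are illustrative, not a recommended standard.

```python
def tidy_rows(rows):
    """Return cleaned copies of the input rows; the input is left untouched.

    Each row is a dict mapping column names to string values.
    """
    tidy = []
    for row in rows:
        # Normalize column names and strip stray whitespace from values
        cleaned = {
            key.strip().lower().replace(" ", "_"): value.strip()
            for key, value in row.items()
        }
        # Skip rows in which every value is empty
        if any(cleaned.values()):
            tidy.append(cleaned)
    return tidy
```

Because the function returns new rows rather than changing the originals, the raw data stays intact and the same script can be rerun whenever new data arrives.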
This work is licensed under a Creative Commons Attribution-NonCommercial 2.0 Generic License.