Think through your data collection strategy from start to finish and consider a pilot run. This will highlight any issues with your tools or instruments and help ensure that you can process any data you produce.
Avoid unnecessary data entry later on by using built-in features in your capture devices to document as you go. Just make sure you understand and keep track of any preprocessing that might be happening behind the scenes.
Be sure to keep secure and backed-up copies of your data in their rawest form (prior to cleaning or processing). You may also want to save snapshots of your datasets at various stages of processing.
Make sure that your project complies with all applicable laws, regulations, and UT policies.
Know your source
Find a repository of data relevant to your discipline
re3data.org, a global registry of data repositories, can help you locate subject-specific data sources and determine whether they are appropriate for your research.
Cite data sources
Citing data sources is just as important as citing journal articles, books, or other resources you use to produce your research. It allows researchers to locate and repurpose data, promotes reproducibility, and allows you to give and get credit for data products, which are increasingly viewed as scholarly output in their own right.
Ask for guidance
Find your subject specialist for help locating existing data sets relevant to your research question.
Keep your data secure
Save your raw data. This allows you to start over if something goes wrong, or to re-analyze the same dataset testing different variables or protocols.
Consider saving snapshots of your data at a number of different stages (e.g., raw, cleaned up, subsetted).
Distinguish between these datasets in the file names and/or documentation.
In projects that involve code or software development where there are frequent edits or multiple contributors, consider using a more elaborate version control system. Git is a popular choice, but your research community or lab may have a preferred environment.
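The snapshot and file-naming advice above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed workflow: the function names `save_snapshot` and `sha256_of`, the `name_stage_date` naming pattern, and the `.sha256` sidecar file are all assumptions made for the example.

```python
import hashlib
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def save_snapshot(source: Path, snapshot_dir: Path, stage: str,
                  day: Optional[date] = None) -> Path:
    """Copy `source` into `snapshot_dir` under a name recording stage and date.

    The naming pattern (stem_stage_YYYYMMDD.ext) is a hypothetical convention.
    """
    day = day or date.today()
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    target = snapshot_dir / f"{source.stem}_{stage}_{day:%Y%m%d}{source.suffix}"
    if target.exists():
        # Snapshots are meant to be immutable, so never overwrite one.
        raise FileExistsError(f"refusing to overwrite existing snapshot: {target}")
    shutil.copy2(source, target)
    # Store the checksum beside the snapshot so later corruption can be detected.
    checksum_file = target.with_name(target.name + ".sha256")
    checksum_file.write_text(f"{sha256_of(target)}  {target.name}\n")
    return target
```

Calling `save_snapshot(Path("survey.csv"), Path("snapshots"), "raw")` and later `save_snapshot(Path("survey.csv"), Path("snapshots"), "cleaned")` would leave one clearly labeled file per stage, each with a checksum you can recheck later.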
Back things up
Maintaining working copies of your data requires thoughtful consideration of hardware, redundant storage locations, and a disaster plan.
Lots of Copies Keep Stuff Safe (LOCKSS) is a helpful motto to remember.
The more copies of your data, the better, as long as they’re not all in the same place.
Test your system frequently to make sure it’s working.
Use the 3-2-1 backup rule as a rule of thumb: 3 copies, on 2 different types of storage media, with 1 copy off-site.
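One way to make the 3-2-1 rule concrete is to write down your backup plan and check it against the three conditions. The sketch below assumes a hypothetical `BackupCopy` record and `check_3_2_1` function; the labels and media names are made up for illustration and do not refer to any particular service.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical record for this example)."""
    label: str      # a human-readable name for the copy
    media: str      # type of storage medium, e.g. "internal disk" or "cloud"
    offsite: bool   # True if the copy lives away from the primary location


def check_3_2_1(copies: List[BackupCopy]) -> List[str]:
    """Return the 3-2-1 conditions the plan fails (empty list means it passes)."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies; need at least 3")
    media_types = {copy.media for copy in copies}
    if len(media_types) < 2:
        problems.append("all copies are on one type of storage media; need at least 2")
    if not any(copy.offsite for copy in copies):
        problems.append("no off-site copy")
    return problems


plan = [
    BackupCopy("working copy", media="internal disk", offsite=False),
    BackupCopy("lab external drive", media="external disk", offsite=False),
    BackupCopy("institutional storage", media="cloud", offsite=True),
]
print(check_3_2_1(plan) or "plan satisfies 3-2-1")
```

A check like this only validates the plan on paper. Testing the system still means restoring files from each copy and confirming they match the originals.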
Document your steps
This can mean taking good notes, saving log files, or capturing your every step in an electronic lab book. Be sure to keep a copy together with any data or code you produce so that you can follow your trail later on.
Scan paper notebooks. Include any preprocessing or data-cleaning steps in your documentation to ensure reproducibility.
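If your processing happens in scripts, a simple log file kept alongside the data can capture each step as it runs. The following is a minimal sketch, assuming a hypothetical `log_step` helper and a JSON Lines log file; the step names and details shown are invented for the example.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_step(log_path: Path, step: str, **details) -> dict:
    """Append one timestamped processing step to a JSON Lines log file."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "step": step,
        "details": details,
    }
    # Append mode keeps earlier entries, so the log preserves the full trail.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry


def read_log(log_path: Path) -> list:
    """Read every entry back, oldest first."""
    with log_path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

For example, `log_step(Path("processing_log.jsonl"), "drop incomplete rows", rows_removed=12)` records what was done and when, and `read_log` lets you retrace the steps later.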
REDCap is a secure web application for building and managing online surveys and databases. While REDCap can be used to collect virtually any type of data, including in environments that must comply with 21 CFR Part 11, FISMA, and HIPAA, it is specifically geared to support online or offline data capture for research studies and operations.