Considerations about managing and sharing potentially sensitive research data often come up specifically in the context of peer review and publication of other scholarly objects like journal articles or books. This page provides an overview of some considerations that may arise when interfacing with other entities such as academic publishers and external data repositories.
Preprints are not typically peer-reviewed, and although many preprint servers have some level of content moderation, this is typically only to ensure baseline compliance with things such as minimum metadata (e.g., author contact information), no ad hominem attacks, and appropriate content (e.g., in alignment with a topical focus, exclusion of course materials). Some preprint servers (e.g., medRxiv) will also screen for sensitive human data, but this is still not a peer review process, and researchers should neither rely on, nor treat, the preprint screening process as a complete assessment of potential sensitivity of any publicly disclosed information. In particular, preprint servers that accept content from any discipline (or platforms that host preprints alongside other content, such as Figshare) will lack the necessary expertise (and perhaps staffing) to conduct even basic screening checks for sensitive content.
Researchers should be prepared to answer some screening questions related to responsible conduct of research (e.g., documentation supporting approval for research on humans or animals) during the manuscript submission process; this will be particularly common for discipline-specific journals. These typically do not pertain to how associated data will be disseminated.
Peer review is not designed to screen for ethical and legal issues related to how sensitive data are being managed and shared. Although reviewers certainly can flag various ethical concerns related to how a study was carried out, this is not the primary goal of peer review in many disciplines, and reviewers are not often directed to focus on areas such as how the authors intend to share (or not share) sensitive research data. Editors vary widely in their level of involvement and in their familiarity with a topic and with the manuscript, and they also cannot be relied upon as a dependable check for ethical data sharing. Researchers should neither assume nor claim that, because a manuscript has been accepted following peer review, all aspects of its publication (data sharing in this case) are ethically and legally sound.
Many journals have policies requiring data sharing, although they often have carve-outs for certain situations, sensitivity being one of them. If you anticipate not being able to share your data at all or only through restricted access means, you should be sure to communicate this early in the process with the journal to ensure that your intended management is acceptable. For example, if you submit a manuscript with the associated data and do not give an indication that the data cannot be made available, you may run into problems later if you bring up the fact that you actually cannot share the data. Reputable journals should be able to accommodate data sharing restrictions provided that the justification is made clear and that means of accessing the data are detailed.
Many data repositories, especially those that accept content from across disciplines (generalists like Figshare, OSF, Zenodo), will conduct little to no content moderation or screening upon initial deposit. Usually they only require depositors to attest that they have not submitted any sensitive data and that any content subsequently identified as sensitive can be removed at the platform's discretion. Other repositories may perform some level of curation similar to that of many preprint servers, in which they are mainly focused on ensuring alignment with the general scope and discipline-agnostic standards (e.g., minimum metadata like descriptive titles), but there are few to no checks on discipline-specific attributes like sensitive data.
Finally, some data repositories (e.g., Qualitative Data Repository, many NIH-managed repositories, and ICPSR) are specifically designed for sensitive data and will have both the infrastructure and staff necessary to perform detailed checks on both the ingest of data and any future sharing of the data. These are the only types of repositories that you should consider as having performed a comprehensive review of data for sensitivity.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 Generic License.