University of Texas Libraries

Research Data Services


Generalist Repositories

Overview

This page provides a brief overview of some of the major generalist repositories and highlights certain features (or lack thereof) that may be useful in deciding which repository to use: Dryad, Figshare, Mendeley Data, Open Science Framework, and Zenodo.

Some other generalist repositories are listed in brief at the bottom of this guide. Researchers may also be interested in the comparison chart from the Generalist Repository Ecosystem Initiative (GREI), which does not include all of these repositories but goes into much more depth, or in their flowchart. Harvard has a trimmed-down version of the comparison chart that covers more of the generalist repositories. Listings do not represent endorsements.

Generalist Repository Ecosystem Initiative

GREI is funded by the National Institutes of Health (NIH) and is a multi-year collaboration/competition with a number of the major generalist repositories to develop a set of standards and norms among generalist repositories and to expand knowledge and best practices in data sharing. GREI maintains a range of materials and communications on various platforms, including documents and resources for researchers across disciplines and institutions (Zenodo community); blogs (Medium); and roadmap (GitHub). NIH does not endorse any of these repositories and recommends that researchers follow general guidance on selecting a data repository.

 

Have questions about generalist repositories? 

If you have questions about understanding a certain repository's services or comparing several, please reach out to Bryan Gee (who has worked for one and published data in several others).

Dryad

General attributes

  • Launched: 2008
  • Maximum file size: ~50 GB
  • Maximum deposit size: 2,000 GB (2 TB)
  • Cost: $150 per deposit up to 10 GB, additional overage charges above 10 GB
  • Organizational structure: Non-profit
  • Based in: United States
  • Website: https://datadryad.org/

Pros

  • Relatively high file size and deposit size limits.
  • Curated submission ensures higher quality metadata and greater accessibility of datasets than other generalists without curation.

Cons

  • Does not host non-data objects (e.g., code, supplemental figures) unless they are associated with data, and may re-route them to be hosted separately on Zenodo.
  • Curated submission can take several days for staff to review dataset and return it for edits or publish it; DOI will not be active until approved by staff.
  • Will cost UT researchers unless they are publishing a dataset alongside a manuscript in a journal that sponsors the cost of data publication; costs of datasets over 10 GB can be high.
  • Only publishes under the CC0 waiver (public domain, waives copyright), which is appropriate for most data but not in all instances, and not for other research outputs.
  • No options for restricted access.
  • Fewer integrations with third-party apps compared to other generalists (e.g., no GitHub integration).

Figshare

General attributes

  • Launched: 2011
  • Maximum file size: ~5,000 GB (5 TB)
  • Maximum deposit size: 10,000 GB (10 TB)
  • Cost: Free up to 20 GB, tiered pricing starting at $875 for 20-250 GB through Figshare+
  • Organizational structure: Commercial, funded by Digital Science, a for-profit platform under the same ownership as Springer Nature
  • Based in: United Kingdom
  • Website: https://figshare.com/

Pros

  • Highest individual file and total deposit size limit among generalist repositories.
  • Web interface provides file previews for many common tabular, image, and text formats.
  • Supports minting a new DOI for each new version of a deposit.
  • Accepts practically any research output (datasets, software, conference slides/posters, appendices).

Cons

  • Organizational structure and operations are relatively opaque (e.g., closed-source software).
  • Funded by for-profit entity, even if Figshare does not charge for publishing small deposits or for accessing deposits.
  • Relatively limited metadata (e.g., does not allow authors to enter affiliations).

Mendeley Data

General attributes

  • Launched: 2015
  • Maximum file size: 10 GB
  • Maximum deposit size: 10 GB
  • Cost: Free
  • Organizational structure: Commercial, Mendeley is owned by Elsevier
  • Based in: United Kingdom
  • Website: https://data.mendeley.com/

Pros

  • Integrated with other Elsevier products like the Mendeley reference manager and ScienceDirect.

Cons

  • Lowest individual deposit size limit among generalists that are free to publish in.
  • Intended primarily for research data and not other outputs.
  • Curated submission can take several days for staff to review dataset and return it for edits or publish it; DOI will not be active until approved by staff.
  • Owned and operated by Elsevier, a for-profit information analytics company that nets one of the highest profit rates among academic publishers.

Open Science Framework

General attributes

  • Launched: 2013
  • Maximum file size: 5 GB
  • Maximum deposit size: 50 GB
  • Cost: Free
  • Organizational structure: Non-profit, managed by the Center for Open Science
  • Based in: United States
  • Website: https://osf.io/

Pros

  • Integrated with other COS products like preregistration and preprints.
  • Web interface supports more functionality than many other repositories with respect to supporting information (e.g., Wiki, detailed records of modifications to metadata or data).
  • Modular structure is conducive to broad-scale collaboration and organization of a diverse set of materials.

Cons

  • Relatively low individual file size limit.
  • Does not support or require a similar level of metadata compared to other generalists.
  • Expanded functionality can be confusing or lead to suboptimal data management/publishing for researchers who have not used the platform before (the interface and process differ more from those of other repositories).
  • Files connected to OSF deposit through third-party apps (e.g., GitHub, Dropbox) are not actually hosted on OSF servers (can cause problems if third-party source is disconnected or deleted, even if DOI is minted on OSF).
  • Designed more for 'projects' than for discretized 'datasets' or other types of deposits; can create challenges for metadata and licensing.

Zenodo

General attributes

  • Launched: 2013
  • Maximum file size: 50 GB
  • Maximum deposit size: 50 GB (can request additional space)
  • Cost: Free
  • Organizational structure: Non-profit, managed by CERN and funded by the EU
  • Based in: Switzerland
  • Website: https://zenodo.org/

Pros

  • Popular repository for depositing software because of GitHub integration (each release automatically creates a new version on Zenodo); mints DOIs for each version as well as a 'parent' that always resolves to latest version; supports wide range of copyright licenses, including extensive list of software licenses.
  • Suitable for wide range of scholarly outputs (data, software, preprints, white papers).
  • Highly sustainable funding model.
  • Supports uploader-mediated restricted access.
  • Relatively high individual file size limit.

Cons

  • Only accepts up to 100 files per deposit.
  • Relatively low individual deposit size limit; unclear how requests for additional storage are handled.
  • Relatively small team, can be difficult to get quick customer support due to staffing and timezone differences.

Other generalist repositories

You may have heard of some other generalist repositories or seen other researchers use them. We are not necessarily discouraging you from using these if you have experience with them, but there are some limitations or shortcomings to consider for each of them:

  • Harvard Dataverse: Harvard Dataverse is Harvard University's specific instance of Dataverse (i.e. their institutional repository), but is open to any researcher, unlike most institutional repositories. Because the Texas Data Repository is built on the same Dataverse software as Harvard's, UT researchers should opt for TDR since the functionality will be the same, and you can get faster, more personalized support from our team.
  • IEEE Dataport: Dataport is a generalist that leans towards engineering data. Its main downside is that it operates like a hybrid journal - uploaders are not charged to deposit data in the default option, but anyone wishing to access the data must have a $40/month subscription (included in IEEE society membership) or belong to an institution that is a member (UT is not). Alternatively, uploaders can make their data freely available to anyone, but they have to pay a $1,950 data publishing charge (DPC).
  • Science Data Bank (ScienceDB): ScienceDB is similar to Dryad and is based in China, with funding from a national agency. It is sometimes listed by major publishers (e.g., Scientific Data), but it is almost exclusively used by China-based researchers. UT researchers should be mindful of any concerns that could arise from publishing data on this platform due to geopolitical relations and should review the university's export control guidance.
  • Synapse: Synapse is a project-based platform (like OSF) but mainly for health data. It has a free tier, but this does not provide access to any technical assistance; paid plans start at $15,000 USD.
  • Vivli: Vivli is part of GREI even though it is specifically for clinical data (i.e. only a generalist for biomedical researchers). For researchers who are not part of an institution that is a member (UT Austin is not), the base cost to share data is $2,500 USD.

Need more info?

Bryan Gee
he/him

Example use cases

The following are some examples of common use cases / researcher needs that point to a recommended repository or set of repositories. Harvard Dataverse is not listed here because any use case for which it is a good solution should use the Texas Data Repository.

  • Free deposits: Figshare, Mendeley Data, OSF, Zenodo
  • Non-profit repositories: Dryad (standalone), OSF (Center for Open Science), Zenodo (CERN)
  • Unrestricted access to data: Dryad, Figshare, Mendeley Data, OSF, Zenodo
  • Anonymized sharing link for review: Dryad, Figshare, OSF
  • Large datasets (and you have funds): Dryad (up to 2 TB per deposit), Figshare (up to 10 TB per deposit)
  • Large datasets (and you don't have funds): Zenodo (50 GB file limit, 50 GB deposit limit)
  • Want a quality check on your data: Dryad, Figshare+, Mendeley Data (or email Bryan Gee)
  • Getting a DOI for software: Zenodo (popular GitHub integration; supports more software licenses than other platforms; mints a new DOI for each version)
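
The size and cost figures in the "General attributes" lists above can also be screened programmatically. The following is a minimal sketch, not an official tool: the `Repo` class, the `candidates` function, and the `free_up_to_gb` field are hypothetical names introduced for illustration, and the numbers are copied from this page (several are approximate and may change, so confirm them with the repository before depositing).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repo:
    name: str
    max_file_gb: float     # largest single file, per this page
    max_deposit_gb: float  # largest total deposit, per this page
    free_up_to_gb: float   # deposit size that is free (0 = always a fee)


# Values copied from the "General attributes" lists on this page.
# Dryad and Figshare file limits are listed as approximate (~).
REPOS = [
    Repo("Dryad", 50, 2000, 0),
    Repo("Figshare", 5000, 10000, 20),
    Repo("Mendeley Data", 10, 10, 10),
    Repo("Open Science Framework", 5, 50, 50),
    Repo("Zenodo", 50, 50, 50),
]


def candidates(total_gb, largest_file_gb, need_free=False):
    """Return names of repositories whose listed limits fit the deposit."""
    fits = []
    for repo in REPOS:
        if largest_file_gb > repo.max_file_gb:
            continue  # a single file exceeds the per-file limit
        if total_gb > repo.max_deposit_gb:
            continue  # the whole deposit exceeds the deposit limit
        if need_free and total_gb > repo.free_up_to_gb:
            continue  # the deposit would incur a fee
        fits.append(repo.name)
    return fits


if __name__ == "__main__":
    # A 30 GB deposit whose largest file is 20 GB, with no funds available.
    print(candidates(30, 20, need_free=True))
    # A 500 GB deposit whose largest file is 40 GB, with funds available.
    print(candidates(500, 40))
```

This screen only checks size and cost. It does not account for curation, licensing, restricted access, or the other pros and cons listed above, which should still drive the final choice.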

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.