Data archiving after research

Archiving research data facilitates the re-use and verification of research results. A dataset deposited in a data repository is not only protected against corruption and loss, but also becomes findable and citable via a DOI.

WUR policy stipulates that all datasets underlying a publication must be archived for at least 10 years. The following WUR-supported archiving solutions may be used:

Open / Internal data*
  • National Data Archives: DANS-EASY and 4TU Centre for Research Data **
  • Discipline-specific archive (WUR-approved) ***
  • Journal-associated archive (WUR-approved) ***

Confidential / Secret data*
  • A data archive for secret data is in development. Until then, the W:-drive can be used.

* Definitions of the confidentiality classes can be found here.
** WUR Library is the front office for DANS-EASY and 4TU, and can help with archiving datasets in these repositories.
*** For more information on which repository you can use, read the section ‘What data repository can I use?’ at the bottom of this page.

What is a dataset?

A dataset is a set of files containing research data, together with documentation sufficient to make the data re-usable.

What should a dataset include?

A dataset should comprise all files and documentation necessary to verify and/or reproduce the research. This includes:

  • All data files (raw data, processed data, code, etc.) used in the data collection, processing and analysis. Files that are irrelevant to verification and/or reproduction, or that are too large for archiving, can be excluded.
  • Sufficient documentation to understand the production, provenance, processing and interpretation of the data files, and the relationships between them.
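
One way to document which files a dataset contains is to generate a manifest listing each file with its size and a checksum. The following is a minimal sketch, not a WUR-prescribed procedure: the function names `build_manifest` and `format_manifest`, the choice of SHA-256, and the tab-separated layout are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(dataset_dir: str) -> list[tuple[str, int, str]]:
    """List every file under dataset_dir as (relative path, size in bytes, checksum)."""
    root = Path(dataset_dir)
    rows = []
    for path in root.rglob("*"):
        if path.is_file():
            rel_path = path.relative_to(root).as_posix()
            rows.append((rel_path, path.stat().st_size, sha256_of(path)))
    # Sort by relative path so the manifest is stable across runs and platforms.
    return sorted(rows)


def format_manifest(rows: list[tuple[str, int, str]]) -> str:
    """Render the manifest as tab-separated text with a header line."""
    lines = ["path\tsize_bytes\tsha256"]
    lines += [f"{rel_path}\t{size}\t{checksum}" for rel_path, size, checksum in rows]
    return "\n".join(lines) + "\n"
```

For example, `print(format_manifest(build_manifest("my_dataset")))` would print the manifest for a hypothetical folder named `my_dataset`. The output could be saved alongside the documentation files of the dataset.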

What data repository can I use?

In addition to the WUR-supported repositories DANS-EASY and 4TU Centre for Research Data, you can deposit your dataset in a repository that is used within your discipline and/or recommended by a journal. Data Management Support assesses data repositories to ensure that they offer secure, durable storage and make datasets easily findable. The assessment is based on a set of criteria, such as whether the repository provides metadata, whether it scans data files for corruption, and whether it assigns persistent identifiers to datasets (see this paper for more details on the criteria). An approved repository is recommended for data archiving as stipulated in the WUR data policy.
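
The criterion that a repository "scans data files for corruption" is commonly called fixity checking: a checksum recorded at deposit time is compared with one computed from the stored file. The sketch below illustrates the idea only. It does not describe how any particular repository implements such checks, and the function name `find_corrupted` and the use of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def find_corrupted(root: str, expected: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose SHA-256 no longer matches.

    `expected` maps a path relative to `root` to the checksum recorded earlier.
    """
    problems = []
    for rel_path, recorded in sorted(expected.items()):
        path = Path(root) / rel_path
        if not path.is_file():
            # A missing file counts as a fixity failure.
            problems.append(rel_path)
            continue
        actual = hashlib.sha256(path.read_bytes()).hexdigest()
        if actual != recorded:
            problems.append(rel_path)
    return problems
```

An empty result means every recorded file is present and unchanged. Note that this sketch reads each file fully into memory, which is acceptable for small files but not for very large ones.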

The table below shows the repositories approved so far. This list is updated regularly. Is there a repository you would like to have assessed? Please contact Data Management Support.

Repository | Discipline | Associated journal(s) or publisher(s)
DANS-EASY | Multidisciplinary | None known
4TU Centre for Research Data | Technical sciences | None known
Dryad | Multidisciplinary (focus on life sciences) | Hundreds of journals offer integrated data submission with Dryad: browse the list.
Figshare | Multidisciplinary | Many publishers have a partnership with Figshare, including Springer Nature, PLOS, and Wiley.
Harvard Dataverse | Multidisciplinary (focus on social sciences) | Various publishers recommend Harvard Dataverse, and some journals have set up their own dataverse.
Zenodo | Multidisciplinary | No partnerships known, but recommended by many publishers. Zenodo is also integrated with GitHub to archive Git repositories.
Mendeley Data | Multidisciplinary | Integrated into the workflow of Elsevier journals.
Pangaea | Earth & Environmental Science | No partnerships or integrations known, but recommended as the standard repository in the discipline by various publishers.
ISRIC WDC-Soils | Soil sciences, Geosciences | None known
GBIF/NLBIF | Biology, Biodiversity | None known, but recommended by publishers including PLOS and Springer Nature.
NCBI: GenBank | Biology, Genetics | No partnerships known, but the use of GenBank is encouraged by many publishers, for example PLOS, Springer Nature and Elsevier.
EMBL-EBI: ArrayExpress, ENA, BioStudies, PRIDE, BioModels, IntAct, MetaboLights | Biology, Genetics, Bioinformatics | EMBL-EBI repositories are often recommended by publishers, for example PLOS, Springer Nature and Elsevier.

More information: Publishing your dataset in a repository