Data archiving after research

Archiving research data facilitates the reuse and verification of research results. A data set deposited in a data repository is not only protected against corruption and loss, but also becomes findable and citable via a DOI (or other persistent identifier).

The WUR research data policy stipulates that all data sets underlying a publication must be archived for at least 10 years. The following WUR-supported archiving solutions may be used:

Open / Internal data*
  • National Data Archives: DANS-EASY and 4TU Centre for Research Data **
  • Discipline-specific archive (WUR-approved) ***
  • Journal-associated archive (WUR-approved) ***

Confidential / Secret data*
  • A data archive for secret data is in development. Until then, the W:-drive can be used.

*  Definitions of the confidentiality classes can be found here.
** WUR Library is the front office for DANS-EASY and 4TU, and can help with archiving data sets in these repositories.
*** For more information on which repository you can use, read the section ‘What data repository can I use?’ at the bottom of this page.

What is a data set?
A data set is a collection of data files that originate from the same project and/or cover the same thematic subject.

What should a data set include?
A data set should comprise all files and documentation necessary to verify and/or reproduce the research. This includes:

  • All data files (raw data, processed data, code etc.) used in the data collection, processing and analysis. Files that are irrelevant to verification and/or reproduction or that are too large for archiving can be excluded.
  • Sufficient documentation to understand the production, provenance, processing and interpretation of the data files, and the relationships between them.
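
One possible way to document which files belong to a data set is a manifest that lists each file with its size and a checksum. The sketch below is only an illustration and not a WUR requirement: the function names, the CSV layout and the choice of SHA-256 are assumptions made for this example.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(dataset_dir: Path) -> list[dict]:
    """List every file under dataset_dir with its relative path, size and checksum."""
    rows = []
    for path in sorted(dataset_dir.rglob("*")):
        if path.is_file():
            rows.append({
                "path": path.relative_to(dataset_dir).as_posix(),
                "bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return rows


def write_manifest(rows: list[dict], target: Path) -> None:
    """Write the manifest rows to a CSV file (hypothetical layout)."""
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["path", "bytes", "sha256"])
        writer.writeheader()
        writer.writerows(rows)
```

For example, `write_manifest(build_manifest(Path("my_dataset")), Path("MANIFEST.csv"))` would write the manifest next to a hypothetical `my_dataset` folder. Writing it outside the data set folder keeps the manifest from listing itself on a later run.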

What data repository can I use?
In addition to the WUR-supported repositories DANS-EASY, 4TU Centre for Research Data and Zenodo, you can deposit your data set in a repository that is used within your discipline and/or recommended by a journal. The Data Desk assesses data repositories to ensure that they offer secure, durable storage and that they make data sets easily findable. This assessment focuses on a set of criteria, such as whether the repository provides metadata, whether it scans data files for corruption, and whether it assigns persistent identifiers to data sets (see this paper for more details on the criteria). An approved repository is recommended for the data archiving stipulated in the WUR data policy.
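
To make the idea of a criteria-based assessment concrete, the sketch below records the three example criteria named above as a simple checklist. It is a hypothetical illustration, not the Data Desk's actual procedure: the class, its field names and the repository names are assumptions, and the full set of criteria is described in the paper referred to above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepositoryAssessment:
    """Checklist covering only the three example criteria named in the text."""
    name: str
    provides_metadata: bool
    scans_for_corruption: bool
    assigns_persistent_identifiers: bool

    def unmet_criteria(self) -> list[str]:
        """Return the labels of the listed criteria that are not satisfied."""
        checks = {
            "provides metadata": self.provides_metadata,
            "scans data files for corruption": self.scans_for_corruption,
            "assigns persistent identifiers": self.assigns_persistent_identifiers,
        }
        return [label for label, ok in checks.items() if not ok]

    def meets_listed_criteria(self) -> bool:
        """True when none of the three listed criteria is unmet."""
        return not self.unmet_criteria()


# Hypothetical repositories, used only to show how the checklist behaves.
example = RepositoryAssessment("ExampleRepo", True, True, True)
other = RepositoryAssessment("OtherRepo", True, False, True)
print(example.meets_listed_criteria())
print(other.unmet_criteria())
```

Returning the list of unmet criteria, rather than a bare yes or no, shows what a repository would need to address before it could be reconsidered.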

The table below shows the repositories approved so far. This list will be regularly updated. Do you have a repository you would like to have assessed? Please contact the Data Desk.

Repository | Discipline | Associated journal(s) or publisher(s)
DANS-EASY | Multidisciplinary | None known
4TU Centre for Research Data | Technical sciences | 4TU.ResearchData is integrated with GitHub to archive Git repositories.
Dryad | Multidisciplinary (focus on life sciences) | Hundreds of journals offer integrated data submission with Dryad: browse the list.
Figshare | Multidisciplinary | Many publishers have a partnership with Figshare, including Springer Nature, PLOS, and Wiley.
Harvard Dataverse | Multidisciplinary (focus on social sciences) | Various publishers recommend Harvard Dataverse, and some journals have set up their own dataverse.
Zenodo | Multidisciplinary | No partnerships known, but recommended by many publishers. Zenodo is also integrated with GitHub to archive Git repositories.
Mendeley Data | Multidisciplinary | Integrated into the workflow of Elsevier journals
Pangaea | Earth & Environmental Science | No partnerships or integrations known, but recommended as the standard repository in the discipline by various publishers.
ISRIC WDC-Soils | Soil sciences, Geosciences | None known
GBIF/NLBIF | Biology, Biodiversity | None known, but recommended by publishers including PLOS and Springer Nature.
NCBI: GenBank | Biology, Genetics | No partnerships known, but the use of GenBank is encouraged by many publishers. Examples are PLOS, Springer Nature and Elsevier.
EMBL-EBI: ArrayExpress, ENA, BioStudies, PRIDE, BioModels, IntAct, MetaboLights | Biology, Genetics, Bioinformatics | EMBL-EBI repositories are often recommended by publishers. Examples are PLOS, Springer Nature and Elsevier.
DataverseNL* | Multidisciplinary | None known

* WUR had no access to DataverseNL

More information: Publishing your dataset in a repository