When and why to publish your dataset

Data publishing is the public disclosure of the research data you have collected. It makes your data traceable by others for verification purposes and reusable beyond the purpose for which they were originally collected.

The Code of Conduct for Academic Practice requires that you retain your data for ten years after you have published your research paper and make them available upon request for verification purposes. Most funders and publishers now also encourage, and sometimes require, that you publish the research data on which your research papers are based. Publishing means that your data are made publicly available in a format that facilitates reuse.

In general, what options do you have to publish research data?

  • On your own website/server
  • As supplementary material to your research paper
    Smaller datasets and certain data types may be uploaded as Supporting Information files with the manuscript to your publisher's website. Supplementary data often do not include all the data on which your research paper is based. Check that you are not transferring copyright to the publisher when you upload your data files.
  • In a data repository
    Datasets and related metadata can be deposited in a public data repository (see 'Data repositories'). Data Management Support can publish your datasets for you at DANS EASY or 4TU.Centre for Research Data.
  • As a description of your dataset in a data journal
    Larger datasets that may be used for purposes other than their original one may be suitable for a data paper. A data paper is a peer-reviewed article describing openly accessible datasets for future reuse (see this list of open data journals for examples).
    At Wageningen University & Research, Wageningen Environmental Research and Wageningen University & Research - Library are working on a new data journal for the agricultural sciences: ODJAR, the 'Open Data Journal for Agricultural Research'.

Advantages of publishing your data in a data repository

Publishing your data in a data repository has several advantages:

  • Your data are kept for the long term in accessible data formats (see 'File formats').
  • A data license is applied, acknowledging data rights (see 'Data Licenses').
  • A data repository assigns a persistent identifier to your dataset, ensuring your data can be properly cited and linked to your publications (see 'How to cite datasets and link to publications'). Please note that when you upload data to your own server or as supplementary material to your research paper, your data will not be easily citable as an independent object.
  • Your data are promoted to other users.

Sometimes, the nature of your dataset doesn't allow Open Access data publishing. The data may be confidential, or there may be privacy issues or funder constraints. If publishing your data Open Access would be unethical or unlawful, you still have to make your data available on request for verification purposes. You could also consider publishing your data in anonymised form under restricted access, and making the dataset findable by publishing a description of it and of how it may be accessed.

Most data repositories (see 'Data repositories') allow you to place a temporary embargo on your data. During the embargo period, the description of the dataset is published, but the data themselves are not available for reuse by others.
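
The embargo rule can be illustrated with a minimal Python sketch. The DatasetRecord class, its field names and the example values are hypothetical and do not reflect the schema of DANS EASY, 4TU.Centre for Research Data or any other repository. The sketch also assumes that the data become available on the embargo end date itself; actual repositories may handle this differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical dataset record; not the schema of any real repository."""
    title: str
    description: str
    embargo_until: Optional[date] = None  # None means no embargo was set

    def description_visible(self) -> bool:
        # The description is published even while the data are under embargo.
        return True

    def data_available(self, today: date) -> bool:
        # Assumption: the data are released on the embargo end date itself.
        return self.embargo_until is None or today >= self.embargo_until


# Illustrative values only.
record = DatasetRecord(
    title="Example dataset",
    description="Illustrative description of the dataset.",
    embargo_until=date(2026, 1, 1),
)

print(record.description_visible())               # description is always findable
print(record.data_available(date(2025, 6, 1)))    # during the embargo
print(record.data_available(date(2026, 1, 1)))    # on the embargo end date
```

In this sketch the description stays findable throughout, which mirrors the point that an embargoed dataset can still be discovered and cited before the data themselves are released.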