Research data documentation

It is essential to document research data so that you and others can fully understand and reuse the data, now and in the future.

Detailed and clear documentation is key to making research data discoverable, understandable, citable and reusable. Documentation should contain:

  • information on the folder structure
  • files present and how they relate to each other
  • purpose of the files
  • file formats present
  • purpose of the research
  • explanation of all abbreviations used in file and folder names and within files
  • explanation of all columns used (if any)
  • explanation of data used (including units of measurement)
  • software requirements
  • steps undertaken in processing data
  • steps undertaken in analysing data
  • data collection steps
  • data creator(s) and manager(s) + affiliation(s)
  • data provenance

The most commonly required form of research data documentation is a readme file, which is supplemented with metadata (see below).
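Some of the items listed above, such as the files present and the file formats, can be drafted automatically and then completed by hand. The following is a minimal sketch in Python, not a WUR tool: the function names `inventory` and `readme_section` and the output layout are assumptions made for illustration.

```python
from collections import Counter
from pathlib import Path


def inventory(root):
    """Return (relative file paths, number of files per extension) under root."""
    root = Path(root)
    files = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
    # Extensions are lower-cased so that .CSV and .csv are counted together.
    formats = Counter(Path(f).suffix.lower() or "(no extension)" for f in files)
    return files, formats


def readme_section(root):
    """Format the inventory as plain text that can be pasted into a readme file."""
    files, formats = inventory(root)
    lines = ["FILES PRESENT"]
    lines += [f"  {name}" for name in files]
    lines.append("FILE FORMATS PRESENT")
    lines += [f"  {ext}: {count} file(s)" for ext, count in sorted(formats.items())]
    return "\n".join(lines)
```

Calling `print(readme_section("path/to/dataset"))` with your own folder path gives a starting point only. The purpose of each file, the abbreviations, and the processing steps still have to be written by the people who know the data.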

Metadata: machine-readable data documentation

A key form of data documentation is metadata, or in other words “data about data”. Metadata are characteristics that describe the data and facilitate cataloguing and discovery. When you deposit data in a trusted data repository, the repository generates machine-readable metadata according to fixed terms, standards and vocabularies. This makes it easier to search for and find documents written by, for example, a certain author. Metadata include, among others, the title of the data, temporal and spatial coverage, creator(s) and contributor(s) with affiliation(s), terms of use and access conditions. For commonly used terms, see the DataCite metadata schema or Dublin Core. More background information about metadata can be found here.
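To illustrate what "machine-readable" means in practice, the sketch below builds a small metadata record in Python and searches it by creator. The field names are loosely modelled on DataCite properties and have not been validated against the schema. All values, and the helper `find_by_creator`, are hypothetical.

```python
import json

# Hypothetical record: field names loosely follow DataCite properties,
# values are placeholders and not taken from any real dataset.
metadata = {
    "titles": [{"title": "Example soil moisture measurements"}],
    "creators": [{"name": "Doe, Jane", "affiliation": ["Example University"]}],
    "publicationYear": 2024,
    "dates": [{"date": "2023-04-01/2023-09-30", "dateType": "Collected"}],
    "geoLocations": [{"geoLocationPlace": "Example field site"}],
    "rightsList": [{"rights": "CC BY 4.0"}],
}


def find_by_creator(records, name):
    """Return the first title of every record that lists the given creator."""
    return [
        record["titles"][0]["title"]
        for record in records
        if any(creator["name"] == name for creator in record["creators"])
    ]


# Serialise the record to JSON text, the form in which it would be exchanged.
print(json.dumps(metadata, indent=2))
print(find_by_creator([metadata], "Doe, Jane"))
```

Because every record uses the same field names, a repository can run this kind of search across all deposited datasets, which is not possible with free-text documentation alone.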

WUR recommendations

As minimum data documentation, WUR strongly recommends the following:

  • A readme file: we advise using the WUR readme file template as the minimum required documentation to add to the data. Feel free to add more documentation where appropriate and required.
  • A metadata file: we advise using the Yoda metadata terms, which are based on the DataCite metadata standard, as the minimum required metadata to add to the data. If you already have data in Yoda, you can fill in these terms in yoda.wur.nl. If you do not have data in Yoda and do not want to use Yoda, you can use the Yoda metadata editor and download the metadata as a JSON file. JSON files can be opened in programming environments (e.g. R, Python), internet browsers (especially Firefox) and basic text editors (e.g. Notepad, Notepad++). Note that using the Yoda metadata terms is not necessary when you already implement discipline-specific metadata standards, use a data repository that implements sufficient metadata (e.g. 4TU.ResearchData, the DANS Data Stations), or use a discipline-specific data repository that implements its own metadata standard (e.g. NCBI).
  • A codebook: we advise using the WUR codebook template to explain variables, abbreviations, etc. This template is in .csv format, which makes it easy to import into software such as R or Python.
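As a minimal sketch of how a JSON metadata file and a .csv codebook can be read in Python, the example below uses only the standard library. The file names, the metadata key `Title`, and the codebook columns `variable`, `description` and `unit` are assumptions for illustration. Check the actual WUR templates and the Yoda export for the real field names.

```python
import csv
import json
import tempfile
from pathlib import Path


def load_metadata(path):
    """Read a JSON metadata file into a Python dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def load_codebook(path):
    """Read a codebook .csv into a dictionary keyed by variable name."""
    with open(path, newline="", encoding="utf-8") as handle:
        # Assumes a header row containing a 'variable' column (hypothetical).
        return {row["variable"]: row for row in csv.DictReader(handle)}


# Demo with throwaway files; replace these paths with your own files.
workdir = Path(tempfile.mkdtemp())
metadata_path = workdir / "metadata.json"
codebook_path = workdir / "codebook.csv"
metadata_path.write_text(json.dumps({"Title": "Example dataset"}), encoding="utf-8")
codebook_path.write_text(
    "variable,description,unit\ntemp_c,Air temperature,degrees Celsius\n",
    encoding="utf-8",
)

metadata = load_metadata(metadata_path)
codebook = load_codebook(codebook_path)
print(metadata["Title"])
print(codebook["temp_c"]["unit"])
```

Keying the codebook by variable name makes it straightforward to look up the description or unit of a column while analysing the data.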

You can find the templates for the readme file and codebook at DOI: 10.5281/zenodo.7701727. A filled-in example of a readme file, codebook and metadata file is available there as well.

ELabJournal

At WUR, you can use the electronic lab notebook ELabJournal, a platform for working on and organising research data, samples and accompanying documentation (e.g. protocols, procedures, manuals). ELabJournal also lets you collaborate with colleagues within a department and across departments at WUR. Interested in using ELabJournal, or want to know who at WUR is already using it? Have a look at this page (WUR intranet, login required). Already working with ELabJournal but need (more) guidance? Have a look at the manual.

Support

Questions? Don't hesitate to contact data@wur.nl.