Data Infrastructure
Compare data infrastructure with a complex transportation network. For dependable transportation, you need a decent car (hardware), quality fuel (software) and clear traffic regulations (international guidelines). And if you cross borders, both within your own region (WUR) and beyond (nationally or internationally), you need to agree with others on how to ‘park’ your data and how to deal with innovations.
The infrastructure department of the Wageningen Data Competence Center (WDCC) helps colleagues find the right hardware and software, keeps track of the most recent (international) developments and brings different fields of expertise together in communities.
Proper infrastructure is a requirement for ambitious research
A proper data infrastructure is an absolute prerequisite for ambitious research and education. It starts with state-of-the-art tools: from laptops to intelligent software. And it requires a broad perspective. After all, research nowadays crosses the borders of disciplines and countries. The infrastructure experts of the WDCC know precisely which partners (such as SURF, the national IT platform for science) and which international collaborations (such as the European Open Science Cloud) are valuable to WUR. The fundamental principle is that knowledge institutes such as universities and research institutes make as much of their research results as possible publicly available through open-source software (software that is freely available online) and open data (data that is accessible to all). The guiding principle is: as open as possible, as closed as necessary. Data from privacy-sensitive research can remain protected.
Linking data leads to an ever-increasing global source of publicly accessible data: big data. International guidelines have been established to ensure responsible data management, such as the FAIR and FACT principles. The WDCC ensures that WUR research meets these requirements and tests this using practical examples, known as use cases.
FAIR: Findable-Accessible-Interoperable-Reusable
Data and metadata (information on the research circumstances, such as what equipment was used) must be easy to find for other researchers and for computer systems. The data sets must also be easy to link to one another and must remain accessible to, and reusable by, future researchers.
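To make the description above more concrete, here is a minimal sketch in Python of what a metadata record for a data set could look like, together with a simple completeness check. The class name, field names and example values are illustrative assumptions only; they do not represent a WDCC schema or an official FAIR specification.

```python
from dataclasses import dataclass, field, asdict
from typing import List
import json


# Hypothetical minimal metadata record. The field names are illustrative
# assumptions, not a WDCC or FAIR-mandated schema.
@dataclass
class DatasetMetadata:
    identifier: str    # persistent identifier, e.g. a DOI (Findable)
    title: str         # human-readable name (Findable)
    access_url: str    # where the data can be retrieved (Accessible)
    file_format: str   # open, standard format (Interoperable)
    license: str       # conditions for reuse (Reusable)
    equipment: List[str] = field(default_factory=list)  # research circumstances


def missing_fields(meta: DatasetMetadata) -> List[str]:
    """Return the names of all fields that are empty, in declaration order."""
    return [name for name, value in asdict(meta).items() if not value]


# Example record with made-up values.
record = DatasetMetadata(
    identifier="doi:10.0000/example",
    title="Example field measurements",
    access_url="https://example.org/datasets/1",
    file_format="text/csv",
    license="CC-BY-4.0",
    equipment=["hypothetical soil moisture sensor"],
)

# Serialising to JSON makes the record readable for computer systems too.
print(json.dumps(asdict(record), indent=2))
print("Missing fields:", missing_fields(record))
```

A check like `missing_fields` could run before a data set is archived, so that incomplete descriptions are caught early. Real repositories typically rely on established metadata standards rather than an ad hoc structure like this one.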
FACT: Fairness-Accuracy-Confidentiality-Transparency
Conclusions drawn from big data must be accurate, fair and transparent, so that they can be interpreted uniformly. Confidential data must remain confidential.
A single WUR approach
In addition to meeting international standards, the WDCC strives for a uniform approach to data management within WUR. To that end, a centralised data storage system is being introduced, based on the software platforms iRODS and Yoda. The advantage is that data is stored safely, both now and in the future, unlike data kept on flash drives and external hard disks. It also allows departments to use each other’s data, which fosters collaboration within WUR as well as with other (national and international) partners.
Learning from best practices within WUR
The WDCC aims to develop communities within WUR that share knowledge, collaborate to solve problems and generate innovative ideas. The Special Interest Groups are crucial in this process. Two are currently active: one on 5G and mobile data, and one on Artificial Intelligence and Machine Learning. Numerous groups within WUR analyse images, from video footage of wildlife to satellite images of farmland. If just one department knows how to teach a computer to analyse images autonomously and shares that knowledge, they all benefit.
Read more about it in the interview with Erik van den Bergh.