A community-based initiative standardizing links between genomic and metabolomics data
Publication date: March 2021
The Paired Omics Data Platform is a community-based initiative that standardizes links between genomic and metabolomics data in a computer-readable format to advance the field of natural products discovery. Its goals are to link molecules to their producers, find large-scale genome-metabolome associations, use genomic data to assist in the structural elucidation of molecules, and provide a centralized database for paired datasets.
Metabolism is the process by which cells in humans, plants, animals and other organisms produce metabolites such as amino acids, carbohydrates and hormones. These metabolites are invaluable. A new, global data platform set up by Wageningen University & Research will make it much easier for researchers worldwide to link genetic and other information about metabolites, the organisms that produce them and the process of metabolism. This will pave the way for faster identification of currently unknown metabolites that, for example, have antibiotic or antiviral properties or contribute to increased crop yields.
The platform relies on publicly available data deposited in standard databases and promotes the FAIR principles. Metabolomics project information stored in MassIVE or MetaboLights can be quickly linked to public genomes stored in NCBI or JGI. Coupling these data with minimal experimental details in a computer-readable format allows for large-scale correlation of metabolome changes. Linking these publicly available data will facilitate new discoveries and new algorithms for predicting chemical structures from genomic information.
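To make the idea of a computer-readable link concrete, here is a minimal Python sketch of what one paired record could look like. It is an illustration only: the PairedRecord class, its field names, and the helper functions are hypothetical and are not the platform's actual schema, and all identifiers are made-up placeholders. The only assumptions taken from the repositories themselves are that MassIVE project IDs begin with "MSV" and MetaboLights study IDs begin with "MTBLS".

```python
import json
import re
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PairedRecord:
    """One hypothetical link between a metabolomics project and a genome."""
    metabolomics_project: str  # MassIVE or MetaboLights project ID
    genome_accession: str      # public genome ID, e.g. from NCBI or JGI
    sample_label: str          # minimal experimental detail (hypothetical field)
    growth_medium: str         # minimal experimental detail (hypothetical field)


def metabolomics_repository(project_id):
    """Guess the source repository from the ID prefix (assumed ID patterns)."""
    if re.fullmatch(r"MSV\d+", project_id):
        return "MassIVE"
    if re.fullmatch(r"MTBLS\d+", project_id):
        return "MetaboLights"
    raise ValueError(f"Unrecognized metabolomics project ID: {project_id}")


def group_by_genome(records):
    """Map each genome accession to the metabolomics projects linked to it."""
    grouped = {}
    for record in records:
        grouped.setdefault(record.genome_accession, []).append(
            record.metabolomics_project
        )
    return grouped


# Placeholder identifiers, not real deposited datasets.
EXAMPLE_RECORDS = [
    PairedRecord("MSV000000001", "GCA_000000001.1", "strain-A-extract", "medium-1"),
    PairedRecord("MTBLS1", "GCA_000000001.1", "strain-A-extract", "medium-2"),
    PairedRecord("MSV000000002", "GCA_000000002.1", "strain-B-extract", "medium-1"),
]

if __name__ == "__main__":
    # Serialize one record to JSON, i.e. a computer-readable form of the link.
    print(json.dumps(asdict(EXAMPLE_RECORDS[0]), indent=2, sort_keys=True))
    print(group_by_genome(EXAMPLE_RECORDS))
```

Grouping records by genome accession is the simplest form of the linkage described above: once every metabolomics project points at a genome, datasets for the same organism grown under different conditions can be collected and compared.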
The platform was created in cooperation with researchers from the Netherlands eScience Center and the University of California San Diego (USA). Together, they set up a consortium of more than one hundred scientists from more than ten countries. These scientists provided feedback on the content of the platform and populated it with more than 4,800 linked genomic and metabolomic datasets from various organisms and microbial communities. The platform is already being used to develop new algorithms that automatically link gene clusters and metabolite spectra.