Chemically informed analyses of metabolomics mass spectrometry data with Qemistree

Tripathi, Anupriya; Vázquez-Baeza, Yoshiki; Gauglitz, Julia M.; Wang, Mingxun; Dührkop, Kai; Nothias-Esposito, Mélissa; Acharya, Deepa D.; Ernst, Madeleine; Hooft, Justin J.J. van der; Zhu, Qiyun; McDonald, Daniel; Brejnrod, Asker D.; Gonzalez, Antonio; Handelsman, Jo; Fleischauer, Markus; Ludwig, Marcus; Böcker, Sebastian; Nothias, Louis Félix; Knight, Rob; Dorrestein, Pieter C.


Untargeted mass spectrometry is employed to detect small molecules in complex biospecimens, generating data that are difficult to interpret. We developed Qemistree, a data exploration strategy based on the hierarchical organization of molecular fingerprints predicted from fragmentation spectra. Qemistree allows mass spectrometry data to be represented in the context of sample metadata and chemical ontologies. By expressing molecular relationships as a tree, we can apply to metabolomics data the ecological tools that were designed to analyze and visualize the relatedness of DNA sequences. Here we demonstrate the use of tree-guided data exploration tools to compare metabolomics samples across different experimental conditions, such as chromatographic shifts. Additionally, we leverage a tree representation to visualize chemical diversity in a heterogeneous collection of samples. The Qemistree software pipeline is freely available to the microbiome and metabolomics communities in the form of a QIIME2 plugin and a Global Natural Products Social Molecular Networking (GNPS) workflow.
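
To make the idea of "hierarchical organization of molecular fingerprints" concrete, the following is a minimal sketch, not Qemistree's actual implementation. It assumes binary fingerprints, a Tanimoto (Jaccard) distance, and average-linkage clustering; the abstract does not specify the metric or linkage, and the feature names and fingerprint values are hypothetical. The result is a tree in Newick notation, the kind of structure that phylogeny-aware tools accept.

```python
from itertools import combinations


def tanimoto_distance(a, b):
    """Return 1 - Tanimoto similarity for two equal-length binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def average_distance(members_a, members_b, fingerprints):
    """Mean pairwise distance between two clusters (average linkage)."""
    total = sum(
        tanimoto_distance(fingerprints[p], fingerprints[q])
        for p in members_a
        for q in members_b
    )
    return total / (len(members_a) * len(members_b))


def build_tree(fingerprints):
    """Agglomerative clustering; returns the tree as nested tuples of names."""
    # Each cluster is (subtree, list of member feature names).
    clusters = [(name, [name]) for name in sorted(fingerprints)]
    while len(clusters) > 1:
        # Find the closest pair of clusters (first pair wins on ties).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_distance(
                clusters[ij[0]][1], clusters[ij[1]][1], fingerprints
            ),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


def to_newick(tree):
    """Serialize a nested-tuple tree as Newick topology (no branch lengths)."""
    if isinstance(tree, tuple):
        return "(" + ",".join(to_newick(child) for child in tree) + ")"
    return tree


# Hypothetical fingerprints: each position marks presence of a substructure.
fingerprints = {
    "feature_A": [1, 1, 1, 0, 0, 0],
    "feature_B": [1, 1, 0, 0, 0, 0],
    "feature_C": [0, 0, 0, 1, 1, 1],
    "feature_D": [0, 0, 0, 1, 1, 0],
}

tree = build_tree(fingerprints)
newick = to_newick(tree)
print(newick + ";")
```

In this toy input, features A and B share substructures, as do C and D, so the sketch groups them into two sister clades. A real analysis would work with predicted fingerprints for many features and would likely keep branch lengths, which this sketch omits.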