matchms - processing and similarity evaluation of mass spectrometry data

Huber, Florian; Verhoeven, Stefan; Meijer, Christiaan; Spreeuw, Hanno; Villanueva Castilla, Efraín Manuel; Geng, Cunliang; Hooft, J.J.J. van der; Rogers, Simon; Belloum, Adam; Diblen, Faruk; Spaaks, Jurriaan H.


Mass spectrometry data is at the heart of numerous applications in the biomedical and life sciences. With the growing use of high-throughput techniques, researchers need to analyze larger and more complex datasets. In particular through joint efforts in the research community, fragmentation mass spectrometry datasets are growing in size and number. Platforms such as MassBank (Horai et al., 2010), GNPS (Wang et al., 2016) or MetaboLights (Haug et al., 2020) serve as open-access hubs for sharing raw, processed, or annotated fragmentation mass spectrometry data. Without suitable tools, however, exploitation of such datasets remains overly challenging. In particular, large collected datasets contain data acquired using different instruments and measurement conditions, and can further contain a significant fraction of inconsistent, wrongly labeled, or incorrect metadata (annotations).