Publications

MIBiG 2.0: a repository for biosynthetic gene clusters of known function

Kautsar, S.A.; Blin, Kai; Shaw, Simon; Navarro Munoz, J.C.; Terlouw, Barbara; van der Hooft, J.J.J.; Van Santen, Jeffrey A.; Tracanna, V.; Suarez Duran, Hernando; Pascal Andreu, V.; Selem Mojica, Nelly; Alanjary, Mohammad; Robinson, Serina; Lund, George; Epstein, Samuel C.; Sisto, Ashley C.; Charkoudian, Louise K.; Collemare, Jérôme; Linington, Roger G.; Weber, Tilmann; Medema, M.H.

Summary

Fueled by the explosion of (meta)genomic data, genome mining of specialized metabolites has become a major technology for drug discovery and for studying microbiome ecology. In these efforts, computational tools like antiSMASH have played a central role through the analysis of Biosynthetic Gene Clusters (BGCs). Thousands of candidate BGCs from microbial genomes have been identified and stored in public databases. Interpreting the function and novelty of these predicted BGCs requires comparison with a well-documented set of BGCs of known function. The MIBiG (Minimum Information about a Biosynthetic Gene Cluster) Data Standard and Repository was established in 2015 to enable curation and storage of known BGCs. Here, we present MIBiG 2.0, which encompasses major updates to the schema, the data, and the online repository itself. Over the past five years, 851 new BGCs have been added. Additionally, we performed extensive manual curation of all entries to improve the annotation quality of the repository. We also redesigned the data schema to ensure that future annotations comply with the standard. Finally, we improved the user experience by adding new features such as query searches and a statistics page, and enabled direct link-outs to chemical structure databases. The repository is accessible online at https://mibig.secondarymetabolites.org/.