Genomics 4.0 : syntenic gene and genome duplication drives diversification of plant secondary metabolism and innate immunity in flowering plants : advanced pattern analytics in duplicate genomes

Hofberger, J.A.


Johannes A. Hofberger1, 2, 3

1 Biosystematics Group, Wageningen University & Research Center, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands (August 2012 – December 2013)

2 Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands (December 2010 – July 2012)

3 Chinese Academy of Sciences/Max Planck Partner Institute for Computational Biology, 320 Yueyang Road, Shanghai 200031, PR China (January 2014 – December 2014)


Large-scale comparative analysis of big data from next-generation sequencing provides a powerful means to exploit the potential of nature in the context of plant breeding and biotechnology. In this thesis, we combine various computational methods for genome-wide identification of gene families involved in (a) plant innate immunity and (b) biosynthesis of defense-related plant secondary metabolites across 21 species, assess the dynamics that affected the evolution of the underlying traits during 250 million years of flowering plant radiation, and provide data on more than 4500 loci that can underpin crop improvement for future food and quality of life.


As sessile organisms, plants are permanently exposed to a plethora of potentially harmful microbes and other pests. The surprising resilience to infection observed in successful lineages is due to a complex defense network that fights off invading pathogens. Within this network, a sophisticated plant innate immune system is accompanied by a multitude of specialized biosynthetic pathways that generate more than 200,000 secondary metabolites of ecological, agricultural, energy and medicinal importance. The rapid diversification of the associated genes was accompanied by a series of duplication events in virtually all plant species, including local duplication of short sequences as well as multiplication of all chromosomes due to meiotic errors (plant polyploidy). In a comparative genomics approach, we combined several bioinformatics techniques for large-scale identification of multi-domain and multi-gene families that are involved in plant innate immunity or defense-related secondary metabolite pathways across 21 representative flowering plant genomes. We introduced a framework to trace duplicate gene copies back to distinct ancient duplication events, thereby unravelling a differential impact of gene and genome duplication on the molecular evolution of target genes. By comparing the genomic context among homologs within and between species from a phylogenomics perspective, we discovered orthologs conserved within genomic regions that remained structurally immobile during flowering plant radiation. In summary, we described a complex interplay of gene and genome duplication that increased the genetic versatility of disease resistance and secondary metabolite pathways, thereby expanding the playground for functional diversification and thus for plant trait innovation and success. Our findings give fascinating insights into evolution across lineages and can underpin crop improvement for food, fiber and biofuel production.