Publications

Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing

Aflitos, S.A.; Schijlen, E.G.W.M.; Jong, J.H.S.G.M. de; Ridder, D. de; Smit, S.; Finkers, H.J.; Bakker, F.T.; Geest, H.C. van de; Lintel Hekkert, B. te; Haarst, J.C. van; Smits, L.W.M.; Koops, A.J.; Sanchez-Perez, M.J.; Heusden, A.W. van; Visser, R.G.F.; Schranz, M.E.; Peters, S.A.

Summary

We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon, and Neolycopersicon groups, yielding extensive data on sequence diversity in the tomato clade. Three new reference genomes were reconstructed to support our comparative genome analyses. Comparative sequence alignment revealed group-, species-, and accession-specific polymorphisms, which explain characteristic fruit traits and growth habits in the different cultivars. Using gene models from the annotated Heinz 1706 reference genome, we observed differences in the dN/dS ratio in fruit and growth diversification genes compared to a random set of genes, pointing to positive selection and to differences in selection pressure between crop accessions and wild species. In wild species, the number of SNPs exceeds 10 million, i.e. 20-fold higher than in most of the crop accessions, indicating dramatic genetic erosion of crop and heirloom tomatoes. In addition, the highest levels of heterozygosity were found in allogamous self-incompatible (SI) wild species, while facultative and autogamous self-compatible (SC) species display a lower heterozygosity level. Using whole-genome SNP information for Maximum Likelihood (ML) analysis, we achieved complete tree resolution, whereas ML trees based on SNPs from 10 fruit and growth genes show incomplete resolution for the crop accessions, partly due to the effect of heterozygous SNPs. Finally, the results suggest that phylogenetic relationships are correlated with habitat, pointing to the occurrence of geographical races within these groups, which is of practical importance for Solanum genome evolution studies.
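
The heterozygosity comparison summarized above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes diploid genotype calls written in VCF-style notation ("0/1", "1|0", "./." for a missing call), and the function name `heterozygosity` and the toy genotype lists are hypothetical, made up purely for illustration.

```python
import math


def heterozygosity(genotypes):
    """Return the fraction of called diploid genotypes that are heterozygous.

    `genotypes` is an iterable of VCF-style genotype strings such as
    "0/0", "0/1" or "1|0". Missing calls ("./.") and non-diploid calls
    are skipped. Returns NaN when no genotype is called.
    """
    called = 0
    het = 0
    for gt in genotypes:
        # Treat phased ("|") and unphased ("/") calls the same way.
        alleles = gt.replace("|", "/").split("/")
        if len(alleles) != 2 or "." in alleles:
            continue  # missing or non-diploid call
        called += 1
        if alleles[0] != alleles[1]:
            het += 1
    return het / called if called else math.nan


# Toy data (hypothetical, not taken from the study).
toy_outcrossing_accession = ["0/1", "1/1", "0/1", "./.", "1|0"]
toy_selfing_accession = ["0/0", "1/1", "1/1", "0/1", "0/0"]

print(heterozygosity(toy_outcrossing_accession))
print(heterozygosity(toy_selfing_accession))
```

Under these assumptions, an accession with a larger share of heterozygous calls yields a higher value, which is the kind of per-accession statistic that allows outcrossing and selfing species to be compared.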