Project

COMREC

COMREC is built on the hypothesis that the control of meiotic crossover (CO) frequency and distribution depends on the interplay between the biochemical processes of the recombination machinery, the programmed reorganisation of the chromosomes during meiotic progression and the organisation of the chromatin “landscape”. The PhD project “Bioinformatic analyses of meiotic recombination in tomato hybrids and related species” takes a multidisciplinary approach to analyse (i) the genetic variation in recombination, (ii) CO sequence features and types, (iii) the frequency of recombination, and (iv) the synteny and structural homology involved in recombination cold and hot spots. The various types of sequence and genome-topology data will be integrated to predict recombination “cold spots” and “hot spots”.

This Marie Curie ITN project aims to identify the factors driving homoeologous recombination; this knowledge is indispensable for the development of genome-based breeding tools. Tomato (Solanum lycopersicum) can be crossed with the wild species in the tomato clade. The homoeologous chromosomes in these interspecific hybrids synapse to a great extent, except at rearrangements between some chromosomes and at differences in (pericentromere) repeats. The project will focus on the effect that homoeologous sequences have on meiotic recombination. To this end we aim to analyse the recombination sites in the progeny of the interspecific hybrids and compare these with the sites in tomato resulting from homologous recombination. The proposed research focuses on the bioinformatic analysis of DNA sequences at and near the recombination sites in tomato and interspecific hybrids, using DNA sequence information obtained from S. lycopersicum x S. pennellii and S. lycopersicum x S. pimpinellifolium introgression lines. The project addresses the interplay between meiotic, molecular and genetic aspects of homoeologous recombination. The multidisciplinary approach specifically aims to analyse (i) the genetic variation in recombination, (ii) CO sequence features and types, (iii) the frequency of recombination, and (iv) the synteny and structural homology involved in cold and hot spots of recombination.

We will perform chromatin immunoprecipitation (ChIP) using antibodies against class I and II COs (P1) (43) on chromatin isolated from anthers of tomato, a wild tomato species (S. pennellii) and their F1, followed by high-throughput sequencing (ChIP-Seq) on state-of-the-art 454 FLX, Illumina HiSeq and PacBio sequencing platforms, which are available in-house at the UoW. We will use an optimized plant-specific ChIP protocol. CO junctions will be identified by mapping the generated CO sequences onto the tomato reference genome (18), followed by SNP detection.
For this we will rely on sequence information from wild-type species and recombinant inbred lines (RILs) that have been sequenced by the 150 tomato genomes consortium (http://tomatogenome.net), of which UoW is project leader. Quantitative enrichment of sequence tags will be used to determine the genomic positions of the COs in a statistically robust manner. ITAG gene annotation from SGN (http://solgenomics.net) will be used to assign COs to exons, introns, intergenic regions and other DNA elements, enabling the analysis of potential differences between these DNA elements. In-house statistical methods will be applied to measure differences in CO distribution and frequency between tomato lines. Comparative multiple sequence alignments (MSA) of CO junction breaks will be used to determine the degree of sequence similarity and to identify sequence features such as repeat content. To provide insight into the degree of structural homology involved in cold and hot spots of recombination, we will compare cross-species BAC/PCR FISH data with genomic sequence data, genetic marker data, FISH analysis, comparative sequence alignment and genome annotation data to reveal the chromosome organisation at the microsynteny level. A machine-learning approach (support vector machine, SVM, or conditional random field) will be applied to integrate sequence and topology data and to predict cold and hot spots of recombination.
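
The quantitative enrichment of sequence tags mentioned above could, in its simplest form, look like the following Python sketch. It is an illustration under stated assumptions and not the project's actual pipeline: the fixed window size, the pseudocount, the Poisson background model, the significance threshold and all function names (`window_counts`, `poisson_sf`, `enriched_windows`) are hypothetical choices made for this example. A real analysis would more likely use an established peak caller and correct for multiple testing.

```python
import math
from collections import Counter


def window_counts(positions, window):
    """Count mapped tag start positions per fixed-size window."""
    return Counter(pos // window for pos in positions)


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), via 1 - sum of pmf(0..k-1)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # pmf(0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # pmf(i) from pmf(i-1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def enriched_windows(chip, control, window=1000, pseudocount=1.0, alpha=1e-3):
    """List windows whose ChIP tag count exceeds the control expectation.

    chip, control: tag start positions on one chromosome (assumed input).
    Returns tuples (start, end, chip_count, p_value).
    """
    chip_counts = window_counts(chip, window)
    ctrl_counts = window_counts(control, window)
    # Scale the control to the ChIP library size (assumed normalisation).
    scale = len(chip) / max(len(control), 1)
    hits = []
    for w, k in sorted(chip_counts.items()):
        lam = (ctrl_counts.get(w, 0) + pseudocount) * scale
        p = poisson_sf(k, lam)
        if p < alpha:
            hits.append((w * window, (w + 1) * window, k, p))
    return hits


if __name__ == "__main__":
    # Toy data: 30 ChIP tags pile up in window 5000-6000, background elsewhere.
    chip = [5000 + i for i in range(30)]
    control = []
    for w in range(5):
        chip += [w * 1000 + 10, w * 1000 + 20]
        control += [w * 1000 + j for j in range(8)]
    for start, end, count, p in enriched_windows(chip, control):
        print(start, end, count, p)
```

Candidate CO regions found this way could then be intersected with the ITAG annotation to assign them to exons, introns or intergenic regions.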