This project aims to develop improved methods and software for genetic and genomic analysis in polyploid crops, based on next-generation sequencing data and tailored to practical breeding situations. Highly diverse, polyploid breeding populations may contain multiple alleles at any trait locus of interest, each of which may have a positive, neutral or negative effect on the trait. Multi-allelic markers (haplotypes, i.e. combinations of individual SNPs) are needed to analyze and catalogue the effects of the different alleles at trait loci that are important for breeding. The use of these haplotypes has multiple benefits: they are more powerful than individual SNP markers, and they show the breadth of the genetic diversity much better.
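To illustrate why a haplotype acts as a multi-allelic marker, the following minimal sketch counts haplotype dosages in a single tetraploid individual. The individual, its three SNPs and the function name `haplotype_dosages` are hypothetical and serve only as an illustration; the sketch assumes the phase is already known, which is exactly what the haplotyping methodology has to establish in practice.

```python
from collections import Counter


def haplotype_dosages(homologs):
    """Count the copies of each haplotype carried by one individual.

    homologs: one tuple of SNP alleles (0 = reference, 1 = alternative)
    per chromosome copy; the phase is assumed to be known.
    """
    return Counter("".join(str(allele) for allele in hap) for hap in homologs)


# Hypothetical tetraploid individual with three SNPs in one locus.
individual = [(0, 1, 0), (0, 1, 0), (1, 1, 0), (0, 0, 1)]

# Multi-allelic view: three distinct haplotypes at this locus.
dosages = haplotype_dosages(individual)

# Bi-allelic view: dosage of the alternative allele at each SNP separately.
snp_dosages = [sum(hap[i] for hap in individual) for i in range(3)]

print(dosages)      # haplotype "010" is present in two copies
print(snp_dosages)  # [1, 3, 1]
```

The per-SNP dosages alone do not reveal which alleles occur together on the same chromosome copy, whereas the haplotype dosages distinguish three different alleles at the locus.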
The project consists of three parts. In the first part, we will develop methodology for determining haplotypes in a breeding pool that contains both distantly related genotypes and genotypes with direct (offspring) relationships. Combining state-of-the-art methods with new methods yet to be developed, we will follow an integrated bioinformatics and quantitative genetics approach to determine discrete and probabilistic haplotypes from sequencing data. The number of probable solutions will be limited by including pedigree relationships and observed allele frequencies, and/or by estimating the most probable haplotype using imputation strategies, from either whole genome sequencing (WGS) or reduced complexity sequencing data. The estimated haplotypes will be added to an atlas of haplotypes, which will be used in future iterations of haplotyping and to improve haplotype-based predictions of traits.
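As a rough illustration of how observed frequencies can limit the number of probable solutions, the sketch below enumerates all haplotype configurations that are consistent with unphased SNP dosages and ranks them by a multinomial prior based on haplotype frequencies. This is a simplified toy model under stated assumptions, not the method to be developed in the project: it ignores pedigree information and sequencing errors, assumes random combination of haplotypes, and uses hypothetical frequencies and function names.

```python
from collections import Counter
from itertools import combinations_with_replacement, product
from math import factorial, prod


def consistent_configurations(snp_dosages, ploidy):
    """List all haplotype multisets whose per-SNP dosages match the observation."""
    all_haps = product((0, 1), repeat=len(snp_dosages))
    return [
        config
        for config in combinations_with_replacement(all_haps, ploidy)
        if all(sum(h[i] for h in config) == d for i, d in enumerate(snp_dosages))
    ]


def multinomial_prior(config, hap_freqs):
    """Probability of drawing this multiset, assuming random combination."""
    counts = Counter(config)
    coef = factorial(len(config))
    for c in counts.values():
        coef //= factorial(c)
    return coef * prod(hap_freqs.get(h, 0.0) ** c for h, c in counts.items())


def rank_configurations(snp_dosages, ploidy, hap_freqs):
    """Return (normalized probability, configuration) pairs, most probable first."""
    scored = [
        (multinomial_prior(c, hap_freqs), c)
        for c in consistent_configurations(snp_dosages, ploidy)
    ]
    total = sum(p for p, _ in scored)
    if total == 0:
        return []
    ranked = [(p / total, c) for p, c in scored if p > 0]
    return sorted(ranked, key=lambda item: item[0], reverse=True)


# Hypothetical haplotype frequencies for a two-SNP locus.
freqs = {(0, 0): 0.45, (1, 1): 0.45, (0, 1): 0.05, (1, 0): 0.05}

# Tetraploid with a dosage of 2 at both SNPs: three configurations fit.
ranked = rank_configurations([2, 2], ploidy=4, hap_freqs=freqs)
for probability, config in ranked:
    print(round(probability, 4), config)
```

In this toy example the dosages alone leave three possible solutions, but the frequencies make the configuration with two copies each of the two common haplotypes by far the most probable one.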
Prediction of traits
In the second part, we will develop methods that make optimal use of haplotypes for trait prediction and selection in breeding programs, both for traits determined by a few large-effect loci and for traits determined by a larger number of smaller-effect loci. Factors influencing the accuracy with which the phenotype is predicted from the haplotypes will be studied using simulations in which these factors can be varied at will, while an experimental dataset will be used to validate the method for practical use and to guide the improvement of the haplotyping.
In the third part, we will develop a visualization tool for the detected haplotypes that allows accessions from the breeding pool to be assessed by combining their predicted haplotypes with observed phenotypes or predicted breeding values for traits. This tool can be used for selection within a specific set of germplasm in a running breeding program.