Statistical Approaches

Current activities include the combined analysis of molecular marker, gene expression and metabolomics data together with phenotypes (e.g. disease resistance or product quality) scored on segregating populations derived from crosses. Methods in use include random forest for classification and multiple regression for cases where the number of predictor variables (e.g. molecular markers, genes, metabolites) is much larger than the number of samples (plants, tissues) in which they have been measured. Other areas of interest are the modelling of genotype × environment interaction, and mapping and QTL analysis in single segregating populations, multiple populations or germplasm collections. A further focus area is the development of a genetic analysis pipeline for polyploid crops.
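
To illustrate the setting in which predictors far outnumber samples, the following minimal sketch fits a ridge-penalised multiple regression to simulated marker data. Ridge is only one common way to make regression estimable in this setting and is not necessarily the method used here. The data, the sample and marker counts, the penalty value `lam` and the function name `ridge_fit` are all assumptions made for the example.

```python
import numpy as np


def ridge_fit(X, y, lam=1.0):
    """Ridge coefficients via the dual (n x n) system, cheap when p >> n."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean          # centre predictors so the intercept is unpenalised
    yc = y - y_mean
    n = X.shape[0]
    # Solve (Xc Xc^T + lam I) alpha = yc, then map back: beta = Xc^T alpha.
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(n), yc)
    beta = Xc.T @ alpha
    intercept = y_mean - x_mean @ beta
    return beta, intercept


# Simulated data (assumption): 40 plants, 500 markers coded 0/1/2,
# of which only the first five affect the phenotype.
rng = np.random.default_rng(0)
n_samples, n_markers = 40, 500
X = rng.integers(0, 3, size=(n_samples, n_markers)).astype(float)
true_beta = np.zeros(n_markers)
true_beta[:5] = [2.0, -1.5, 1.0, 1.0, -2.0]
y = X @ true_beta + rng.normal(scale=1.0, size=n_samples)

beta, intercept = ridge_fit(X, y, lam=10.0)
fitted = X @ beta + intercept
rss = np.sum((y - fitted) ** 2)   # residual sum of squares on training data
tss = np.sum((y - y.mean()) ** 2) # total sum of squares
print(f"Training R^2: {1 - rss / tss:.3f}")
```

Ordinary least squares has no unique solution when there are more markers than plants. The penalty makes the problem well posed, and the dual form only requires solving a system of size 40 x 40 rather than 500 x 500. The training fit is optimistic, so in practice the penalty would be chosen by cross-validation and performance judged on held-out samples.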