Student information

Tewdros did his thesis on QTN genetic value prediction

Tewdros Tsehaye Ghebremariam from Eritrea finished his EMABG thesis in summer 2011 at the Norwegian University of Life Sciences (UMB) in Aas, Norway.

Thesis title

Accuracy of prediction of QTN genetic value by surrounding SNPs from pooled genotype data


A successful implementation of genomic selection in animal breeding schemes might be hampered by high genotyping costs and the unavailability of individual observations. Genotyping of pooled DNA has been studied as a cost-effective alternative. The aim of this study was to investigate the accuracy of predicting QTN genetic value from pooled genotype data. It also assessed the effects of heritability, minor allele frequency and the number of surrounding SNPs.

The accuracy of predicting QTN genetic value was investigated by computer simulation. A total of 1040 SNPs were sampled from one chromosome, and SNPs with minor allele frequency (MAF) > 0.02 were retained. The trait was assumed to be genetically affected by a single QTN, and 12 designated QTN formed the 12 replicates of the simulation. DNA pooling was simulated by taking the frequency of the ‘1’ allele in each pool. Six pool sizes (200, 100, 50, 25, 10 and 5 individuals) and a non-pooled scenario were set up, and accuracies were compared between them. The effects of the number of surrounding SNPs (10 to 100), heritability (~1 and 0.5), the use of adjacent versus associated SNPs, and the MAF threshold of the SNP data set (> 0.02 and > 0.08) were studied. To predict QTN genetic value, the data were split into training and test sets and five-fold cross-validation was performed. A best linear unbiased prediction (BLUP) model was used to predict the effects of individual SNPs.
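The pooling and cross-validation steps can be illustrated with a minimal Python sketch. It is not the thesis code: the toy genotype generator, the function names, the grouping of consecutive individuals into pools, and the shrinkage parameter lam are all assumptions made for illustration, and SNP-BLUP is written here in its ridge-regression form.

```python
import numpy as np

def toy_genotypes(n, m, rng, switch=0.1):
    """Hypothetical LD generator: each locus copies its neighbour's allele
    on a haplotype with probability 1 - switch. Returns 0/1/2 genotypes."""
    haps = np.empty((2 * n, m), dtype=int)
    haps[:, 0] = rng.integers(0, 2, size=2 * n)
    for j in range(1, m):
        flip = rng.random(2 * n) < switch
        haps[:, j] = np.where(flip, 1 - haps[:, j - 1], haps[:, j - 1])
    return (haps[0::2] + haps[1::2]).astype(float)

def pool_allele_frequency(genotypes, pool_size):
    """Frequency of the '1' allele per pool; pools are consecutive groups
    of individuals (an assumption), leftovers are dropped."""
    n_pools = genotypes.shape[0] // pool_size
    trimmed = genotypes[: n_pools * pool_size]
    return trimmed.reshape(n_pools, pool_size, -1).mean(axis=1) / 2.0

def snp_blup(x_train, y_train, x_test, lam):
    """Ridge-regression form of SNP-BLUP on centred data:
    solve (X'X + lam * I) b = X'y, then predict the test records."""
    x_mean, y_mean = x_train.mean(axis=0), y_train.mean()
    xc = x_train - x_mean
    lhs = xc.T @ xc + lam * np.eye(xc.shape[1])
    effects = np.linalg.solve(lhs, xc.T @ (y_train - y_mean))
    return y_mean + (x_test - x_mean) @ effects

def five_fold_accuracy(x, y, lam=1.0, n_folds=5, seed=0):
    """Accuracy = correlation between cross-validated predictions and y."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    predicted = np.empty(len(y))
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        predicted[test_idx] = snp_blup(x[train_idx], y[train_idx], x[test_idx], lam)
    return np.corrcoef(predicted, y)[0, 1]

# Toy run: 400 individuals, 21 loci, pools of 5; the middle locus plays the QTN.
rng = np.random.default_rng(1)
pooled = pool_allele_frequency(toy_genotypes(400, 21, rng), pool_size=5)
qtn_value = pooled[:, 10]                      # target: pooled QTN allele frequency
surrounding = np.delete(pooled, 10, axis=1)    # predictors: the 20 surrounding SNPs
accuracy = five_fold_accuracy(surrounding, qtn_value, lam=1.0)
```

Repeating the toy run with different values of pool_size would mimic, in a very simplified way, the comparison between pool sizes described above. The numbers it produces come from the toy generator and are not comparable to the accuracies reported in the thesis.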

Prediction accuracy decreased with increasing pool size. The highest accuracy was 0.76, obtained without pooling and with 100 surrounding SNPs, followed by 0.68 with a pool size of 5 and 60 surrounding SNPs. The effect of pooling was broadly similar across heritability and MAF settings. Lower accuracies were obtained when heritability decreased to 0.5. With larger numbers of surrounding SNPs, the data set of SNPs with MAF > 0.02 generally gave higher accuracies than the one with MAF > 0.08. Depending on the number of surrounding SNPs, higher accuracies were obtained by using adjacent SNPs.

The results indicate that where individual observations are unavailable or costly, prediction of QTN genetic value, and thus genomic selection, is possible with fairly high accuracy from pooled genotype data.

Key words

pooling, genomic selection, genotyping, quantitative trait nucleotide, single nucleotide polymorphism, minor allele frequency, heritability