Improved computation of base generation allele frequencies

Published on
January 31, 2019

The GLS method with sparse matrices, is very fast and highly accurate, for estimating base generation allele frequencies compared to the current method. This is what researchers from Animal Breeding & Genomics conclude on the basis of their research into genomic prediction. Therefore they recommend this method be adopted in practice.

A need for allele frequencies

Genomic prediction is the process that computes breeding values using information of alleles across the genome. In genomic prediction, computed allele frequencies are used for several processes. Many aspects of genomic prediction make an assumption that the allele frequencies are equal to the frequencies in the pedigree founders or “base generation”. In practice, the current method calculates allele frequencies from the genotypes available for the current population, mostly because this is convenient and computationally trivial. We compared the current method, with two methods to compute the base generation allele frequencies, in terms of the obtained frequencies and the computing time.

Estimating allele frequencies

We simulated a typical dairy population for which we knew the base generation allele frequencies. The base generation allele frequencies were estimated using the current method, using Best Linear Unbiased Prediction (BLUP), or with Generalized Least Squares (GLS) either with dense or sparse matrices. We then compared the accuracy of the estimated allele frequencies with the known frequencies, and the computational requirements.

Results are promising for application

We showed that the current method is very fast but is the least accurate. The BLUP method was relatively inefficient but highly accurate. The GLS method with dense matrices was extremely inefficient but was highly accurate. The GLS method with sparse matrices was as efficient as the current method and highly accurate. We recommend base generation allele frequencies to be used in genomic prediction are estimated using the GLS method with sparse matrices.