SMARTBREED: Smart animal breeding with advanced machine learning

The amount of data accumulating over the lifespan of animals is increasing very rapidly, but the
technology to analyse it to its full potential is lagging behind. We propose to investigate the
applicability of advanced machine learning methods to gain insight into the variation in phenotypic
patterns over time, and into the value of adding data on the life history of an animal to the genomic
information to improve the prediction of the animal’s future phenotype.

Classical data analysis in the field of animal breeding is model driven. Milk recording systems in cattle and performance recording systems in pigs, for example, are developed with a genetic model in mind to allow for subsequent estimation of breeding values. However, modern management systems make large amounts of data available that were not specifically designed for genetic analysis, but may nevertheless contain useful information. Hence, data-driven methods are required that can extract the relevant information from the data without an explicit a priori model. In this research, therefore, we will develop such data-driven methods. The machine learning methods in this project will be developed around cases related to dairy cattle and finisher pigs, but they will also be generally applicable to similar types of large data sets in other animal species. New phenotypes with predictive value for future performance will be developed, for example phenotypes related to the health status of an animal.

The key objectives of this project are to:

  1. Automatically learn a model for the prediction of a given dependent variable, using a given set of independent variables (i.e. genomic information, phenotypes and environmental data).
  2. Determine the value of genomic information in phenotypic prediction throughout the animal’s life.
  3. Determine the probability of an animal ending up in a predefined class.
  4. Recognise and interpret deviations from the typical pattern of phenotypes in relation to the animal’s health status.
  5. Evaluate the relation between deviations from predicted phenotypic patterns and longevity.

Main supervisors:

  • Dr Bart Ducro, Animal Breeding and Genomics Center, Wageningen UR
  • Prof. Nicolai Petkov, Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen
  • Dr George Azzopardi, Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen
  • Prof. Dr Roel Veerkamp, Animal Breeding and Genomics Center, Wageningen UR

Positions available:

  • PhD candidate (data-driven methods, statistics), primary location: Groningen
  • Postdoc (animal breeding, biology, statistics), for a period of 3 years, primary location: Wageningen, opportunity to start in 2016

Further information: please contact one of the main supervisors.