Genetic and genomic data of CGN accessions

- dr.ir. TG (Erik) Wijnker
- Researcher genebank methodologies
Genetic and genomic data are rapidly becoming available for plant material, including material held in the CGN collections. These data play a growing role in research and plant breeding. For this reason, CGN has recently inventoried which accessions have publicly available genetic data and summarized this information in downloadable overviews for its users.
For decades, the CGN collections have been an important resource for genetic research. By comparing DNA from wild species across their entire distribution range with that of cultivated varieties worldwide, researchers can gain valuable insights into the breeding history of crops. Questions such as “Where did domestication take place?”, “How did this crop spread across the world?” and “Where is genetic diversity greatest?” can be addressed in this way. CGN collections are particularly suitable for such studies because they contain both cultivated forms and wild relatives from the full geographic distribution of the species.
Genetic data are also of direct value for plant breeding. When it is known which genetic markers are associated with traits such as disease resistance or flowering time, seedlings can be selected for these traits at an early stage. Complete genome sequences additionally provide insight into the genetic basis of traits, further accelerating the development of new varieties.
Compiling an inventory of available datasets was not straightforward. Genetic data are usually made available through scientific publications, but there is no standardized way in which these data are presented. Studies differ considerably in how accessions are identified, which means that CGN accession numbers can sometimes be difficult to trace or may be missing altogether. Thanks to the expertise within the crop teams and active contact with users, a reliable overview was nevertheless compiled.
For each crop, a downloadable Excel overview has been prepared, indicating which accessions were used in which studies and which types of data are available (for example SNP data or whole-genome sequencing data). Availability varies greatly among crops: for spinach, whole-genome sequences are available for only two accessions, whereas for lettuce 519 accessions have already been sequenced. At present, overviews are available for six crops: lettuce, spinach, melon, tomato, sweet pepper and eggplant. Files for rocket (arugula) and potato will be added soon. The overviews can be found on the CGN website for data files: select the desired crop and download the corresponding file.
In addition, users can now directly select all accessions for which genetic and/or sequencing data are available in the CGN search and ordering system. For example, searching the catalogue for melon accessions and ticking the box for “sequencing data” immediately returns the 24 accessions for which sequencing data are available. The “description” field in the selection panel also contains the link to the corresponding Excel overview.
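Users who prefer to work with a downloaded overview offline could make a similar selection with a short script. The sketch below is only an illustration: it assumes the overview has been exported to CSV, and the column names (`accession`, `study`, `data_type`) and the sample rows are hypothetical placeholders, not taken from an actual CGN file.

```python
import csv
import io

# Hypothetical CSV export of one crop overview. Column names and rows are
# placeholders for illustration only; real CGN files may be laid out differently.
SAMPLE_OVERVIEW = """accession,study,data_type
CGN00001,Study A,SNP
CGN00002,Study A,WGS
CGN00002,Study B,SNP
CGN00003,Study B,WGS
"""


def accessions_with(data_type, rows):
    """Return the sorted, de-duplicated accession numbers listed with the given data type."""
    return sorted({row["accession"] for row in rows if row["data_type"] == data_type})


# Parse the sample text; for a real file, pass an opened file object instead.
rows = list(csv.DictReader(io.StringIO(SAMPLE_OVERVIEW)))
print(accessions_with("WGS", rows))
```

Collecting the accession numbers in a set means an accession that appears in several studies is listed only once.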
Although the number of genomic datasets is still limited, it is expected to increase substantially in the coming years. Sequence data will then become an increasingly important source of information, both as a complement to passport data and as a means of verifying them. For now, the overviews primarily provide clarity for users and offer CGN a systematic way to make genetic information accessible. Suggestions for further improving usability are very welcome.