Visual Exploration of Genetic Sequence Variants in Pangenomes

Brandt, A. van den; Jonkheer, E.M.; Workum, F.J.M. van; Smit, S.; Vilanova, A.


To study the genetic sequence variation underlying traits of interest, the field of comparative genomics is moving away from analyses with single reference genomes towards pangenomes: abstract representations of multiple genomes in a species or population. Pangenomes are beneficial because they represent a diverse set of genetic material and therefore avoid bias towards a single reference. While pangenomes allow for a complete map of the genetic variation, their large size and complex data structure hinder contextualization and interpretation of analysis results. Current visualization strategies fall short because they are created for single references or do not illustrate links to metadata. We present a work-in-progress version of a novel visual analytics strategy for pangenomic variant analysis. Our strategy is designed through intensive involvement of genome scientists. The current design uniquely exploits interactive sorting, aggregation, and linkage relations from different perspectives of the data to help genome scientists explore and evaluate variant-trait associations in the context of multiple references and metadata.