PanVA: Variant Analysis within Pangenomes

Astrid van den Brandt; Eef Jonkheer; Dirk-Jan van Workum; Huub van de Wetering; Sandra Smit; Anna Vilanova

doi:10.36227/techrxiv.21572433.v1

Abstract

Studying genetic variation underlying phenotypes is an important topic in genomics. In plant genomic research, for example, scientists analyze the variation between cultivars and wild types to develop crops with improved resistance to diseases. This analysis is commonly based on comparison to a single reference genome. Because the number of genomes is growing rapidly and to avoid bias towards a single reference genome, the field is shifting towards the use of pangenomes, i.e., abstract representations of multiple genomes in a species or population. While pangenomes allow for a more complete picture of the genetic variation, their large size and complex data structure hinder analysis. To deal with this, genome scientists need visual analytics tools that support interactive and exploratory analysis of pangenomes to identify relevant information for variant analysis. A major challenge is to handle multiple references while providing adequate context from heterogeneous (meta)data, such as annotations, evolutionary relationships, and phenotypes. To address this challenge, we developed PanVA, a visual analytics design for variant analysis in pangenomes. PanVA supports a novel strategy for pangenomic variant analysis that was designed with the active participation of genomics researchers. PanVA uniquely allows researchers to get a complete picture of the variation within genes in a large set of genomes and to identify associations with phenotypes. The design combines tailored visual representations with interactions such as sorting, grouping, and aggregation, allowing the user to navigate and explore different perspectives. The realization of the PanVA design is possible through PanTools. Through user evaluation in the context of plant and pathogen research, we demonstrate that PanVA helps researchers explore regions of interest and generate hypotheses about genetic variants and their role in phenotypic variation.

Published in 2023 in IEEE Transactions on Visualization and Computer Graphics, pages 1-15. doi:10.1109/TVCG.2023.3282364

Peer review status: Published