CoVizu is an open source project endeavouring to visualize the global diversity of SARS-CoV-2 genomes, which are provided by the GISAID Initiative.
This web page provides two interactive visualizations of these data. On the left, it displays a phylogenetic tree summarizing the evolutionary relationships among different SARS-CoV-2 lineages (groupings of viruses with similar genomes, useful for linking outbreaks in different places; Rambaut et al. 2020). You can navigate between different lineages by clicking on their respective boxes.
Selecting a lineage displays a "beadplot" visualization in the centre of the page. Each horizontal line represents one or more samples of SARS-CoV-2 that share the same genome sequence. Beads along the line represent the dates on which this variant was sampled.
For more help, click on the 🔰 icons or take the quick tour.
A phylogenetic tree is a model of how different populations are related by common ancestors. The tree displayed here (generated by TreeTime v0.8.0) summarizes the common ancestry of different SARS-CoV-2 lineages, which are pre-defined groupings of viruses based on genome similarity.
A time scale is drawn above the tree marked with dates. The earliest ancestor (root) is drawn on the left, and the most recent observed descendants are on the right. We estimate the dates of common ancestors by comparing the sampled genomes and assuming a constant rate of evolution.
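To illustrate what assuming a constant rate of evolution buys us, here is a minimal Python sketch that regresses the number of mutations separating each sample from the root against its collection date. The slope estimates the rate, and the date at which the fitted line crosses zero mutations estimates the date of the root. This is a simplified stand-in and not the procedure TreeTime uses; the function name and input format are assumptions made for this example.

```python
from datetime import date


def estimate_root_date(samples):
    """Estimate the evolutionary rate and root date under a strict clock.

    samples: list of (collection_date, mutations_from_root) tuples.
    Returns (mutations_per_day, estimated_root_date).
    Hypothetical helper for illustration only.
    """
    xs = [d.toordinal() for d, _ in samples]
    ys = [float(m) for _, m in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Ordinary least-squares fit of mutations against time.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or sxy <= 0:
        raise ValueError("need samples from different dates with a positive trend")
    rate = sxy / sxx
    intercept = mean_y - rate * mean_x

    # The root is where the fitted line predicts zero mutations.
    root_ordinal = -intercept / rate
    return rate, date.fromordinal(round(root_ordinal))
```

With real data the points scatter around the line, so the estimated root date carries uncertainty that this sketch does not report.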
For each lineage, we draw a rectangle to summarize the range of sample collection dates, and colour it according to the geographic region in which it was sampled most often. To explore the samples within a lineage, click on the label (e.g., "B.4") or the rectangle to retrieve the associated beadplot.
We use beadplots to visualize the different variants of SARS-CoV-2 within a lineage, where and when they have been sampled, and how they are related to each other. Every object in the beadplot has additional info in a tooltip (which you view by hovering over that object with your mouse pointer).
Each horizontal line segment represents a variant – viruses with identical genomes. We draw beads along a line to indicate when that variant was sampled. If there are no beads on the line and it is grey, then it is an unsampled variant: two or more sampled variants descend from an ancestral variant that has not been directly observed.
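As a rough illustration of what "variant" means here, the following Python sketch groups samples with identical genome sequences and counts how many were collected on each date. The input format and function name are assumptions made for this example, not the data structures used by CoVizu.

```python
from collections import Counter, defaultdict


def group_into_variants(samples):
    """Group samples by identical genome.

    samples: iterable of (sample_name, collection_date, genome) tuples.
    Returns a dict mapping each distinct genome to a Counter of
    collection dates, i.e. one horizontal line with its beads.
    Hypothetical helper for illustration only.
    """
    variants = defaultdict(Counter)
    for _name, collection_date, genome in samples:
        variants[genome][collection_date] += 1
    return variants
```

Each key of the result corresponds to one horizontal line, and each entry of its counter corresponds to one bead on that line.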
The area of each bead is scaled in proportion to the number of times the variant was sampled that day. This is important for rapid or intensively sampled epidemics, e.g., lineage D.2 in Australia. Beads are coloured with respect to the most common geographic region of the samples.
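Scaling by area rather than by radius means the radius of a bead grows with the square root of the sample count. A minimal Python sketch, in which the base radius of 2 units for a single sample is an arbitrary assumption:

```python
import math


def bead_radius(count, base_radius=2.0):
    """Radius of a bead whose area is proportional to the sample count.

    A bead for one sample has radius base_radius (an arbitrary choice
    for this example); a bead for `count` samples has `count` times
    that area.
    """
    if count < 1:
        raise ValueError("a bead represents at least one sample")
    return base_radius * math.sqrt(count)
```

A variant sampled four times on one day therefore gets a bead with twice the radius, and four times the area, of a variant sampled once.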
We draw vertical line segments to connect variants to their common ancestors. These relationships are estimated by the neighbor-joining method using RapidNJ. Tooltips for each edge report the number of genetic differences (mutations) between ancestor and descendant as the "genomic distance". Since it's difficult to reconstruct exactly when these mutations occurred, we simply map each line to the date when the first sample was collected.
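One simple way to think about the genomic distance is as the number of positions at which two aligned genomes differ. The Python sketch below counts such positions. It assumes the sequences are already aligned to the same length and, as a further assumption for this example, skips positions where either sequence has an ambiguous or missing base; the distances reported in the tooltips may be computed differently.

```python
def genomic_distance(ancestor, descendant):
    """Count differing positions between two aligned genome sequences.

    Positions where either sequence is not one of A, C, G or T
    (e.g. N or a gap) are ignored. Hypothetical helper for
    illustration only.
    """
    if len(ancestor) != len(descendant):
        raise ValueError("sequences must be aligned to the same length")
    bases = set("ACGT")
    return sum(
        1
        for a, b in zip(ancestor.upper(), descendant.upper())
        if a in bases and b in bases and a != b
    )
```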
Since there is an overwhelming number of sampled infections that we are trying to visualize here, we have built a basic search interface that you can interact with using the inputs at the top of this web page.
You can use the text box to find a specific sample by GISAID accession number. If you start to enter an accession number, the text box will display a number of possibilities (autocompletion). You can also search samples by substring (case-sensitive). For example, searching for "Madaga" (press Enter to submit) will jump to the first lineage that contains a sample from Madagascar.
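The two search modes can be pictured with the following Python sketch: prefix matching for autocompletion of accession numbers, and a case-sensitive substring match over sample names. The function names, the data layout, and the limit of five suggestions are all assumptions made for this example and do not describe the site's actual implementation.

```python
def autocomplete(prefix, accessions, limit=5):
    """Return up to `limit` accession numbers that start with `prefix`."""
    return sorted(a for a in accessions if a.startswith(prefix))[:limit]


def find_lineages(query, samples_by_lineage):
    """Return lineages containing a sample whose name includes `query`.

    The match is case-sensitive, so "Madaga" and "madaga" give
    different results. samples_by_lineage maps a lineage label to a
    list of sample names; lineages are returned in dictionary order.
    """
    return [
        lineage
        for lineage, names in samples_by_lineage.items()
        if any(query in name for name in names)
    ]
```

In this picture, the first element of the list returned by find_lineages is the lineage the page jumps to, and the remaining elements are what the "Previous" and "Next" buttons step through.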
Use the "Previous" and "Next" buttons to iterate through your search results, and the "Clear" button to reset the search interface.
We would like to thank the GISAID Initiative and are grateful to all of the data contributors, i.e. the Authors, the Originating laboratories responsible for obtaining the specimens, and the Submitting laboratories for generating the genetic sequence and metadata and sharing via the GISAID Initiative, on which this research is based.
Elbe, S., and Buckland-Merrett, G. (2017)
Data, disease and diplomacy: GISAID’s innovative contribution to global health.
Global Challenges, 1:33-46.
DOI: 10.1002/gch2.1018
PMID: 31565258
Note: When using results from these analyses in your manuscript, ensure that you also acknowledge the Contributors of data, i.e. “We gratefully acknowledge all the Authors, the Originating laboratories responsible for obtaining the specimens, and the Submitting laboratories for generating the genetic sequence and metadata and sharing via the GISAID Initiative, on which this research is based.”
Also, cite the following reference:
Shu, Y., and McCauley, J. (2017) GISAID: From vision to reality. Eurosurveillance, 22(13).
DOI: 10.2807/1560-7917.ES.2017.22.13.30494
PMCID: PMC5388101