Emergence of SARS-CoV-2 through recombination and strong purifying selection
The origin of SARS-CoV-2, the causative agent of the current pandemic, remains a mystery, partly because our fundamental knowledge of the natural diversity of coronaviruses is still limited. The origin of a pathogen can be revealed by comparing the information contained in its genome with that of other known genomes. This article shows that the SARS-CoV-2 genome not only shares 96.3% genetic similarity with RaTG13, a coronavirus isolated from bats in 2013, but also has a region that is genetically similar to coronaviruses found in pangolins. That is to say, the agent of COVID-19 found in humans appears to be a hybrid.
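To give a rough idea of what a figure such as 96.3% means, the sketch below computes the percentage of matching positions between two sequences that have already been aligned. It is only an illustration: the function name `percent_identity` and the two short sequences are invented for this example and are not real viral data, and the study itself may have measured similarity differently.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two sequences carry the same base.

    Assumes the sequences are already aligned (same length, '-' marks a gap).
    Positions where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped positions
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy sequences (hypothetical, 20 bases, differing at the last position only).
human_like = "ATGGCTAGCTAGGATCCGTA"
bat_like = "ATGGCTAGCTAGGATCCGTT"
similarity = percent_identity(human_like, bat_like)
print(f"{similarity:.1f}% identical")
```

On genomes of about 30,000 bases, the same calculation is run over the whole alignment, so even a few percent of difference corresponds to many hundreds of changed positions.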
One of the signs of hybridization, which in genetics is called recombination, occurs in the region of the genome that codes for the spike protein. Coronaviruses are named after the crown formed by this protein, which is also the protein that, by binding to the ACE2 receptor, allows the virus to enter human cells and infect them. The recombination signal is located specifically in the part of the genome that codes for the receptor-binding region of the spike protein. In this region, the SARS-CoV-2 genome is no longer most similar to the virus present in bats and instead resembles the viruses found in pangolins. Genetic signatures like this reveal that the evolution of the ancestors of SARS-CoV-2 was marked by frequent recombination, and that recombination events in the spike gene can lead to considerable changes in its ability to bind to our cells, as predicted by structural biology modelling. On the other hand, the level of variation caused by mutations in this gene is restricted, which is consistent with the importance of the spike gene and other regions of the genome for the biology of the virus. When a gene loses its function, it can accumulate all kinds of changes without effect, and the observed diversity is not constrained. Therefore, the restricted diversity of the spike gene means that this gene is subject to strong selection.
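A recombination signal of this kind can be pictured as a change in the "closest relative" along the genome. The sketch below is a simplified illustration rather than the method used in the study: it slides a window along three aligned sequences and reports which reference is most similar to the query in each window. The function names and the short sequences are hypothetical and were built so that the middle segment matches the pangolin-like reference.

```python
def identity(a: str, b: str) -> float:
    """Percentage of matching positions between two equal-length strings."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def closest_reference_per_window(query, references, window, step):
    """For each window, return (start, closest reference name, all scores).

    Assumes all sequences are aligned to the same length and contain no gaps.
    Ties are resolved in favour of the reference listed first.
    """
    if any(len(seq) != len(query) for seq in references.values()):
        raise ValueError("all sequences must have the same aligned length")
    results = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        scores = {
            name: identity(query[start:end], seq[start:end])
            for name, seq in references.items()
        }
        best = max(scores, key=scores.get)
        results.append((start, best, scores))
    return results


# Toy aligned sequences (invented for illustration, not real viral data).
query = "ACGTACGTAC" + "GGCCTTAAGG" + "TTGACCATGA"
references = {
    "bat": "ACGTACGTAC" + "GGATTCAATG" + "TTGACCATGA",
    "pangolin": "ACTTACGAAC" + "GGCCTTAAGG" + "TTGTCCATCA",
}

profile = closest_reference_per_window(query, references, window=10, step=10)
for start, best, scores in profile:
    print(start, best, scores)
```

In this toy case the closest reference switches from the bat-like sequence to the pangolin-like one in the middle window and back again, which is the kind of pattern that suggests a recombined segment. Real analyses use much longer windows, statistical tests and phylogenetic trees to rule out other explanations.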
In conclusion, this comparative genomics article, along with previous literature on coronaviruses and more recent publications that have assessed the function of the various forms of spike found in the natural diversity of these viruses, suggests that recombination and purifying selection, which removes disadvantageous mutations, are two common processes in the evolutionary history of SARS-CoV-2. These processes must be taken into account when considering the origin and evolution of the current pandemic and the prevention of future epidemics.
Contributing scientist: Isabel Gordo
Translation: Joana Saraiva