This article is part of an ongoing series investigating the link between chronic infections and the emergence of new SARS-CoV-2 variants. 

In the spring of 2022, a group of researchers at the Yale School of Public Health happened upon a mystery: their genomic surveillance dataset was picking up a SARS-CoV-2 lineage, B.1.517, thought to have gone extinct in the United States and globally as far back as April of 2021. How had this supposedly dead strain made its way into their database?

The researchers traced the sequences to an immunocompromised individual in Connecticut, USA. They discovered that this patient, who was battling lymphoma and had previously undergone a stem cell transplantation, was suffering from a chronic SARS-CoV-2 infection — more than 470 days after initially contracting Covid-19, the virus continued to circulate throughout their body.

As part of the Yale SARS-CoV-2 Genomic Surveillance Initiative, which was established with the emergence of the Alpha variant, 30 nasal swabs had been collected from the immunocompromised patient between February 2021 and March 2022. This allowed Chaguza et al. to derive whole-genome sequences of the virus, giving them a sense of how it evolved over the course of the infection. Their findings are published on the preprint server medRxiv and offer deep insights into the potential origins of SARS-CoV-2 variants.

Although rare within the global population as a whole, chronic infections of this type are not uncommon amongst the immunocompromised community. Often, the immune systems of such patients cannot clear the virus and end the infection, meaning the virus continues to replicate for weeks, months, or even years. The danger? As the virus remains within the body, it has the time to adapt to the weakened immune system of the host, developing new mutations that better equip it against host defenses.

This is precisely what Chaguza and colleagues saw unfold in the Connecticut patient. They tested a subset of 12 nasal swabs for viral load (the amount of virus present in the body) and found that the individual had high levels of infectious viral copies for nearly the entirety of their infection (Figure 1). Not only does this confirm long-term viral replication, it also suggests the patient may have been able to transmit the virus to others for the duration of their infection. Added to this risk is the fact that, barring a first week of mild respiratory symptoms, the patient remained asymptomatic. This means immunocompromised patients could very easily remain unaware of their chronic SARS-CoV-2 infection, continue interacting with others as usual, and transmit the virus all the while.

Timeline of chronic infection
FIGURE 1. (A) Timeline showing clinical history of the patient from the earliest time they tested negative for SARS-CoV-2, the first positive test following household exposure by a symptomatic household contact who tested positive two days prior, until the last sampling point. Note that collection of samples was stopped due to the deteriorating condition of the patient, but the infection had not yet cleared. (B) Nasal swab RT-PCR cycle threshold (Ct) values for the samples available for whole genome sequencing showing high viral RNA copy numbers. Additionally, virus infectivity assays performed for selected samples revealed infectious virus at most sampling points. FROM: “ACCELERATED SARS-COV-2 INTRAHOST EVOLUTION LEADING TO DISTINCT GENOTYPES DURING CHRONIC INFECTION” CHAGUZA ET AL. 2022

A second crucial finding of the study is that the evolutionary rate of the virus, which describes the speed at which it mutates, proved to be significantly higher in the immunocompromised patient than in the general population — roughly twice as fast as the average global SARS-CoV-2 evolutionary rate.

Predictably, this goes hand in hand with the emergence of new variants. The researchers witnessed the formation of three distinct viral genotypes over the duration of the chronic infection, all of which had upwards of ten amino acid mutations (Figure 2). These persisted for extended periods of time, implying they were advantageous to viral fitness, likely improving immune escape. Some of the mutations that kept popping up had already been seen in other variants of concern, for example the spike protein substitution E484K, which is found in the Beta, Gamma, Eta, Iota, and Mu variants. This points towards a third crucial finding: a single chronic infection can give rise to multiple unique variants.

Phylogenetic tree of the three distinct genotypes that emerged during infection
FIGURE 2. (C) Time-resolved phylogeny of the chronic infection samples with branch lengths scaled by the number of days since the first positive RT-PCR SARS-CoV-2 test. (D) Maximum-likelihood phylogeny of the chronic B.1.517 samples showing branch lengths scaled by the genetic divergence expressed as the number of accrued substitutions over time. The phylogeny shows the intrahost emergence and persistence of multiple divergent genotypes. FROM: CHAGUZA ET AL. 2022

Curiously, the three genotypes remained present in the immunocompromised patient at the same time, rather than replacing one another (Figure 2). This suggests they may have been inhabiting different tissues or cells within the body, allowing them to coexist. Still, the genotypes would switch in and out, taking turns being the dominant one: genotype 1 remained dominant from day 79 to day 247, followed by a volatile period during which the three genotypes frequently jumped back and forth in dominance. For example, in the span of only 100 days (day 281 to day 381), genotype 1 and genotype 2 alternated between being the dominant genotype a total of five times. During the later stages of infection, genotype 3 briefly rose to dominance, only to be replaced again by genotype 2.

Although unexpected, this pattern is not entirely unprecedented. A similar scenario was observed in a study of an immunocompromised patient in London, England. For the first 57 days of infection, there was little change to the overall structure of the SARS-CoV-2 population. Then, following treatment with a highly potent preparation of anti-SARS-CoV-2 antibodies from three different patients, a dominant viral genotype suddenly emerged. This lasted until the patient’s antibody levels began dropping again, at which point the virus with the immune escape genotype began to fade. It returned full force during a final, unsuccessful course of antibody treatment.

The simultaneous circulation of different genotypes presents the additional risk of recombination, a process whereby different viral strains exchange genetic information, creating new opportunities to overcome selective pressures. Although no recombination was seen in this particular case, we know that it is common amongst coronaviruses. When these viruses leap from other animals into humans in particular, the defining changes often happen via recombination, and SARS-CoV-2 is no exception. This is because recombination acts as a very quick way of sharing vast amounts of genetic information, helping viruses to diversify their genomes in large strides rather than small steps. Such large jumps make it extremely difficult for our immune system to keep up, with prior infection offering little guidance.

Another surprising finding: the Spike protein did not have the highest frequency of nonsynonymous changes. Synonymous changes are minor mutations to the viral genome that do not alter the amino acid sequence of a protein; loosely, they do not impact the shape and function of a protein. Nonsynonymous changes, on the other hand, are mutations that do alter the amino acid sequence. The higher the frequency of nonsynonymous changes, the stronger the selective pressure on that protein. Often there is high selective pressure on the gene encoding the Spike protein, since any advantageous mutations will lead to higher infectivity. But instead of Spike, the accessory protein Orf10 saw the highest frequency of nonsynonymous changes, followed by the accessory protein Orf6 and the envelope protein. Orf6 is closely linked to immune evasion, and Orf10 is closely linked to immune suppression. This suggests that mutations to these genes may confer an additional advantage; this may simply be a special property of immune selection in immunocompromised persons, but it may also reflect important contributions of these genes to overall viral fitness in a broader population.
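To make the distinction between synonymous and nonsynonymous changes concrete, here is a minimal Python sketch. It is not the method used by Chaguza et al.; it only illustrates the definition using the standard genetic code. The function name `classify_change` and the example codons are illustrative assumptions. In particular, GAA to AAA is used only because it is one codon change that turns glutamic acid (E) into lysine (K), the same kind of substitution as E484K.

```python
from itertools import product

# Standard genetic code, written with T instead of U as is usual for
# sequencing data. Codons are ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(ref_codon, alt_codon):
    """Label a codon change (hypothetical helper, for illustration only)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    # Same amino acid before and after: the protein sequence is unchanged.
    return "synonymous" if ref_aa == alt_aa else "nonsynonymous"


# GAA and GAG both encode glutamic acid (E), so the protein is unchanged.
print(classify_change("GAA", "GAG"))
# GAA (E) to AAA (K) swaps the amino acid, an E-to-K substitution.
print(classify_change("GAA", "AAA"))
```

Counting how many observed changes in a gene fall into each category, relative to the gene's length, is the kind of tally that underlies the comparison between Spike, Orf10, Orf6, and the envelope protein described above.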

The full list of nonsynonymous changes in each of the three genotypes can be seen below, in Figures 3, 4, and 5.

Comparison of the mutations between the three genotypes that emerged during infection
FIGURE 3. Mutations to the SARS-CoV-2 genome seen in genotype 1 (V1/blue), genotype 2 (V2/pink), and genotype 3 (V3/orange). “V0” refers to the genotype derived from the earliest samples, which is not quite the canonical B.1.517 lineage as it already contained a few extra mutations. SOURCE: ACCESS HEALTH INTERNATIONAL

Mutations to the Spike protein
FIGURE 4. Mutations to the SARS-CoV-2 Spike protein in genotype 1 (V1/blue), genotype 2 (V2/pink), and genotype 3 (V3/orange). SOURCE: ACCESS HEALTH INTERNATIONAL

Venn diagram of mutations
FIGURE 5. A schematic of the overlap in mutations between the three novel genotypes found in the immunocompromised patient: genotype 1 (V1/blue), genotype 2 (V2/lilac), and genotype 3 (V3/orange). SOURCE: ACCESS HEALTH INTERNATIONAL

Take-Aways

The study by Chaguza et al. supports the hypothesis that chronic infection of immunocompromised individuals may be one of the primary vectors for the emergence of novel, unpredictable variants. Their study adds to a long list documenting similar cases: 24 confirmed occurrences so far, though the actual number is likely to be much higher. I have analyzed a few of these, including the Boston, Pittsburgh, Italy, and Austria examples.

But how big a problem is this really? So far there have been 550 million confirmed Covid-19 cases. Realistically, this is a vast underestimate. The actual number likely sits somewhere between 3 and 5 billion. Even if only 1% of those infected are immunocompromised, that leaves us with around 30 to 50 million individuals susceptible to chronic infection and, by extension, the incubation of new variants. There are around 37.7 million people living with HIV alone, not to mention other immunocompromised communities including cancer patients, organ transplant recipients, and those suffering from autoimmune disorders.
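The back-of-the-envelope figure above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic: the infection estimates and the 1% share are the values quoted in the text, and the function name `susceptible_range` is an illustrative assumption.

```python
# Figures quoted in the text; the 1% share is an illustrative assumption,
# not a measured prevalence.
ESTIMATED_INFECTIONS = (3_000_000_000, 5_000_000_000)  # low and high estimates
IMMUNOCOMPROMISED_PERCENT = 1


def susceptible_range(infection_range, percent):
    """Return (low, high) counts for the given whole-number percentage."""
    low, high = infection_range
    # Integer arithmetic avoids floating-point rounding on large counts.
    return low * percent // 100, high * percent // 100


low, high = susceptible_range(ESTIMATED_INFECTIONS, IMMUNOCOMPROMISED_PERCENT)
print(f"{low:,} to {high:,} people")
```

Changing `IMMUNOCOMPROMISED_PERCENT` shows how sensitive the estimate is to that single assumption: at 2%, the range doubles.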

We need to make sure that we prioritize the treatment of immunocompromised patients, helping them clear their infection as quickly as possible. We also need to double down on global SARS-CoV-2 surveillance, particularly whole-genome sequencing. Without a solid surveillance infrastructure in place, we are blind to what may be lurking around the corner.

Both Alpha and Omicron are presumed to have come to us from chronic infections. Let’s not make the same costly mistake again.