An often overlooked aspect of SARS-CoV-2 variation is the potential impact single mutations can have on characteristics like transmission and virulence. One variant may differ from another by a single change that makes it far more dangerous to the global population. Among the first major mutations to emerge, and still widespread two years into the Covid-19 pandemic, are the N protein mutations R203K and G204R.
These mutations were first assumed to have emerged in late 2020 with the Alpha variant, originally denoted B.1.1.7. However, analysis of sequenced viruses in the GISAID SARS-CoV-2 database shows that R203K and G204R were first observed as early as spring 2020. Today, these two mutations are found in 4.8 million of the 10.4 million viruses in the GISAID database (about 46%), indicating that nearly half of sequenced SARS-CoV-2 infections involve these mutations.
N Protein Mutations In Saudi Arabia
A recent study by Mourier et al. delved deeper into why these two N protein mutations are so widespread. Their observations centered on the Kingdom of Saudi Arabia; they hypothesized that the relatively high level of population movement, particularly around religious mass gatherings, could make the country a breeding ground for SARS-CoV-2. Their observations were conducted from March to August 2020, when little was known about the virus or the variants that would follow.
Mourier et al. collected 892 SARS-CoV-2 samples via nasopharyngeal swabs from patients in Jeddah, Makkah, Madinah, Riyadh, and the east coast of Saudi Arabia. They then performed phylogenetic analysis to ascertain the genetic diversity of the samples. To my knowledge, this is one of the earliest large-scale examinations of genetic diversity in collected samples during the Covid-19 pandemic.
Across the 892 samples, Mourier et al. noted 836 nucleotide changes relative to the Wuhan wild-type. The vast majority of these changes were found in only a single sample and were generally disregarded; some, however, were more frequent than others. Specifically, the researchers noted a high frequency of A23403G, resulting in a mutation from aspartic acid to glycine at position 614 (D614G) in the Spike protein, and of three consecutive nucleotide mutations, G28881A, G28882A, and G28883C, resulting in mutations from arginine to lysine at position 203 (R203K) and glycine to arginine at position 204 (G204R) in the N protein.
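To see how three adjacent nucleotide changes produce exactly two amino acid changes, the mapping from genome position to codon can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline. The gene start coordinates and the reference codons are assumptions taken from the commonly used Wuhan-Hu-1 reference genome rather than from the study, and the function name `apply_substitutions` is hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: 1-based start coordinates of the Spike (S) and N genes in the
# Wuhan-Hu-1 reference genome. These are not stated in the article.
ORF_START = {"S": 21563, "N": 28274}

# ASSUMPTION: reference codons at the positions discussed in the text.
REF_CODONS = {("S", 614): "GAT", ("N", 203): "AGG", ("N", 204): "GGA"}


def apply_substitutions(gene, substitutions):
    """Apply (genome_position, new_base) substitutions to a gene and
    return the resulting amino acid changes, e.g. ["R203K", "G204R"]."""
    mutated = {}
    for position, new_base in substitutions:
        offset = position - ORF_START[gene]   # 0-based offset within the gene
        codon_number = offset // 3 + 1        # 1-based amino acid position
        index = offset % 3                    # position within the codon
        codon = mutated.get(codon_number, REF_CODONS[(gene, codon_number)])
        mutated[codon_number] = codon[:index] + new_base + codon[index + 1:]
    changes = []
    for codon_number, codon in sorted(mutated.items()):
        before = CODON_TABLE[REF_CODONS[(gene, codon_number)]]
        after = CODON_TABLE[codon]
        changes.append(f"{before}{codon_number}{after}")
    return changes


print(apply_substitutions("N", [(28881, "A"), (28882, "A"), (28883, "C")]))
# ['R203K', 'G204R']
print(apply_substitutions("S", [(23403, "G")]))
# ['D614G']
```

Under these assumptions, the first two substitutions fall in codon 203 and the third falls in the first base of codon 204, which is why three consecutive nucleotide changes yield two adjacent amino acid changes.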
N Protein Functional Characteristics
The high prevalence of R203K and G204R prompted a closer examination of the mutations' functional effects and their impact on virological characteristics.
Mourier et al. first examined the viral load implications of the N protein mutations. Using cycle threshold values obtained through PCR analysis of the samples, they found that samples including the two N protein mutations displayed a 33% increase in viral load over samples excluding the mutations. They also noted that patients with severe symptoms were more often infected with viruses carrying R203K and G204R, implicating higher viral load as one reason for more severe Covid outcomes.
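For readers unfamiliar with cycle threshold (Ct) values, a lower Ct means more viral RNA was present at the start of the reaction, because fewer amplification cycles were needed to detect it. The rough sketch below shows the textbook relationship between a Ct difference and a fold change in starting material. It assumes idealized amplification in which the template doubles every cycle; the function names and the example Ct values are hypothetical, and the study's own method of estimating viral load may differ.

```python
import math


def fold_change_from_ct(ct_reference, ct_sample, efficiency=1.0):
    """Relative starting template implied by a Ct difference.

    efficiency=1.0 is the idealized case of perfect doubling per cycle
    (an assumption; real assays are usually somewhat less efficient).
    """
    return (1.0 + efficiency) ** (ct_reference - ct_sample)


def ct_shift_for_fold_change(fold_change, efficiency=1.0):
    """Ct difference that corresponds to a given fold change."""
    return math.log(fold_change) / math.log(1.0 + efficiency)


# Hypothetical example: a sample detected one cycle earlier than the
# reference started with about twice as much template.
print(fold_change_from_ct(25.0, 24.0))           # 2.0

# A 33% increase (1.33-fold) corresponds to a shift of under half a cycle.
print(round(ct_shift_for_fold_change(1.33), 2))  # 0.41
```

Under this idealized model, a 33% difference in viral load corresponds to a small Ct shift, which is one reason such effects are easier to detect across hundreds of samples than in any single patient.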
They next examined oligomerization potential and RNA-binding affinity, noting that the mutations lie in the N protein linker region, which is involved in N protein oligomerization. Oligomerization is key to the N protein's role in binding and packaging the full-length viral RNA. Mourier et al. found that the mutant N protein had a higher oligomerization potential than the non-mutant protein. As oligomerization is involved in viral RNA interactions, they used an in vitro assay to examine binding affinity. They noted a significantly stronger RNA-binding affinity in the mutant than in the non-mutant protein, suggesting increased efficiency of N protein functions.
The N protein is multifunctional and pleiotropic. Among its many functions are packaging of RNA for import to and export from the nucleus, synthesis of genomic RNA, cell cycle manipulation of the host cell, and suppression of interferon responses, to name a few.
To begin to explore the effect of the N protein mutations on non-packaging functions, Mourier et al. then examined the mutant N protein interactions with host proteins. Using a mass spectrometry analysis, they found that of the 43 proteins that displayed significant differential interactions with the SARS-CoV-2 N protein, 42 showed increased interaction with the mutant protein. They found that many of the 42 host proteins are involved in significant immune processes, including viral processing, regulation of RNA nuclear export, apoptosis, and immune regulation.
One characteristic that sets the N protein apart from most others is that it is phosphorylated, and the degree of phosphorylation can dramatically affect its function. For this reason, Mourier et al. examined the phosphorylation of the mutant protein. They noted that S206 is highly phosphorylated in the mutant N protein as compared to the wild-type N protein. We emphasize that S206 is unmutated and is only adjacent to the mutant amino acids. The authors speculate that the changes at positions 203 and 204 increase kinase affinity, leading to greater N protein phosphorylation, which is positively correlated with the efficiency of viral genome processing and nucleocapsid assembly. A highly phosphorylated S206, which lies in the critical linker region, likely improves the mutants’ viral fitness.
Finally, Mourier et al. examined the relationship between the N protein mutations and pathogenesis. They found that the mutant N protein upregulates over 100 interferon-related genes in infected host cells. Interferon genes allow for communication between cells to trigger protective defenses by the immune system to eradicate pathogens. When these defenses overreact, the result can be events like cytokine storms, which are floods of circulating cytokines and activated immune cells that lead to severe disease symptoms. The mutant N protein is therefore more likely to induce severe disease as a result of this upregulation.
These results align with a study from early November 2021 by Syed et al. that examined N protein mutations in the Delta variant. Using virus-like particles in a lab setting, Syed et al. analyzed dozens of N protein mutations for impacts on transmissibility and virulence. Their examination also highlighted the linker region, wherein they noted that all natural variants of concern or interest contained at least one amino acid mutation between positions 199 and 205. One such mutation, S202R, resulted in 166-fold higher infectious titers as compared to the wild-type. While R203K and G204R do not impact viral efficiency to this extent, the congruence between these two studies is notable.
The N protein mutations are but one example among many. The SARS-CoV-2 genome is roughly 30,000 nucleotides long, and a mutation can arise at any one of those positions. While most mutations will not make much of an impact, many can and do. Using the sort of analyses that Mourier et al. conducted, we can monitor emerging variants for dangerous mutations to estimate their transmissibility, virulence, and pathogenesis before they begin to impact the population, and then adjust our Covid counterstrategies accordingly.