CpG and UpA dinucleotides are under-represented in vertebrate genomes, whereas most invertebrates only show a bias against UpA. RNA viruses are thought to have evolved genomes that resemble the dinucleotide composition of their hosts, possibly to avoid restriction by the zinc-finger antiviral protein (ZAP). By performing a comprehensive analysis of RNA viruses, we show that, whereas UpA dinucleotides are similarly under-represented irrespective of viral genome composition or host, important differences are observed for CpG. The tendency for vertebrate-infecting viruses to have stronger CpG bias than invertebrate-infecting viruses is not universal. Rather, it is mainly driven by single-stranded (ss) RNA(+) viruses. Conversely, ssRNA(−) viruses have a dinucleotide composition that is unrelated to the host clade. Also, these viruses, especially those in the order Bunyavirales, are extremely CpG-depleted. By focusing on specific viral families, we also show that, even for vertebrate ssRNA(+) viruses, ZAP is unlikely to be a driver of CpG depletion. Consistently, CpG dinucleotides tend to be preferentially depleted in A/U-rich contexts in both vertebrate- and invertebrate-infecting viruses. Finally, within the same viral genomes, individual viral open reading frames (ORFs) can display different CpG content. Analysis of SARS-CoV-2 revealed a remarkable depletion of CpG dinucleotides in ORF1ab and S, but not in N and M. Thus, these results do not support the view that an adaptive shift for CpG depletion in the SARS-CoV-2 lineage occurred as an innate immunity evasion strategy. Our data provide a better understanding of viral evolution and inform approaches based on the modulation of CpG to generate attenuated viruses.

IMPORTANCE Akin to a molecular signature, dinucleotide composition can be exploited by the zinc-finger antiviral protein (ZAP) to restrict CpG-rich (and UpA-rich) RNA viruses. ZAP evolved in tetrapods, and it is not encoded by invertebrates and fish. Because a systematic analysis is missing, we analyzed the genomes of RNA viruses that infect vertebrates or invertebrates. We show that vertebrate single-stranded (ss) RNA(+) viruses and, to a lesser extent, double-stranded RNA viruses tend to have stronger CpG bias than invertebrate viruses. Conversely, ssRNA(−) viruses have similar dinucleotide composition whether they infect vertebrates or invertebrates. Analysis of ssRNA(+) viruses that infect mammals, reptiles, and fish indicated that ZAP is unlikely to be a major driver of CpG depletion. We also show that, compared to other coronaviruses, the genome of SARS-CoV-2 is not homogeneously CpG-depleted. Our study provides new insights into virus evolution and strategies for recoding RNA virus genomes.
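The abstract does not spell out how CpG or UpA bias is measured. A common convention, assumed here rather than taken from the study, is the observed/expected ratio: the frequency of a dinucleotide divided by the product of the frequencies of its two bases, with values below 1 indicating under-representation. The sketch below illustrates that ratio on a single strand; the function name `dinucleotide_odds_ratio` and the toy sequence are hypothetical.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq: str, dinuc: str) -> float:
    """Observed/expected ratio of `dinuc` in `seq` (single strand only).

    Assumed measure: f(XY) / (f(X) * f(Y)). This is a common convention,
    not necessarily the exact statistic used in the study.
    """
    # Treat DNA and RNA alphabets alike by mapping T to U.
    seq = seq.upper().replace("T", "U")
    dinuc = dinuc.upper().replace("T", "U")
    if len(dinuc) != 2:
        raise ValueError("dinuc must contain exactly two bases")
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")

    base_counts = Counter(seq)
    # Count overlapping occurrences over the n - 1 dinucleotide positions.
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)

    f_xy = observed / (n - 1)
    f_x = base_counts[dinuc[0]] / n
    f_y = base_counts[dinuc[1]] / n
    if f_x == 0 or f_y == 0:
        # The ratio is undefined when one of the two bases is absent.
        return float("nan")
    return f_xy / (f_x * f_y)


if __name__ == "__main__":
    toy = "AUGGCUACGAUUAGCCGAUAUGCA"  # hypothetical sequence, not real data
    for pair in ("CG", "UA"):
        print(pair, round(dinucleotide_odds_ratio(toy, pair), 3))
```

Applying the same function to the coding sequence of each ORF separately, rather than to the whole genome, is one way to compare CpG content between ORFs of the same virus.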
Forni, D., Pozzoli, U., Cagliani, R., Clerici, M., Sironi, M. (2023). Dinucleotide biases in RNA viruses that infect vertebrates or invertebrates. Microbiology Spectrum, 11(6). doi:10.1128/spectrum.02529-23
Dinucleotide biases in RNA viruses that infect vertebrates or invertebrates
Sironi M
2023
File | Size | Format |
---|---|---|
Forni-2023-Microbiology Spectrum-VoR.pdf (open access) | 5.81 MB | Adobe PDF |

Description: CC BY 4.0. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license.
Attachment type: Publisher’s Version (Version of Record, VoR)
License: Creative Commons
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.