Forni, D., Pozzoli, U., Cagliani, R., Sironi, M. (2024). Dinucleotide biases in the genomes of prokaryotic and eukaryotic dsDNA viruses and their hosts. MOLECULAR ECOLOGY, 33(6) [10.1111/mec.17287].

Dinucleotide biases in the genomes of prokaryotic and eukaryotic dsDNA viruses and their hosts

Sironi M
2024

Abstract

The genomes of cellular organisms display CpG and TpA dinucleotide composition biases. Such biases have been poorly investigated in dsDNA viruses. Here, we show that in dsDNA virus, bacterial, and eukaryotic genomes, the representation of TpA and CpG dinucleotides is strongly dependent on genomic G + C content. Thus, the classical observed/expected ratios do not fully capture dinucleotide biases across genomes. Because a larger portion of the variance in TpA frequency was explained by G + C content, we explored which additional factors drive the distribution of CpG dinucleotides. Using the residuals of the linear regressions as a measure of dinucleotide abundance and ancestral state reconstruction across eukaryotic and prokaryotic virus trees, we identified an important role for phylogeny in driving CpG representation. Nonetheless, phylogenetic ANOVA analyses showed that a few host associations also account for significant variations. Among eukaryotic viruses, the most significant differences were observed between arthropod-infecting viruses and viruses that infect vertebrates or unicellular organisms. However, an effect of viral DNA methylation status (either driven by the host or by viral-encoded methyltransferases) is also likely. Among prokaryotic viruses, cyanobacteria-infecting phages were significantly CpG-depleted, whereas phages that infect bacteria in the genera Burkholderia and Staphylococcus were CpG-rich. Comparison with bacterial genomes indicated that this effect is largely driven by the general tendency for phages to resemble the host's genomic CpG content. Notably, this tendency is stronger for temperate than for lytic phages. Our data shed light on the processes that shape virus genome composition and inform manipulation strategies for biotechnological applications.
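The two measures mentioned in the abstract can be illustrated with a minimal sketch. It assumes the classical observed/expected ratio is the dinucleotide frequency divided by the product of the two mononucleotide frequencies, and that the residuals come from a simple least-squares regression of dinucleotide frequency on G + C content. The exact model used in the paper is not stated in this record, and the function names `dinucleotide_stats` and `regression_residuals` are hypothetical.

```python
from collections import Counter


def dinucleotide_stats(seq: str, dinuc: str = "CG") -> dict:
    """Return G+C content, dinucleotide frequency, and observed/expected ratio.

    Assumption: expected frequency = f(first base) * f(second base).
    """
    seq = seq.upper()
    n = len(seq)
    base_counts = Counter(seq)
    # Count overlapping occurrences of the dinucleotide along the sequence.
    pairs = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    gc = (base_counts["G"] + base_counts["C"]) / n
    freq = pairs / (n - 1)
    expected = (base_counts[dinuc[0]] / n) * (base_counts[dinuc[1]] / n)
    oe = freq / expected if expected > 0 else float("nan")
    return {"gc": gc, "freq": freq, "oe": oe}


def regression_residuals(x: list, y: list) -> list:
    """Residuals of an ordinary least-squares fit y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


if __name__ == "__main__":
    # Toy sequences invented for illustration only; they are not real genomes.
    genomes = ["ACGTACGTTTAA", "GCGCGCATATGC", "TTTTAACGAATT", "GGCCGCGCGGCC"]
    stats = [dinucleotide_stats(g) for g in genomes]
    residuals = regression_residuals(
        [s["gc"] for s in stats], [s["freq"] for s in stats]
    )
    for s, r in zip(stats, residuals):
        print(f"GC={s['gc']:.2f} CpG freq={s['freq']:.3f} residual={r:+.3f}")
```

Under these assumptions, a positive residual marks a genome with more CpG than its G + C content alone would predict, and a negative residual marks relative depletion.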
Journal article - Scientific article
Keywords: CpG; dinucleotide composition; DNA virus; GC content; UpA
Language: English
Date: 23 Jan 2024
Year: 2024
Volume: 33
Issue: 6
Article number: e17287
Access: open
Files in this record:
File: Forni-2024-Molecular Ecology-VoR.pdf
Access: open access
Description: CC BY-NC 4.0. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License
Attachment type: Publisher's Version (Version of Record, VoR)
License: Creative Commons
Size: 7.17 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/509659
Citations
  • Scopus 1
  • Web of Science (ISI) 1