Comparative genomics of the Microviridae
The sequence of phage ΦCA82 was compared with those of 14 other members of the Microviridae obtained from the Integrated Microbial Genomes (IMG) system. To determine nucleotide-level similarities, tetranucleotide comparisons between genomes were first performed with JSpecies. Pairwise genome comparisons were based on regressions of normalized tetranucleotide frequency counts, and the distributions of the R² values from these comparisons were visualized in R. To compare genomes based on the similarity of predicted gene sequences, the program CD-HIT was used.
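The tetranucleotide comparison can be illustrated with a minimal Python sketch. It is not the JSpecies implementation, whose exact normalization is not described here: it assumes that "normalized" means counts divided by the total number of tetranucleotides, and it uses the squared Pearson correlation as the R² of a simple linear regression. The function names `tetra_freqs` and `r_squared` are hypothetical.

```python
from itertools import product

# All 256 possible tetranucleotides in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freqs(seq, circular=False):
    """Return normalized tetranucleotide frequencies as a list of 256 floats.

    Assumption: normalization is count / total counted windows.
    Windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    if circular:
        # Wrap the first three bases so windows spanning the origin are counted.
        seq = seq + seq[:3]
    counts = dict.fromkeys(TETRAMERS, 0)
    total = 0
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no valid tetranucleotides")
    return [counts[k] / total for k in TETRAMERS]


def r_squared(x, y):
    """R^2 of the simple linear regression of y on x (squared Pearson r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("frequency vector has zero variance")
    return (sxy * sxy) / (sxx * syy)
```

For two genome sequences `g1` and `g2`, `r_squared(tetra_freqs(g1, circular=True), tetra_freqs(g2, circular=True))` gives one pairwise value; repeating this over all genome pairs yields the kind of R² distribution described above.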
The entire circular, single-stranded nucleotide sequence of the uncultured microvirus ΦCA82 genome was determined to be 5,514 nucleotides. The complete genome sequence had a nucleotide composition of A (38.6%), C (19.6%), G (20.1%), and T (21.6%), with an overall G + C content of 39.7%, which is similar to that of the chlamydial phages (37-40%). The ΦCA82 genome was organized in a modular arrangement similar to that of other microviruses and encoded predicted proteins homologous to those of the chlamydial bacteriophages and of the Bdellovibrio bacteriovorus phage ΦMH2K. The genome encodes ten ORFs longer than 99 nucleotides, giving a coding capacity of 91%, similar to other chlamydial microvirus genomes. The genome size, number of ORFs, and percentage of coding nucleotides, as depicted in Figure 1, are larger than those of most chlamydial phages, and the genome is closer in size to that of ΦX174.
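As a worked check of the arithmetic, the following Python sketch derives the G + C content from the reported base composition and shows one way a coding capacity could be computed from ORF coordinates. The function names are hypothetical, and the sketch assumes that ORFs are given as 1-based inclusive (start, end) pairs on a linearized genome and that overlapping ORFs are counted once; the ORF coordinates of ΦCA82 are not given in the text, so none are used here.

```python
def gc_percent(composition):
    """G + C content (%) from a dict of per-base percentages."""
    return composition["G"] + composition["C"]


def coding_fraction(genome_length, orfs):
    """Fraction of the genome covered by at least one ORF.

    Assumptions: `orfs` holds 1-based inclusive (start, end) pairs with
    start <= end on a linearized genome; overlapping ORFs are merged so
    shared nucleotides are counted only once.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(orfs):
        if current_end is None or start > current_end + 1:
            # A new, non-overlapping block begins: close the previous one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent ORF: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered / genome_length


# Base composition as reported for the genome (percentages).
reported = {"A": 38.6, "C": 19.6, "G": 20.1, "T": 21.6}
print(round(gc_percent(reported), 1))  # 39.7
```

The reported C and G percentages sum to the stated G + C content of 39.7%. A call such as `coding_fraction(5514, orf_list)` would return the coding capacity as a fraction once actual ORF coordinates are supplied.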
A total of ten genes could be identified, of which only three gene products could be assigned a known function based on BLAST analysis. The predicted major capsid protein VP1, encoded by ORF1, belongs to the family of single-stranded bacteriophages and is the major structural component of the virion, which may contain as many as 60 copies of the protein. The 565 amino acid ΦCA82 VP1 protein was most similar in sequence to the Spiroplasma phage 4 (SpV4) capsid protein and the chlamydial phage VP1 proteins, as well as to the major capsid proteins of the Chlamydia prophage CPAR39 and the Bdellovibrio phage ΦMH2K. ORF2 encoded a putative minor capsid protein of 234 amino acids with similarity to a protein of the chlamydial bacteriophages and the Bdellovibrio phage ΦMH2K that was originally postulated to be an attachment protein.
Recent studies using a comparative metagenomic analysis of viral communities associated with marine and freshwater microbialites indicated that the identifiable sequences in these communities were dominated by single-stranded DNA microphages [25]. Partial sequence analysis of the VP1 gene from these microphages showed that the similarity between metagenomic clones and cultured microphage capsid sequences ranged from 47.5 to 61.2% at the nucleic-acid level and from 37.2 to 69.3% at the protein level. Interestingly, the VP1 gene of ΦCA82 has a similarly high level of sequence similarity (69.1% at the amino acid level) with the seawater metagenomic phages within the same VP1 region (data not shown). This observation is consistent with an environmental origin of modern poultry phages that have since undergone significant host-specific evolutionary divergence in agricultural settings.
A multiple alignment of major capsid proteins from diverse members shows similarity across the entire predicted coding region, with the exception of the predicted surface-exposed IN5 loop and Ins. Amino acids located within these regions are involved in forming large protrusions at the threefold icosahedral axes of symmetry in the intracellular microvirus phages. The IN5 loop, which forms a globular protrusion on the virus coat and is the most variable region in the VP1 proteins from Chlamydia and Spiroplasma phages, is potentially located at residues 198 through 295 of ΦCA82 VP1, which is the most variable portion of the protein by BLAST. The hydrophobic nature of the cavity at the distal surface of the SpV4 protrusions suggests that this region may function as the receptor-recognition site during host infection. The short variable Ins sequences of ΦCA82 are putatively located at residues 459 through 464 of the VP1 protein.