Novel snake papillomavirus does not cluster with other non-mammalian papillomaviruses. Part 2

Total DNA from a 25 mg tissue sample was isolated using a QIAamp DNA extraction kit (Qiagen) according to the manufacturer’s recommendations. One microliter of the extracted DNA was used for RCA, using a TempliPhi Amplification kit (General Electrics Biosciences). Slight modifications were applied to the protocol supplied by the manufacturer: 1 μl of 10 mM dNTPs was added and the reaction time was prolonged to 16 h at 30°C. Amplified DNA was cloned into the EcoRI or XhoI site of pBluescript II KS+ (Stratagene) using standard procedures.

The nucleotide sequences of the cloned DNA and of the precipitated RCA product were determined (Microsynth) on both strands by cycle sequencing using an ABI 377 sequencer (Applied Biosystems). The primary sequences were assembled using ContigExpress software (Vector NTI, InforMax).

Analyses of the cloned sequence confirmed that papillomavirus DNA had indeed been detected. The amplified genome consists of 7048 bp and has a GC content of 41%. ORFs putatively encoding E6, E7, E1, E2, E4, L2 and L1 were identified, but no E5 ORF was found. The deduced amino acid sequences of the putative proteins revealed a degenerate ATP-dependent helicase motif (GQPNTGKS) in E1, two putative metal-binding motifs in E6, one such motif in E7, and one pRb-binding domain. The deduced amino acid sequences of both structural proteins, L1 and L2, are predicted to harbour a basic tail at their C termini.
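Two of the sequence features reported above, the GC content and the helicase motif in E1, are easy to check programmatically. The following is a minimal Python sketch, not the procedure used in this study. The function names are hypothetical, the toy sequences are invented for illustration, and the pattern used is the general Walker A (P-loop) consensus G-x(4)-G-K-[ST], which the reported octapeptide GQPNTGKS fits.

```python
import re

# General Walker A (P-loop) consensus: G, any four residues, G, K, then S or T.
# This is an assumption about which pattern to test, not the study's own definition.
WALKER_A = re.compile(r"G[A-Z]{4}GK[ST]")


def gc_content(dna):
    """Return the percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    if not dna:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


def find_walker_a(protein):
    """Return (1-based start, matched peptide) for each Walker A-like motif."""
    return [(m.start() + 1, m.group()) for m in WALKER_A.finditer(protein.upper())]


if __name__ == "__main__":
    # Toy inputs; the flanking residues around GQPNTGKS are made up.
    print(gc_content("ATGC"))
    print(find_walker_a("MADGQPNTGKSLF"))
```

Applied to the full genome sequence and the translated E1 ORF, the same two functions would give the overall GC percentage and the position of the motif within E1.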

A non-coding region (NCR1) of 473 nt was located between the stop codon of the L1 ORF and the start codon of the E6 ORF. A second non-coding region (NCR2) of 178 nt was located between the stop codon of the E2 ORF and the start codon of the L2 ORF.

In addition, papillomavirus-specific DNA motifs were identified in the newly determined genome sequence. Four putative consensus sequences for E2 binding (ACCN5-7GGT) were detected; two of these were located in NCR1 (positions 6919-6931 and 32-43) and two within the predicted L1 ORF (positions 5936-5948 and 5990-6000). Within NCR1, a putative origin of DNA replication was identified, consisting of two E2-binding regions flanking a region with 54% A/T content. Two polyadenylation consensus sequences (AATAAA) were predicted, one within NCR1 (positions 6729-6734) and the other within NCR2 (positions 3816-3821).
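The motif search described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the tool used in this study: the function names are hypothetical, only the forward strand is scanned, and the genome is treated as circular so that a motif spanning the junction between the last and first nucleotide is still found. Coordinates are reported as 1-based, inclusive positions, matching the convention of the positions listed above.

```python
import re


def scan_circular(genome, motif_regex, max_len):
    """Find motif matches on the forward strand of a circular genome.

    Returns a sorted list of (start, end, matched sequence) tuples with
    1-based inclusive coordinates; end < start means the match wraps
    around the junction of the circular sequence.
    """
    genome = genome.upper()
    n = len(genome)
    # Append the first max_len - 1 bases so matches crossing the junction are seen.
    extended = genome + genome[: max_len - 1]
    # A lookahead lets overlapping matches be reported as well.
    pattern = re.compile(f"(?=({motif_regex}))")
    hits = []
    for m in pattern.finditer(extended):
        start0 = m.start(1)
        if start0 >= n:
            continue  # start lies in the appended copy, already covered
        seq = m.group(1)
        end0 = (start0 + len(seq) - 1) % n
        hits.append((start0 + 1, end0 + 1, seq))
    return sorted(hits)


def find_e2_sites(genome):
    """Scan for ACC-N(5..7)-GGT, trying each spacer length separately."""
    hits = []
    for spacer in (5, 6, 7):
        hits += scan_circular(genome, f"ACC[ACGT]{{{spacer}}}GGT", spacer + 6)
    return sorted(hits)


def find_polya_signals(genome):
    """Scan for the polyadenylation consensus AATAAA."""
    return scan_circular(genome, "AATAAA", 6)


if __name__ == "__main__":
    # Invented 28-nt toy "genome" with one E2-like site wrapping the junction.
    toy = "TTGGTCCCCCAATAAACCCCCACCTTTT"
    print(find_e2_sites(toy))
    print(find_polya_signals(toy))
```

In the toy example the E2-like site starts near the end of the sequence and finishes near its beginning, which is the same situation as the NCR1 sites reported here, where one lies at positions 6919-6931 and the other at positions 32-43 of the 7048-bp genome.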

To place the novel PV in an evolutionary context, a phylogeny based on the aligned E1-E2-L2-L1 sequences was inferred. Sequences of fifty PVs representing all presently classified genera, together with that of MsPV1, were included in these analyses. While all other sauropsid PVs clustered together, MsPV1 was located far from any of them. Interestingly, MsPV1 was found in relative proximity to TmPV1, the PV of a marine mammal, the manatee (sea cow).