973 results for Reading frames
Abstract:
Protein tyrosine phosphatases (PTPs) comprise two superfamilies: the phosphatase I superfamily, containing a single low-molecular-weight PTP (lmwPTP) family, and the phosphatase II superfamily, including both the higher-molecular-weight PTP (hmwPTP) and the dual-specificity phosphatase (DSP) families. The phosphatase I and II superfamilies are often considered to be the result of convergent evolution. The PTP sequence and structure analyses indicate that lmwPTPs, hmwPTPs, and DSPs share similar structures, functions, and a common signature motif, although they have low sequence identities and a different order of active sites in sequence, that is, a circular permutation. The results of this work suggest that lmwPTPs and hmwPTPs/DSPs are remotely related in evolution. The earliest ancestral gene of PTPs could be from a short fragment containing about 90 to 120 nucleotides or 30 to 40 residues; however, a probable full PTP ancestral gene contained one transcript unit with two lmwPTP genes. All three PTP families may have resulted from a common ancestral gene by a series of duplications, fusions, and circular permutations. The circular permutation in PTPs is caused by a reading frame difference, which is similar to that in DNA methyltransferases. Nevertheless, the evolutionary mechanism of circular permutation in PTP genes seems to be more complicated than that in DNA methyltransferase genes. Both mechanisms in PTPs and DNA methyltransferases can be used to explain how some protein families and superfamilies came to be formed by circular permutations during molecular evolution.
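As a toy illustration of what a circular permutation means at the level of element order, the sketch below tests whether one string is a rotation of another. The function name and the single-letter motif labels are hypothetical; comparisons of real PTPs rely on structure and alignment rather than exact string matching.

```python
def is_circular_permutation(a: str, b: str) -> bool:
    """Return True if b is a rotation of a (same length, same cyclic order)."""
    # Every rotation of a appears as a substring of a + a.
    return len(a) == len(b) and b in a + a


# Hypothetical motif orders: one family reads "ABCD", the permuted one "CDAB".
print(is_circular_permutation("ABCD", "CDAB"))  # True
print(is_circular_permutation("ABCD", "ACBD"))  # False
```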
Abstract:
VIDA is a new virus database that organizes open reading frames (ORFs) from partial and complete genomic sequences from animal viruses. Currently VIDA includes all sequences from GenBank for Herpesviridae, Coronaviridae and Arteriviridae. The ORFs are organized into homologous protein families, which are identified on the basis of sequence similarity relationships. Conserved sequence regions of potential functional importance are identified and can be retrieved as sequence alignments. We use a controlled taxonomical and functional classification for all the proteins and protein families in the database. When available, protein structures that are related to the families have also been included. The database is available for online search and sequence information retrieval at http://www.biochem.ucl.ac.uk/bsm/virus_database/VIDA.html.
Abstract:
In late 1994 and early 1995, Ebola (EBO) virus dramatically reemerged in Africa, causing human disease in the Ivory Coast and Zaire. Analysis of the entire glycoprotein genes of these viruses and those of other EBO virus subtypes has shown that the virion glycoprotein (130 kDa) is encoded in two reading frames, which are linked by transcriptional editing. This editing results in the addition of an extra nontemplated adenosine within a run of seven adenosines near the middle of the coding region. The primary gene product is a smaller (50-70 kDa), nonstructural, secreted glycoprotein, which is produced in large amounts and has an unknown function. Phylogenetic analysis indicates that EBO virus subtypes are genetically diverse and that the recent Ivory Coast isolate represents a new (fourth) subtype of EBO virus. In contrast, the EBO virus isolate from the 1995 outbreak in Kikwit, Zaire, is virtually identical to the virus that caused a similar epidemic in Yambuku, Zaire, almost 20 years earlier. This genetic stability may indicate that EBO viruses have coevolved with their natural reservoirs and do not change appreciably in the wild.
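The effect of adding one nontemplated adenosine can be sketched with a toy model: inserting a single base into a run of seven adenosines moves every downstream codon into a different reading frame. The sequence below is made up for illustration and is not Ebola virus data; the function names are hypothetical.

```python
def codons(seq: str, start: int = 0) -> list[str]:
    """Split seq into complete codons beginning at 0-based offset start."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


def edit_poly_a(seq: str, run: str = "AAAAAAA") -> str:
    """Insert one extra A at the first run of seven adenosines (toy editing model)."""
    i = seq.find(run)
    if i < 0:
        raise ValueError("no editing site found")
    return seq[:i] + "A" + seq[i:]


template = "AUGGCUAAAAAAACUGGAUCC"  # invented sequence with an A7 run
edited = edit_poly_a(template)

print(codons(template))  # unedited transcript, original frame throughout
print(codons(edited))    # codons after the A run now come from a shifted frame
```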
Abstract:
Background: Approximately 40% of mammalian mRNA sequences contain AUG trinucleotides upstream of the main coding sequence, with a quarter of these AUGs demarcating open reading frames of 20 or more codons. In order to investigate whether these open reading frames may encode functional peptides, we have carried out a comparative genomic analysis of human and mouse mRNA 'untranslated regions' using sequences from the RefSeq mRNA sequence database. Results: We have identified over 200 upstream open reading frames which are strongly conserved between the human and mouse genomes. Consensus sequences associated with efficient initiation of translation are overrepresented at the AUG trinucleotides of these upstream open reading frames, while comparative analysis of their DNA and putative peptide sequences shows evidence of purifying selection. Conclusion: The occurrence of a large number of conserved upstream open reading frames, in association with features consistent with protein translation, strongly suggests evolutionary maintenance of the coding sequence and indicates probable functional expression of the peptides encoded within these upstream open reading frames.
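A minimal sketch of how upstream open reading frames of a given minimum length might be located in an mRNA is shown below. It assumes that an ORF runs from an AUG to the first in-frame stop codon and that the start of the main coding sequence is already known; the function name, the coordinate convention, and the toy sequence are assumptions, not the authors' pipeline.

```python
STOPS = {"UAA", "UAG", "UGA"}


def upstream_orfs(mrna: str, cds_start: int, min_codons: int = 20):
    """Yield (start, end) 0-based half-open spans of ORFs whose AUG lies 5' of cds_start.

    min_codons counts sense codons including the AUG but excluding the stop.
    """
    for start in range(cds_start):
        if mrna[start:start + 3] != "AUG":
            continue
        # Walk codon by codon until the first in-frame stop.
        for pos in range(start + 3, len(mrna) - 2, 3):
            if mrna[pos:pos + 3] in STOPS:
                if (pos - start) // 3 >= min_codons:
                    yield (start, pos + 3)
                break


# Invented transcript: a 3-codon uORF at index 2, main CDS starting at index 16.
toy = "GGAUGGCUGCUUAACCAUGGGGUAA"
print(list(upstream_orfs(toy, cds_start=16, min_codons=3)))
print(list(upstream_orfs(toy, cds_start=16)))  # default threshold of 20 codons
```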
Abstract:
Background: Translational errors can result in bypassing of the main viral protein reading frames and the production of alternate reading frame (ARF) or cryptic peptides. Within HIV, there are many such ARFs in both the sense and antisense directions of transcription. These ARFs have the potential to generate immunogenic peptides called cryptic epitopes (CE). Both antiretroviral drug therapy and the immune system exert a mutational pressure on HIV-1. Immune pressure exerted by ARF CD8(+) T cells on the virus has already been observed in vitro. HAART has also been described to select HIV-1 variants for drug escape mutations. Since the mutational pressure exerted on one location of the HIV-1 genome can potentially affect all three reading frames, we hypothesized that ARF responses would be affected by this drug pressure in vivo. Methodology/Principal findings: In this study we identified new ARFs derived from sense and antisense transcription of HIV-1. Many of these ARFs are detectable in circulating viral proteins. They are predominantly found in the HIV-1 env nucleotide region. We measured T cell responses to 199 HIV-1 CE encoded within 13 sense and 34 antisense HIV-1 ARFs. We observed that these ARF responses are more frequent and of greater magnitude in chronically infected individuals than in acutely infected patients, and in patients on HAART, the breadth of ARF responses increased. Conclusions/Significance: These results have implications for vaccine design and unveil the existence of potential new epitopes that could be included as vaccine targets.
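The point that a change at one genomic position can touch all three reading frames can be made concrete with a short sketch. For a given nucleotide position it lists the codon that contains that position in each forward frame. The function name and the example sequence are hypothetical, and only the sense strand is considered.

```python
def codons_covering(seq: str, pos: int) -> dict[int, str]:
    """Map each forward frame (0, 1, 2) to the codon that contains 0-based pos."""
    out = {}
    for frame in range(3):
        if pos < frame:
            continue  # this frame has not started yet at pos
        start = frame + 3 * ((pos - frame) // 3)
        codon = seq[start:start + 3]
        if len(codon) == 3:  # skip incomplete codons at the 3' end
            out[frame] = codon
    return out


# Invented sequence: a substitution at index 4 would alter one codon in every frame.
print(codons_covering("ACGUACGUAC", 4))
```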
Abstract:
The DNA of three biological variants, G1, Ic and G2, which originated from the same greenhouse isolate of rice tungro bacilliform virus (RTBV) at the International Rice Research Institute (IRRI), was cloned and sequenced. Comparison of the sequences revealed small differences in genome sizes. The variants were between 95 and 99% identical at the nucleotide and amino acid levels. Alignment of the three genome sequences with those of three published RTBV sequences (Phi-1, Phi-2 and Phi-3) revealed numerous nucleotide substitutions and some insertions and deletions. The published RTBV sequences originated from the same greenhouse isolate at IRRI 20, 11 and 9 years ago. All open reading frames (ORFs) and known functional domains were conserved across the six variants. The cysteine-rich region of ORF3 showed the greatest variation. When the six DNA sequences from IRRI were compared with that of an isolate from Malaysia (Serdang), similar changes were observed in the cysteine-rich region in addition to other nucleotide substitutions and deletions across the genome. The aligned nucleotide sequences of the IRRI variants and Serdang were used to analyse phylogenetic relationships by the bootstrapped parsimony, distance and maximum-likelihood methods. The isolates clustered in three groups: Serdang alone; Ic and G1; and Phi-1, Phi-2, Phi-3 and G2. The distribution of phylogenetically informative residues in the IRRI sequences shared with the Serdang sequence and the differing tree topologies for segments of the genome suggested that recombination, as well as substitutions and insertions or deletions, has played a role in the evolution of RTBV variants. The significance and implications of these evolutionary forces are discussed in comparison with badnaviruses and caulimoviruses.
Abstract:
Rice grassy stunt virus is a member of the genus Tenuivirus, is persistently transmitted by a brown planthopper, and has occurred in rice plants in South, Southeast, and East Asia (similar to North and South America). We determined the complete nucleotide (nt) sequences of RNAs 1 (9760 nt), 2 (4069 nt), 3 (3127 nt), 4 (2909 nt), 5 (2704 nt), and 6 (2590 nt) of a southern Philippine isolate from South Cotabato and compared them with those of a northern Philippine isolate from Laguna (Toriyama et al., 1997, 1998). The numbers of nucleotides in the terminal untranslated regions and open reading frames were identical between the two isolates except for the 5′ untranslated region of the complementary strand of RNA 4. Overall nucleotide differences between the two isolates were only 0.08% in RNA 1, 0.58% in RNA 4, and 0.26% in RNA 5, whereas they were 2.19% in RNA 2, 8.38% in RNA 3, and 3.63% in RNA 6. In the intergenic regions, the two isolates differed by 9.12% in RNA 2, 11.6% in RNA 3, and 6.86% in RNA 6 with multiple consecutive nucleotide deletion/insertions, whereas they differed by only 0.78% in RNA 4 and 0.34% in RNA 5. The nucleotide variation in the intergenic region of RNA 6 within the South Cotabato isolate was only 0.33%. These differences in accumulation of mutations among individual RNA segments indicate that there was genetic reassortment in the two geographical isolates; RNAs 1, 4, and 5 of the two isolates came from a common ancestor, whereas RNAs 2, 3, and 6 were from two different ancestors.
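Percentages such as those quoted above can be computed from a pairwise alignment by counting mismatched columns. The sketch below is one simple way to do that under the assumption that the two sequences are already aligned to equal length and that every differing column, including a gap against a base, counts as one difference. How the authors treated multi-nucleotide insertions and deletions is not stated, so this is an illustration rather than their method.

```python
def percent_difference(a: str, b: str) -> float:
    """Percentage of alignment columns at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    diffs = sum(x != y for x, y in zip(a, b))
    return 100.0 * diffs / len(a)


# Invented 10-column alignment with a single substitution.
print(percent_difference("ACGUACGUAC", "ACGUACGUAA"))  # 10.0
```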
Abstract:
This paper describes the cloning and characterization of a new member of the vascular endothelial growth factor (VEGF) gene family, which we have designated VRF for VEGF-related factor. Sequencing of cDNAs from a human fetal brain library and RT-PCR products from normal and tumor tissue cDNA pools indicates two alternatively spliced messages with open reading frames of 621 and 564 bp, respectively. The predicted proteins differ at their carboxyl ends as a result of a shift in the open reading frame. Both isoforms show strong homology to VEGF at their amino termini, but only the shorter isoform maintains homology to VEGF at its carboxyl terminus and conserves all 16 cysteine residues of VEGF165. Similarity comparisons of this isoform revealed overall protein identity of 48% and conservative substitution of 69% with VEGF189. VRF is predicted to contain a signal peptide, suggesting that it may be a secreted factor. The VRF gene maps to the D11S750 locus at chromosome band 11q13, and the protein coding region, spanning approximately 5 kb, comprises eight exons that range in size from 36 to 431 bp. Exons 6 and 7 are contiguous and the two isoforms of VRF arise through alternate splicing of exon 6. VRF appears to be ubiquitously expressed as two transcripts of 2.0 and 5.5 kb; the level of expression is similar among normal and malignant tissues.
Abstract:
Bananas are one of the world's most important food crops, providing sustenance and income for millions of people in developing countries and supporting large export industries. Viruses are considered major constraints to banana production, germplasm multiplication and exchange, and to genetic improvement of banana through traditional breeding. In Africa, the two most important virus diseases are bunchy top, caused by Banana bunchy top virus (BBTV), and banana streak disease, caused by Banana streak virus (BSV). BBTV is a serious production constraint in a number of countries within/bordering East Africa, such as Burundi, Democratic Republic of Congo, Malawi, Mozambique, Rwanda and Zambia, but is not present in Kenya, Tanzania and Uganda. Additionally, epidemics of banana streak disease are occurring in Kenya and Uganda. The rapidly growing tissue culture (TC) industry within East Africa, aiming to provide planting material to banana farmers, has stimulated discussion about the need for virus indexing to certify planting material as virus-free. Diagnostic methods for BBTV and BSV have been reported and, for BBTV, PCR-based assays are reliable and relatively straightforward. However for BSV, high levels of serological and genetic variability and the presence of endogenous virus sequences within the banana genome complicate diagnosis. Uganda has been shown to contain the greatest diversity in BSV isolates found anywhere in the world. A broad-spectrum diagnostic test for BSV detection, which can discriminate between endogenous and episomal BSV sequences, is a priority. This PhD project aimed to establish diagnostic methods for banana viruses, with a particular focus on the development of novel methods for BSV detection, and to use these diagnostic methods for the detection and characterisation of banana viruses in East Africa. A novel rolling-circle amplification (RCA) method was developed for the detection of BSV. 
Using samples of Banana streak MY virus (BSMYV) and Banana streak OL virus (BSOLV) from Australia, this method was shown to distinguish between endogenous and episomal BSV sequences in banana plants. The RCA assay was used to screen a collection of 56 banana samples from south-west Uganda for BSV. RCA detected at least five distinct BSV isolates in these samples, including BSOLV and Banana streak GF virus (BSGFV) as well as three BSV isolates (Banana streak Uganda-I, -L and -M virus) for which only partial sequences had been previously reported. These latter three BSV had only been detected using immuno-capture (IC)-PCR and thus were possible endogenous sequences. In addition to its ability to detect BSV, the RCA protocol was also demonstrated to detect other viruses within the family Caulimoviridae, including Sugar cane bacilliform virus, and Cauliflower mosaic virus. Using the novel RCA method, three distinct BSV isolates from both Kenya and Uganda were identified and characterised. The complete genome of these isolates was sequenced and annotated. All six isolates were shown to have a characteristic badnavirus genome organisation with three open reading frames (ORFs) and the large polyprotein encoded by ORF 3 was shown to contain conserved amino acid motifs for movement, aspartic protease, reverse transcriptase and ribonuclease H activities. As well, several sequences important for expression and replication of the virus genome were identified including the conserved tRNAmet primer binding site present in the intergenic region of all badnaviruses. Based on the International Committee on Taxonomy of Viruses (ICTV) guidelines for species demarcation in the genus Badnavirus, these six isolates were proposed as distinct species, and named Banana streak UA virus (BSUAV), Banana streak UI virus (BSUIV), Banana streak UL virus (BSULV), Banana streak UM virus (BSUMV), Banana streak CA virus (BSCAV) and Banana streak IM virus (BSIMV). 
Using PCR with species-specific primers designed to each isolate, a genotypically diverse collection of 12 virus-free banana cultivars were tested for the presence of endogenous sequences. For five of the BSV no amplification was observed in any cultivar tested, while for BSIMV, four positive samples were identified in cultivars with a B-genome component. During field visits to Kenya, Tanzania and Uganda, 143 samples were collected and assayed for BSV. PCR using nine sets of species-specific primers, and RCA, were compared for BSV detection. For five BSV species with no known endogenous counterpart (namely BSCAV, BSUAV, BSUIV, BSULV and BSUMV), PCR was used to detect 30 infections from the 143 samples. Using RCA, 96.4% of these samples were considered positive, with one additional sample detected using RCA which was not positive using PCR. For these five BSV, PCR and RCA were both useful for identifying infected samples, irrespective of the host cultivar genotype (Musa A- or B-genome components). For four additional BSV with known endogenous counterparts in the M. balbisiana genome (BSOLV, BSGFV, BSMYV and BSIMV), PCR was shown to detect 75 infections from the 143 samples. In 30 samples from cultivars with an A-only genome component there was 96.3% agreement between PCR positive samples and detection using RCA, again demonstrating either PCR or RCA are suitable methods for detection. However, in 45 samples from cultivars with some B-genome component, the level of agreement between PCR positive samples and RCA positive samples was 70.5%. This suggests that, in cultivars with some B-genome component, many infections were detected using PCR which were the result of amplification of endogenous sequences. In these latter cases, RCA or another method which discriminates between endogenous and episomal sequences, such as immuno-capture PCR, is needed to diagnose episomal BSV infection. 
Field visits were made to Malawi and Rwanda to collect local isolates of BBTV for validation of a PCR-based diagnostic assay. The presence of BBTV in samples of bananas with bunchy top disease was confirmed in 28 out of 39 samples from Malawi and all nine samples collected in Rwanda, using PCR and RCA. For three isolates, one from Malawi and two from Rwanda, the complete nucleotide sequences were determined and shown to have a similar genome organisation to previously published BBTV isolates. The two isolates from Rwanda had at least 98.1% nucleotide sequence identity between each of the six DNA components, while the similarity between isolates from Rwanda and Malawi was between 96.2% and 99.4% depending on the DNA component. At the amino acid level, similarities in the putative proteins encoded by DNA-R, -S, -M, -C and -N were found to range from 98.8% to 100%. In a phylogenetic analysis, the three East African isolates clustered together within the South Pacific subgroup of BBTV isolates. Nucleotide sequence comparison to isolates of BBTV from outside Africa identified India as the possible origin of East African isolates of BBTV.
Abstract:
The nucleotide sequence of the genomic RNA of barley yellow dwarf virus, PAV serotype, was determined except for the 5′-terminal base, and its genome organization deduced. The 5,677 nucleotide genome contains five large open reading frames (ORFs). The genes for the coat protein (1) and the putative viral RNA-dependent RNA polymerase were identified. The latter shows a striking degree of similarity to that of carnation mottle virus (CarMV). By comparison with corona- and retrovirus RNAs, it is proposed that a translational frameshift is involved in expression of the polymerase. An ORF encoding an Mr 49,797 protein (50K ORF) may be translated by in-frame readthrough of the coat protein stop codon. The coat protein, an overlapping 17K ORF, and a 3′ 6.7K ORF are likely to be expressed via subgenomic mRNAs. © 1988 IRL Press Limited.
Abstract:
The complete nucleotide sequence of Subterranean clover mottle virus (SCMoV) genomic RNA has been determined. The SCMoV genome is 4,258 nucleotides in length. It shares most nucleotide and amino acid sequence identity with the genome of Lucerne transient streak virus (LTSV). SCMoV RNA encodes four overlapping open reading frames and has a genome organisation similar to that of Cocksfoot mottle virus (CfMV). ORF1 and ORF4 are predicted to encode single proteins. ORF2 is predicted to encode two proteins that are derived from a -1 translational frameshift between two overlapping reading frames (ORF2a and ORF2b). A search of amino acid databases did not find a significant match for ORF1 and the function of this protein remains unclear. ORF2a contains a motif typical of chymotrypsin-like serine proteases and ORF2b has motifs characteristically present in positive-stranded RNA-dependent RNA polymerases. ORF4 is likely to be expressed from a subgenomic RNA and encodes the viral coat protein. The ORF2a/ORF2b overlapping gene expression strategy used by SCMoV and CfMV is similar to that of the poleroviruses and differs from that of other published sobemoviruses. These results suggest that the sobemoviruses could now be divided into two distinct subgroups based on those that express the RNA-dependent RNA polymerase from a single, in-frame polyprotein, and those that express it via a -1 translational frameshifting mechanism.
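A -1 translational frameshift can be pictured as the ribosome stepping back one nucleotide at a slip site and then continuing to read triplets. The toy sketch below assumes a slip position supplied by the caller and uses an invented sequence; it does not model the slippery heptamer or downstream RNA structure, and the function name is hypothetical.

```python
def codons_with_minus1_shift(seq: str, slip_at: int) -> list[str]:
    """Read frame-0 codons up to slip_at, step back one nucleotide, then continue.

    slip_at is a 0-based index on a frame-0 codon boundary (a multiple of 3, at least 3).
    """
    if slip_at % 3 != 0 or slip_at < 3:
        raise ValueError("slip_at must be a codon boundary of at least 3")
    before = [seq[i:i + 3] for i in range(0, slip_at, 3)]
    restart = slip_at - 1  # the -1 shift re-reads the last base of the previous codon
    after = [seq[i:i + 3] for i in range(restart, len(seq) - 2, 3)]
    return before + after


toy = "AUGUUUAAACGGCAU"  # frame 0 reads AUG UUU AAA CGG CAU
print(codons_with_minus1_shift(toy, slip_at=9))
```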
Abstract:
The complete nucleotide sequence of genome segment S4 of rice ragged stunt oryzavirus (RRSV, Thai isolate) was determined. The 3823 bp sequence contains two large open reading frames (ORFs). ORF1, spanning nucleotides 12 to 3776, is capable of encoding a protein of M(r) 141,380 (P4a). The P4a amino acid sequence predicted from the nucleotide sequence contains sequence motifs conserved in RNA-dependent RNA polymerases (RDRPs). When compared for evolutionary relationships with RDRPs of other reoviruses using the amino acid sequences around the conserved GDD motif, P4a was shown to be more closely related to Nilaparvata lugens reovirus and reovirus serotype 3 than to rice dwarf phytoreovirus, bovine rotavirus or bluetongue virus. ORF2, spanning nucleotides 491 to 1468, is out of frame with ORF1 and is capable of encoding a protein of M(r) 36,920 (P4b). Coupled in vitro transcription-translation from cloned ORF2 in wheat germ extract confirmed the existence of ORF2, but in vivo production and possible function of P4b are yet to be determined.
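The coordinates given for the two S4 ORFs can be checked with simple arithmetic. The sketch below assumes the quoted positions are 1-based and inclusive and that each span includes its stop codon, which the abstract does not state. The helper name is hypothetical.

```python
def orf_summary(first_nt: int, last_nt: int) -> dict:
    """Summarise an ORF given 1-based inclusive nucleotide coordinates."""
    length = last_nt - first_nt + 1
    return {
        "length_nt": length,
        "whole_codons": length % 3 == 0,
        "codons": length // 3,
        "frame": (first_nt - 1) % 3,  # frame 0 begins at nucleotide 1
    }


orf1 = orf_summary(12, 3776)   # P4a
orf2 = orf_summary(491, 1468)  # P4b
print(orf1)
print(orf2)
print("out of frame:", orf1["frame"] != orf2["frame"])
```

Under these assumptions both spans are whole numbers of codons (1255 and 326) and fall in different frames, consistent with the statement that ORF2 is out of frame with ORF1.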
Abstract:
The nucleotide sequence of DNA complementary to rice ragged stunt oryzavirus (RRSV) genome segment 8 (S8) of an isolate from Thailand was determined. RRSV S8 is 1 914 bp in size and contains a single large open reading frame (ORF) spanning nucleotides 23 to 1 810 which is capable of encoding a protein of M(r) 67 348. The N-terminal amino acid sequence of a ~43K virion polypeptide matched that inferred for an internal region of the S8 coding sequence. These data suggest that the 43K protein is encoded by S8 and is derived by a proteolytic cleavage. Predicted polypeptide sizes from this possible cleavage of S8 protein are 26K and 42K. Polyclonal antibodies raised against a maltose binding protein (MBP)-S8 fusion polypeptide (expressed in Escherichia coli) recognised four RRSV particle-associated polypeptides of M(r) 67K, 46K, 43K and 26K, and all except the 26K polypeptide were also highly immunoreactive to polyclonal antibodies raised against purified RRSV particles. Cleavage of the MBP-S8 fusion polypeptide with protease Factor X produced the expected 40K MBP and two polypeptides of apparent M(r) 46K and 26K. Antibodies to purified RRSV particles reacted strongly with the intact fusion protein and the 46K cleavage product but weakly with the 26K product. Furthermore, in vitro transcription and translation of the S8 coding region revealed a post-translational self-cleavage of the 67K polypeptide to 46K and 26K products. These data indicate that S8 encodes a structural polypeptide, the majority of which is autocatalytically cleaved to 26K and 46K proteins. The data also suggest that the 26K protein is the self-cleaving protease and that the 46K product is further processed or undergoes stable conformational changes to a ~43K major capsid protein.
Abstract:
The genomic sequence of an Australian isolate of carrot mottle umbravirus (CMoV-A) was determined from cDNA generated from dsRNA. This provides the first data on the genome organization and phylogeny of an umbravirus. The 4201-nucleotide genome contains four major open reading frames (ORFs). Analysis suggests that ORF2 encodes an RNA-dependent RNA polymerase, that ORF4 encodes a movement protein, and that the virus has no coat protein gene. The functions of ORFs 1 and 3 remain unknown. ORF2 is probably translated following ribosomal frameshifting. ORFs 3 and 4 are probably translated from a subgenomic mRNA. Sequence comparisons showed CMoV-A to be closely related to pea enation mosaic virus RNA2 (PEMV RNA2), but also to have affinities with the Bromoviridae. These findings shed light on the relationships between the luteoviruses, PEMV, and the umbraviruses and on the relationships between the carmo-like viruses and the Bromoviridae.
Abstract:
Complementary DNAs covering the entire RNA genome of soybean dwarf luteovirus (SDV) were cloned and sequenced. Computer analysis of the 5861 nucleotide sequence revealed five major open reading frames (ORFs) possessing conservation of sequence and organisation with known luteovirus sequences. Comparative analyses of the genome structure show that SDV shares sequence homology and features of gene organisation with barley yellow dwarf virus (PAV isolate) in the 5' half of the genome, yet is more closely related to potato leafroll virus in its 3' coding regions. In addition, SDV differs from other known luteoviruses in possessing an exceptionally long 3' terminal sequence with no apparent coding capacity. We conclude from these data that the SDV genome represents a third variant genome type in the luteovirus group.