999 results for Codon Usage Bias
Abstract:
Different codons encoding the same amino acid are not used equally in protein-coding sequences. In bacteria, there is a bias towards codons with high translation rates. This bias is most pronounced in highly expressed proteins, but a recent study of synthetic GFP-coding sequences did not find a correlation between codon usage and GFP expression, suggesting that such correlation in natural sequences is not a simple property of translational mechanisms. Here, we investigate the effect of evolutionary forces on codon usage. The relation between codon bias and protein abundance is quantitatively analyzed based on the hypothesis that codon bias evolved to ensure the efficient usage of ribosomes, a precious commodity for fast growing cells. An explicit fitness landscape is formulated based on bacterial growth laws to relate protein abundance and ribosomal load. The model leads to a quantitative relation between codon bias and protein abundance, which accounts for a substantial part of the observed bias for E. coli. Moreover, by providing an evolutionary link, the ribosome load model resolves the apparent conflict between the observed relation of protein abundance and codon bias in natural sequences and the lack of such dependence in a synthetic gfp library. Finally, we show that the relation between codon usage and protein abundance can be used to predict protein abundance from genomic sequence data alone without adjustable parameters.
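To make "not used equally" concrete, the following minimal Python sketch counts codons in a coding sequence and computes relative synonymous codon usage (RSCU): the observed count of a codon divided by the mean count over all synonymous codons for the same amino acid. It only illustrates how codon bias can be measured and is not the ribosome load model described in the abstract. The function names and the toy sequence are hypothetical, and the standard genetic code is assumed.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: nuclear/bacterial code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codons(cds):
    """Count in-frame codons; trailing partial codons and ambiguous codons are ignored."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def rscu(cds):
    """Return RSCU per codon for every amino acid that occurs in the sequence."""
    counts = count_codons(cds)
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids absent from the sequence
        mean = total / len(codons)
        for c in codons:
            values[c] = counts[c] / mean
    return values


# Hypothetical toy sequence: four leucine codons, three of them CTG.
toy = "CTGCTGCTGCTA"
usage = rscu(toy)
print(usage["CTG"], usage["CTA"])
```

An RSCU of 1 means a codon is used as often as expected under uniform synonymous usage; values above 1 indicate over-representation.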
Abstract:
Polynucleotide immunisation with the E7 gene of human papillomavirus (HPV) type 16 induces only moderate levels of immune response, which may in part be due to limitation in E7 gene expression influenced by biased HPV codon usage. Here we compare the expression and immunogenicity of polynucleotide expression plasmids encoding wild-type (pWE7) or synthetic codon-optimised (pHE7) HPV16 E7 DNA. Cos-1 cells transfected with pHE7 expressed higher levels of E7 protein than similar cells transfected with pWE7. C57BL/6 mice and F1 (C57X FVB) E7 transgenic mice immunised intradermally with E7 plasmids produced high levels of anti-E7 antibody. pHE7 induced a significantly stronger E7-specific cytotoxic T-lymphocyte response than pWE7 and 100% tumour protection in C57BL/6 mice, but neither vaccine induced CTL in partially E7 tolerant K14E7 transgenic mice. The data indicate that immunogenicity of an E7 polynucleotide vaccine can be enhanced by codon modification. However, this may be insufficient for priming E7 responses in animals with split tolerance to E7 as a consequence of expression of E7 in somatic cells. (C) 2002 Elsevier Science (USA).
Biased gene conversion and GC-content evolution in the coding sequences of reptiles and vertebrates.
Abstract:
Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrate species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamics of coding sequence GC-content are largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins.
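The GC3 statistic used in this abstract is the fraction of G or C nucleotides at third-codon positions. A minimal Python sketch of that calculation follows. It assumes an in-frame coding sequence starting at the first codon position; the function name and the example sequence are hypothetical, and this is not the authors' pipeline.

```python
def gc3(cds):
    """Fraction of G/C among third-codon positions of an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3           # drop a trailing partial codon, if any
    third = cds[2:usable:3]                    # every third nucleotide, starting at index 2
    informative = [b for b in third if b in "ACGT"]  # ignore ambiguous bases such as N
    if not informative:
        raise ValueError("no unambiguous third-codon positions found")
    return sum(b in "GC" for b in informative) / len(informative)


# Hypothetical example: codons ATG, GCC, AAA, TTC have third positions G, C, A, C.
print(gc3("ATGGCCAAATTC"))  # 0.75
```

Computing this value per gene and looking at its distribution across a genome gives the kind of GC3 heterogeneity the abstract compares between lineages.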
Abstract:
Cervical cancer results from cervical infection by human papillomaviruses (HPVs), especially HPV16. An effective vaccine against these HPVs is expected to have a dramatic impact on the incidence of this cancer and its precursor lesions. The leading candidate, a subunit prophylactic HPV virus-like particle (VLP) vaccine, can protect women from HPV infection. An alternative improved vaccine that avoids parenteral injection, that is efficient with a single dose, and that induces mucosal immunity might greatly facilitate vaccine implementation in different settings. In this study, we have constructed a new generation of recombinant Salmonella organisms that assemble HPV16 VLPs and induce high titers of neutralizing antibodies in mice after a single nasal or oral immunization with live bacteria. This was achieved through the expression of an HPV16 L1 capsid gene whose codon usage was optimized to fit with the most frequently used codons in Salmonella. Interestingly, the high immunogenicity of the new recombinant bacteria did not correlate with an increased expression of L1 VLPs but with a greater stability of the L1-expressing plasmid in vitro and in vivo in the absence of antibiotic selection. Anti-HPV16 humoral and neutralizing responses were also observed with different Salmonella enterica serovar Typhimurium strains whose attenuating deletions have already been shown to be safe after oral vaccination of humans. Thus, our findings are a promising improvement toward a vaccine strain that could be tested in human volunteers.
Abstract:
The nucleotide sequence of a genomic DNA fragment thought previously to contain the dihydrofolate reductase gene (DFR1) of Saccharomyces cerevisiae by genetic criteria was determined. This DNA fragment of 1784 base pairs contains a large open reading frame from position 800 to 1432, which encodes an enzyme with a predicted molecular weight of 24,229.8 Daltons. Analysis of the amino acid sequence of this protein revealed that the yeast polypeptide contained 211 amino acids, compared to the 186 residues commonly found in the polypeptides of other eukaryotes. The difference in size of the gene product can be attributed mainly to an insert in the yeast gene. Within this region, several consensus sequences required for processing of yeast nuclear and class II mitochondrial introns were identified, but appear not sufficient for RNA splicing. The primary structure of the yeast DHFR protein has considerable sequence homology with analogous polypeptides from other organisms, especially in the consensus residues involved in cofactor and/or inhibitor binding. Analysis of the nucleotide sequence also revealed the presence of a number of canonical sequences identified in yeast as having some function in the regulation of gene expression. These include UAS elements (TGACTC) required for the amino acid general control response, and "TATA" boxes as well as several consensus sequences thought to be required for transcriptional termination and polyadenylation. Analysis of the codon usage of the yeast DFR1 coding region revealed a codon bias index of 0.0083. This value, very close to zero, suggests that the gene is expressed at a relatively low level under normal physiological conditions. The information concerning the organization of DFR1 was used to construct a variety of fusions of its 5' regulatory region with the coding region of the lacZ gene of E. coli. Some of these fused genes encoded a fusion product that was expressed in E. coli and/or in yeast under the control of the 5' regulatory elements of DFR1. Further studies with these fusion constructions revealed that the beta-galactosidase activity encoded on multicopy plasmids was stimulated transiently by prior exposure of yeast host cells to UV light. This suggests that the yeast DFR1 gene is induced by UV light and may imply a novel function of DHFR protein in the cellular responses to DNA damage. Another novel feature of yeast DHFR was revealed during preliminary studies of a diploid strain containing a heterozygous DFR1 null allele. The strain was constructed by insertion of a URA3 gene within the coding region of DFR1. Sporulation of this diploid revealed that meiotic products segregated 2:0 for uracil prototrophy when spore clones were germinated on medium supplemented with 5-formyltetrahydrofolate (folinic acid). This finding suggests that, in addition to its catalytic activity, the DFR1 gene product may play some role in the anabolism of folinic acid. Alternatively, this result may indicate that Ura+ haploid segregants were inviable and suggest that the enzyme has an essential cellular function in this species.
Abstract:
A total of 3,631 expressed sequence tags (ESTs) were established from two size-selected cDNA libraries made from the tetrasporophytic phase of the agarophytic red alga Gracilaria tenuistipitata. The average sizes of the inserts in the two libraries were 1,600 bp and 600 bp, with an average length of the edited sequences of 850 bp. Clustering gave 2,387 assembled sequences with a redundancy of 53%. Of the ESTs, 65% had significant matches to sequences deposited in public databases, 11% to proteins without known function, and 35% were novel. The most represented ESTs were a Na/K-transporting ATPase, a hedgehog-like protein, a glycine dehydrogenase and an actin. Most of the identified genes were involved in primary metabolism and housekeeping. The largest functional group was thus genes involved in metabolism with 14% of the ESTs; other large functional categories included energy, transcription, and protein synthesis and destination. The codon usage was examined using a subset of the data, and the codon bias was found to be limited with all codon combinations used.
Abstract:
On the basis of the sequence of the mitochondrial genome in the flowering plant Arabidopsis thaliana, RNA editing events were systematically investigated in the respective RNA population. A total of 456 C to U, but no U to C, conversions were identified exclusively in mRNAs, 441 in ORFs, 8 in introns, and 7 in leader and trailer sequences. No RNA editing was seen in any of the rRNAs or in several tRNAs investigated for potential mismatch corrections. RNA editing affects individual coding regions with frequencies varying between 0 and 18.9% of the codons. The predominance of RNA editing events in the first two codon positions is not related to translational decoding, because it is not correlated with codon usage. As a general effect, RNA editing increases the hydrophobicity of the coded mitochondrial proteins. Concerning the selection of RNA editing sites, little significant nucleotide preference is observed in their vicinity in comparison to unedited C residues. This sequence bias is, per se, not sufficient to specify individual C nucleotides in the total RNA population in Arabidopsis mitochondria.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
At present a complete mtDNA sequence has been reported for only two hymenopterans, the Old World honey bee, Apis mellifera and the sawfly Perga condei. Among the bee group, the tribe Meliponini (stingless bees) has some distinction due to its Pantropical distribution, great number of species and large importance as main pollinators in several ecosystems, including the Brazilian rain forest. However few molecular studies have been conducted on this group of bees and few sequence data from mitochondrial genomes have been described. In this project, we PCR amplified and sequenced 78% of the mitochondrial genome of the stingless bee Melipona bicolor (Apidae, Meliponini). The sequenced region contains all of the 13 mitochondrial protein-coding genes, 18 of 22 tRNA genes, and both rRNA genes (one of them was partially sequenced). We also report the genome organization (gene content and order), gene translation, genetic code, and other molecular features, such as base frequencies, codon usage, gene initiation and termination. We compare these characteristics of M. bicolor to those of the mitochondrial genome of A. mellifera and other insects. A highly biased A+T content is a typical characteristic of the A. mellifera mitochondrial genome and it was even more extreme in that of M. bicolor. Length and compositional differences between M. bicolor and A. mellifera genes were detected and the gene order was compared. Eleven tRNA gene translocations were observed between these two species. This latter finding was surprising, considering the taxonomic proximity of these two bee tribes. The tRNA Lys gene translocation was investigated within Meliponini and showed high conservation across the Pantropical range of the tribe.
Abstract:
We present here the sequence of the mitochondrial genome of the basidiomycete phytopathogenic hemibiotrophic fungus Moniliophthora perniciosa, causal agent of the Witches' Broom Disease in Theobroma cacao. The DNA is a circular molecule of 109103 base pairs, with 31.9% GC, and is the largest sequenced so far. This size is due essentially to the presence of numerous non-conserved hypothetical ORFs. It contains the 14 genes coding for proteins involved in the oxidative phosphorylation, the two rRNA genes, one ORF coding for a ribosomal protein (rps3), and a set of 26 tRNA genes that recognize codons for all amino acids. Seven homing endonucleases are located inside introns. Except atp8, all conserved known genes are in the same orientation. Phylogenetic analysis based on the cox genes agrees with the commonly accepted fungal taxonomy. An uncommon feature of this mitochondrial genome is the presence of a region that contains a set of four, relatively small, nested, inverted repeats enclosing two genes coding for polymerases with an invertron-type structure and three conserved hypothetical genes interpreted as the stable integration of a mitochondrial linear plasmid. The integration of this plasmid seems to be a recent evolutionary event that could have implications in fungal biology. This sequence is available under GenBank accession number AY376688. (c) 2008 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Abstract:
Complete sequences were obtained for the coding portions of the mitochondrial (mt) genomes of Schistosoma mansoni (NMRI strain, Puerto Rico; 14415 bp), S. japonicum (Anhui strain, China; 14085 bp) and S. mekongi (Khong Island, Laos; 14072 bp). Each comprises 36 genes: 12 protein-encoding genes (cox1-3, nad1-6, nad4L, atp6 and cob); two ribosomal RNAs, rrnL (large subunit rRNA or 16S) and rrnS (small subunit rRNA or 12S); as well as 22 transfer RNA (tRNA) genes. The atp8 gene is absent. A large segment (9.6 kb) of the coding region (comprising 14 tRNAs, eight complete and two incomplete protein-encoding genes) for S. malayensis (Baling, Malaysian Peninsula) was also obtained. Each genome also possesses a long non-coding region that is divided into two parts (a small and a large non-coding region, the latter not fully sequenced in any species) by one or more tRNAs. The protein-encoding genes are similar in size, composition and codon usage in all species except for cox1 in S. mansoni (609 aa) and cox2 in S. mekongi (219 aa), both of which are longer than homologues in other species. An unexpected finding in all the Schistosoma species was the presence of a leucine zipper motif in the nad4L gene. The gene order in S. mansoni is strikingly different from that seen in the S. japonicum group and other flatworms. There is a high level of identity (87-94% at both the nucleotide and amino acid levels) for all protein-encoding genes of S. mekongi and S. malayensis. The identity between genes of these two species and those of S. japonicum is less (56-83% for amino acids and 73-79% for nucleotides). The identity between the genes of S. mansoni and the Asian schistosomes is far less (33-66% for amino acids and 54-68% for nucleotides), an observation consistent with the known phylogenetic distance between S. mansoni and the other species. (C) 2001 Elsevier Science B.V. All rights reserved.
Abstract:
MOTIVATION: Lateral gene transfer is a major mechanism contributing to bacterial genome dynamics and pathovar emergence via pathogenicity island (PAI) spreading. However, since few of these genomic exchanges are experimentally reproducible, it is difficult to establish evolutionary scenarios for the successive PAI transmissions between bacterial genera. Methods initially developed at the gene and/or nucleotide level for genomics, i.e. comparisons of concatenated sequences, ortholog frequency, gene order or dinucleotide usage, were combined and applied here to homologous PAIs: we call this approach comparative PAI genometrics. RESULTS: YAPI, a Yersinia PAI, and related islands were compared to measure evolutionary relationships between related modules. Through use of our genometric approach designed for tracking codon usage adaptation and gene phylogeny, an ancient inter-genus PAI transfer was oriented for the first time by characterizing the genomic environment in which the ancestral island emerged and its subsequent transfers to other bacterial genera.
Abstract:
Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors.
Abstract:
We characterized four eEF1A genes in the alternative rhabditid nematode model organism Oscheius tipulae. This is twice the copy number of eEF1A genes in C. elegans, C. briggsae, and, probably, many other free-living and parasitic nematodes. The introns show features remarkably different from those of other metazoan eEF1A genes. Most of the introns in the eEF1A genes are specific to O. tipulae and are not shared with any of the other genes described in metazoans. Most of the introns are phase 0 (inserted between two codons), and few are inserted in protosplice sites (introns inserted between the nucleotide sequence A/CAG and G/A). Two of these phase 0 introns are conserved in sequence in two or more of the four eEF1A gene copies, and are inserted in the same position in the genes. Neither of these characteristics has been detected in any of the nematode eEF1A genes characterized to date. The coding sequences were also compared with other eEF1A cDNAs from 11 different nematodes to determine the variability of these genes within the phylum Nematoda. Parsimony and distance trees yielded similar topologies, which were similar to those created using other molecular markers. The presence of more than one copy of the eEF1A gene with nearly identical coding regions makes it difficult to define the orthologous cDNAs. As shown by our data on O. tipulae, careful and extensive examination of intron positions in the eEF1A gene across the phylum is necessary to define their potential for use as valid phylogenetic markers.
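The intron phase mentioned in this abstract follows from simple arithmetic: the phase is the number of coding nucleotides upstream of the intron, modulo 3. Phase 0 means the intron lies between two codons, phase 1 means it lies after the first nucleotide of a codon, and phase 2 after the second. A minimal Python sketch, assuming the coding lengths of the exons are known and listed in 5' to 3' order; the function name and the example lengths are hypothetical, not taken from the O. tipulae genes.

```python
from itertools import accumulate


def intron_phases(coding_exon_lengths):
    """Return the phase (0, 1 or 2) of each intron separating consecutive coding exons."""
    cumulative = list(accumulate(coding_exon_lengths))
    # An intron follows every exon except the last one.
    return [upstream % 3 for upstream in cumulative[:-1]]


# Hypothetical gene with three coding exons of 90, 61 and 29 nucleotides.
print(intron_phases([90, 61, 29]))  # [0, 1]
```

Comparing such phase lists, together with the intron positions, across gene copies is one simple way to check whether introns are shared between them.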