880 results for codon bias
Abstract:
Synonymous codon bias has been examined in 78 human genes (19967 codons) and measured by relative synonymous codon usage (RSCU). Relative frequencies of all kinds of dinucleotides in 2,3 or 3,4 codon positions have been calculated, and codon-anticodon bin
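As an illustration of the RSCU measure named in this abstract, the sketch below uses the usual definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function name `rscu` and the use of the standard genetic code are assumptions made for the example; the abstract does not describe the authors' implementation.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for one in-frame coding sequence.

    RSCU = observed count * (number of synonymous codons) / (family total).
    Stop codons and amino acids absent from the sequence are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group codons into synonymous families by encoded amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    result = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue
        for c in family:
            result[c] = counts[c] * len(family) / total
    return result
```

For example, `rscu("GCTGCTGCTGCC")["GCT"]` evaluates to 3.0, because GCT accounts for three of the four alanine codons in a fourfold family.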
Abstract:
728 human genes were divided into four groups according to the GC contents of their coding sequences (from GC<0.43 to GC>0.58). Examination of synonymous-codon bias in the 4 groups shows that NTG (N represents any base of T, A, C, G) is most favored and NCG
Abstract:
The small GTPases
Abstract:
The adaptor protein-2 sigma subunit (AP2σ2) is pivotal for clathrin-mediated endocytosis of plasma membrane constituents such as the calcium-sensing receptor (CaSR). Mutations of the AP2σ2 Arg15 residue result in familial hypocalciuric hypercalcaemia type 3 (FHH3), a disorder of extracellular calcium (Ca(2+)o) homeostasis. To elucidate the role of AP2σ2 in Ca(2+)o regulation, we investigated 65 FHH probands, without other FHH-associated mutations, for AP2σ2 mutations, characterized their functional consequences and investigated the genetic mechanisms leading to FHH3. AP2σ2 mutations were identified in 17 probands, comprising 5 Arg15Cys, 4 Arg15His and 8 Arg15Leu mutations. A genotype-phenotype correlation was observed with the Arg15Leu mutation leading to marked hypercalcaemia. FHH3 probands harboured additional phenotypes such as cognitive dysfunction. All three FHH3-causing AP2σ2 mutations impaired CaSR signal transduction in a dominant-negative manner. Mutational bias was observed at the AP2σ2 Arg15 residue as other predicted missense substitutions (Arg15Gly, Arg15Pro and Arg15Ser), which also caused CaSR loss-of-function, were not detected in FHH probands, and these mutations were found to reduce the numbers of CaSR-expressing cells. FHH3 probands had significantly greater serum calcium (sCa) and magnesium (sMg) concentrations with reduced urinary calcium to creatinine clearance ratios (CCCR) in comparison with FHH1 probands with CaSR mutations, and a calculated index of sCa × sMg/100 × CCCR, which was ≥ 5.0, had a diagnostic sensitivity and specificity of 83 and 86%, respectively, for FHH3. Thus, our studies demonstrate AP2σ2 mutations to result in a more severe FHH phenotype with genotype-phenotype correlations, and a dominant-negative mechanism of action with mutational bias at the Arg15 residue.
Abstract:
Different codons encoding the same amino acid are not used equally in protein-coding sequences. In bacteria, there is a bias towards codons with high translation rates. This bias is most pronounced in highly expressed proteins, but a recent study of synthetic GFP-coding sequences did not find a correlation between codon usage and GFP expression, suggesting that such correlation in natural sequences is not a simple property of translational mechanisms. Here, we investigate the effect of evolutionary forces on codon usage. The relation between codon bias and protein abundance is quantitatively analyzed based on the hypothesis that codon bias evolved to ensure the efficient usage of ribosomes, a precious commodity for fast growing cells. An explicit fitness landscape is formulated based on bacterial growth laws to relate protein abundance and ribosomal load. The model leads to a quantitative relation between codon bias and protein abundance, which accounts for a substantial part of the observed bias for E. coli. Moreover, by providing an evolutionary link, the ribosome load model resolves the apparent conflict between the observed relation of protein abundance and codon bias in natural sequences and the lack of such dependence in a synthetic gfp library. Finally, we show that the relation between codon usage and protein abundance can be used to predict protein abundance from genomic sequence data alone without adjustable parameters.
Abstract:
The informational properties of biological systems are the subject of much debate and research. I present a general argument in favor of the existence and central importance of information in organisms, followed by a case study of the genetic code (specifically, codon bias) and the translation system from the perspective of information. The codon biases of 831 Bacteria and Archaea are analyzed and modeled as points in a 64-dimensional statistical space. The major results are that (1) codon bias evolution does not follow canonical patterns, and (2) the use of coding space in organisms is a subset of the total possible coding space. These findings imply that codon bias is a unique adaptive mechanism that owes its existence to organisms' use of information in representing genes, and that there is a particularly biological character to the resulting biased coding and information use.
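One way to picture "points in a 64-dimensional space" is to represent each genome's coding sequence by its vector of 64 codon frequencies. The sketch below is only an illustration under that assumption: the names `codon_vector` and `euclidean` are hypothetical, and the Euclidean distance is a stand-in, since the abstract does not say which statistical model or metric the author used.

```python
import math
from collections import Counter
from itertools import product

# Fixed codon order so that every vector uses the same 64 axes.
CODONS = ["".join(p) for p in product("TCAG", repeat=3)]


def codon_vector(cds):
    """Map an in-frame coding sequence to 64 relative codon frequencies."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)  # ambiguous codons are ignored
    if total == 0:
        raise ValueError("no complete unambiguous codons found")
    return [counts[c] / total for c in CODONS]


def euclidean(u, v):
    """Distance between two codon-frequency vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Each vector sums to 1, so all genomes lie on a 63-dimensional simplex inside the 64-dimensional space; comparing how much of that simplex real genomes occupy is one reading of the "subset of the total possible coding space" result.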
Abstract:
The psbA gene of the chloroplast genome has a codon usage that is unusual for plant chloroplast genes. In the present study the evolutionary status of this codon usage is tested by reconstructing putative ancestral psbA sequences to determine the pattern of change in codon bias during angiosperm divergence. It is shown that the codon biases of the ancestral genes are much stronger than all extant flowering plant psbA genes. This is related to previous work that demonstrated a significant increase in synonymous substitution in psbA relative to other chloroplast genes. It is suggested, based on the two lines of evidence, that the codon bias of this gene currently is not being maintained by selection. Rather, the atypical codon bias simply may be a remnant of an ancestral codon bias that now is being degraded by the mutation bias of the chloroplast genome, in other words, that the psbA gene is not at equilibrium. A model for the evolution of selective pressure on the codon usage of plant chloroplast genes is discussed.
Abstract:
We first review what is known about patterns of codon usage bias in Drosophila and make the following points: (i) Drosophila genes are as biased or more biased than those in microorganisms. (ii) The level of bias of genes and even the particular pattern of codon bias can remain phylogenetically invariant for very long periods of evolution. (iii) However, some genes, even very tightly linked genes, can change very greatly in codon bias across species. (iv) Generally G and especially C are favored at synonymous sites in biased genes. (v) With the exception of aspartic acid, all amino acids contribute significantly and about equally to the codon usage bias of a gene. (vi) While most individual amino acids that can use G or C at synonymous sites display a preference for C, there are exceptions: valine and leucine, which prefer G. (vii) Finally, smaller genes tend to be more biased than longer genes. We then examine possible causes of these patterns and discount mutation bias on three bases: there is little evidence of regional mutation bias in Drosophila, mutation bias is likely toward A+T (the opposite of codon usage bias), and not all amino acids display the preference for the same nucleotide in the wobble position. Two lines of evidence support a selection hypothesis based on tRNA pools: highly biased genes tend to be highly and/or rapidly expressed, and the preferred codons in highly biased genes optimally bind the most abundant isoaccepting tRNAs. Finally, we examine the effect of bias on DNA evolution and confirm that genes with high codon usage bias have lower rates of synonymous substitution between species than do genes with low codon usage bias. Surprisingly, we find that genes with higher codon usage bias display higher levels of intraspecific synonymous polymorphism. This may be due to opposing effects of recombination.
Abstract:
Here we report the codon bias and the mRNA secondary structural features of the hemagglutinin (HA) cleavage site basic amino acid regions of avian influenza virus H5N1 subtypes. We have developed a dynamic extended folding strategy to predict RNA secondar
Abstract:
The complete mitochondrial genome sequence of the Chinese hook snout carp, Opsariichthys bidens, was newly determined using the long and accurate polymerase chain reaction method. The 16,611-nucleotide mitogenome contains 13 protein-coding genes, two rRNA genes (12S, 16S), 22 tRNA genes, and a noncoding control region. We use these data and homologous sequence data from multiple other ostariophysan fishes in a phylogenetic evaluation to test hypotheses pertaining to the codon usage pattern of O. bidens mitochondrial protein genes as well as to re-examine the ostariophysan phylogeny. The mitochondrial genome of O. bidens reveals an alternative pattern of vertebrate mitochondrial evolution. For the mitochondrial protein genes of O. bidens, the most frequently used codon generally ends with either A or C, with C preferred over A for most fourfold degenerate codon families; the relative synonymous codon usage of G-ending codons is greatly elevated in all categories. The codon usage pattern of O. bidens mitochondrial protein genes is remarkably different from the general pattern found previously in the relatively closely related zebrafish and most other vertebrate mitochondria. Nucleotide bias at third codon positions is the main cause of codon bias in the mitochondrial protein genes of O. bidens, as it is biased particularly in favor of C over A. Bayesian analysis of 12 concatenated mitochondrial protein sequences for O. bidens and 46 other teleostean taxa supports the monophyly of Cypriniformes and Otophysi and results in a robust estimate of the otophysan phylogeny. (C) 2007 Published by Elsevier B.V.
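The third-position nucleotide bias described in this abstract can be measured directly from an in-frame coding sequence. The following is a minimal sketch; the function name `third_position_composition` is hypothetical and is not taken from the paper. Because it only counts bases at every third position, it does not depend on which genetic code (standard or vertebrate mitochondrial) is in use.

```python
from collections import Counter


def third_position_composition(cds):
    """Return the fraction of A, C, G and T at third codon positions."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    # Indices 2, 5, 8, ... are the third positions of successive codons.
    thirds = Counter(cds[i] for i in range(2, usable, 3) if cds[i] in "ACGT")
    total = sum(thirds.values())
    if total == 0:
        raise ValueError("no unambiguous third-position bases found")
    return {base: thirds[base] / total for base in "ACGT"}
```

For the toy sequence `"GCCGCAGCCGCC"` the function reports C at 0.75 and A at 0.25, which is the kind of C-over-A preference the abstract describes for fourfold degenerate families.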
Abstract:
The nucleotide sequence of a genomic DNA fragment previously thought, by genetic criteria, to contain the dihydrofolate reductase gene (DFR1) of Saccharomyces cerevisiae was determined. This DNA fragment of 1784 basepairs contains a large open reading frame from position 800 to 1432, which encodes an enzyme with a predicted molecular weight of 24,229.8 Daltons. Analysis of the amino acid sequence of this protein revealed that the yeast polypeptide contained 211 amino acids, compared to the 186 residues commonly found in the polypeptides of other eukaryotes. The difference in size of the gene product can be attributed mainly to an insert in the yeast gene. Within this region, several consensus sequences required for processing of yeast nuclear and class II mitochondrial introns were identified, but appear not to be sufficient for RNA splicing. The primary structure of the yeast DHFR protein has considerable sequence homology with analogous polypeptides from other organisms, especially in the consensus residues involved in cofactor and/or inhibitor binding. Analysis of the nucleotide sequence also revealed the presence of a number of canonical sequences identified in yeast as having some function in the regulation of gene expression. These include UAS elements (TGACTC) required for the amino acid general control response, and "TATA" boxes, as well as several consensus sequences thought to be required for transcriptional termination and polyadenylation. Analysis of the codon usage of the yeast DFR1 coding region revealed a codon bias index of 0.0083. This value, very close to zero, suggests that the gene is expressed at a relatively low level under normal physiological conditions. The information concerning the organization of DFR1 was used to construct a variety of fusions of its 5' regulatory region with the coding region of the lacZ gene of E. coli.
Some of these fused genes encoded a fusion product that was expressed in E. coli and/or in yeast under the control of the 5' regulatory elements of DFR1. Further studies with these fusion constructions revealed that the beta-galactosidase activity encoded on multicopy plasmids was stimulated transiently by prior exposure of yeast host cells to UV light. This suggests that the yeast DFR1 gene is induced by UV light and may imply a novel function of DHFR protein in the cellular responses to DNA damage. Another novel feature of yeast DHFR was revealed during preliminary studies of a diploid strain containing a heterozygous DFR1 null allele. The strain was constructed by insertion of a URA3 gene within the coding region of DFR1. Sporulation of this diploid revealed that meiotic products segregated 2:0 for uracil prototrophy when spore clones were germinated on medium supplemented with 5-formyltetrahydrofolate (folinic acid). This finding suggests that, in addition to its catalytic activity, the DFR1 gene product may play some role in the anabolism of folinic acid. Alternatively, this result may indicate that Ura+ haploid segregants were inviable and suggest that the enzyme has an essential cellular function in this species.
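For readers unfamiliar with the codon bias index quoted in this abstract (0.0083), a commonly used definition is given below. The abstract does not state which formula the author applied, so this should be read as the likely definition rather than a confirmed one.

```latex
\[
\mathrm{CBI} = \frac{N_{\mathrm{pfr}} - N_{\mathrm{rand}}}{N_{\mathrm{tot}} - N_{\mathrm{rand}}}
\]
% N_pfr  : number of preferred codons observed in the gene
% N_rand : number of preferred codons expected if synonymous codons were used at random
% N_tot  : total number of codons for the amino acids that have preferred codons
```

Under this definition a gene that uses only preferred codons scores 1, and a gene whose synonymous codon choice is effectively random scores about 0, which matches the abstract's reading of a near-zero value as a sign of low expression.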
Abstract:
Porcine S100A12 is a member of the S100 proteins, a family of small acidic calcium-binding proteins characterized by the presence of two EF-hand motifs. These proteins are involved in many cellular events such as the regulation of protein phosphorylation, enzymatic activity, protein-protein interaction, Ca(2+) homeostasis, inflammatory processes and intermediate filament polymerization. In addition, members of this family bind Zn(2+) or Ca(2+) with cooperative effect on binding. In this study, the gene sequence encoding porcine S100A12 was obtained by the synthetic gene approach using E. coli codon bias. Additionally, we report a thermodynamic study of the recombinant S100A12 using circular dichroism, fluorescence and isothermal titration calorimetry. The results of urea and temperature induced unfolding and refolding processes indicated a reversible two-state process. Also, the ANS fluorescence studies showed that in the presence of divalent ions the protein exposes hydrophobic sites which could facilitate the interaction with other proteins and trigger the physiological responses. (c) 2008 Elsevier B.V. All rights reserved.
Abstract:
A total of 3,631 expressed sequence tags (ESTs) were established from two size-selected cDNA libraries made from the tetrasporophytic phase of the agarophytic red alga Gracilaria tenuistipitata. The average sizes of the inserts in the two libraries were 1,600 bp and 600 bp, with an average length of the edited sequences of 850 bp. Clustering gave 2,387 assembled sequences with a redundancy of 53%. Of the ESTs, 65% had significant matches to sequences deposited in public databases, 11% to proteins without known function, and 35% were novel. The most represented ESTs were a Na/K-transporting ATPase, a hedgehog-like protein, a glycine dehydrogenase and an actin. Most of the identified genes were involved in primary metabolism and housekeeping. The largest functional group was thus genes involved in metabolism with 14% of the ESTs; other large functional categories included energy, transcription, and protein synthesis and destination. The codon usage was examined using a subset of the data, and the codon bias was found to be limited with all codon combinations used.
Abstract:
From the late 1980s, the automation of sequencing techniques and the spread of computers gave rise to a flourishing number of new molecular structures and sequences and to a proliferation of new databases in which to store them. Three computational approaches are presented here that can analyse the massive amount of publicly available data in order to answer important biological questions. The first strategy studies the incorrect assignment of the first AUG codon in a messenger RNA (mRNA), due to the incomplete determination of its 5' end sequence. An extension of the mRNA 5' coding region was identified in 477 human loci, out of all known human mRNAs analysed, using an automated expressed sequence tag (EST)-based approach. Proof-of-concept confirmation was obtained by in vitro cloning and sequencing for GNB2L1, QARS and TDP2, and the consequences for functional studies are discussed. The second approach analyses the codon bias, the phenomenon in which distinct synonymous codons are used with different frequencies, and, following integration with a gene expression profile, estimates the total number of codons present across all the expressed mRNAs (named here "codonome value") in a given biological condition. Systematic analyses across different pathological and normal human tissues and multiple species show a surprisingly tight correlation between the codon bias and the codonome bias. The third approach is useful for studying the expression of genes implicated in human autism spectrum disorder (ASD). ASD-implicated genes sharing microRNA response elements (MREs) for the same microRNA are co-expressed in brain samples from healthy and ASD-affected individuals. The different expression of a recently identified long non-coding RNA which has four MREs for the same microRNA could disrupt the equilibrium in this network, but further analyses and experiments are needed.
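The "codonome value" in this abstract combines per-gene codon counts with an expression profile. A minimal sketch of one plausible reading follows, assuming each gene's codon counts are simply multiplied by that gene's expression level and summed; the abstract does not give the exact weighting, and the names `codon_counts` and `codonome` are hypothetical.

```python
from collections import Counter


def codon_counts(cds):
    """Count complete codons in one in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def codonome(cds_by_gene, expression):
    """Expression-weighted codon totals across all genes (assumed weighting).

    cds_by_gene: {gene name: coding sequence}
    expression:  {gene name: expression level}; missing genes count as 0.
    """
    totals = Counter()
    for gene, cds in cds_by_gene.items():
        level = expression.get(gene, 0.0)
        for codon, n in codon_counts(cds).items():
            totals[codon] += n * level
    return totals
```

With `codonome({"g1": "ATGAAA", "g2": "ATGGGG"}, {"g1": 2.0, "g2": 3.0})`, ATG receives a weight of 5.0 because it occurs once in each gene, while AAA and GGG receive 2.0 and 3.0 respectively.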