943 results for Peptide secondary structure
Abstract:
Cystic fibrosis is caused by mutations in the cystic fibrosis transmembrane conductance regulator (CFTR) gene, which encodes a chloride channel present in many cells. In cardiomyocytes, we report that multiple exon 1 usage and alternative splicing produce four CFTR transcripts, with different 5'-untranslated regions, CFTRTRAD-139, CFTR-1C/-1A, CFTR-1C, and CFTR-1B. CFTR transcripts containing the novel upstream exons (exons -1C, -1B, and -1A) represent more than 90% of cardiac-expressed CFTR mRNA. Regulation of cardiac CFTR expression, in response to developmental and pathological stimuli, is exclusively due to the modulation of CFTR-1C and CFTR-1C/-1A expression. Upstream open reading frames have been identified in the 5'-untranslated regions of all CFTR transcripts that, in conjunction with adjacent stem-loop structures, modulate the efficiency of translation initiation at the AUG codon of the main CFTR coding region in CFTRTRAD-139 and CFTR-1C/-1A transcripts. Exon(-1A), only present in CFTR-1C/-1A transcripts, encodes an AUG codon that is in-frame with the main CFTR open reading frame, the efficient translation of which produces a novel CFTR protein isoform with a curtailed amino terminus. As the expression of this CFTR transcript parallels the spatial and temporal distribution of the cAMP-activated whole-cell current density in normal and diseased hearts, we suggest that CFTR-1C/-1A provides the molecular basis for the cardiac cAMP-activated chloride channel. Our findings provide further insight into the complex nature of in vivo CFTR expression, to which multiple mRNA transcripts, protein isoforms, and post-transcriptional regulatory mechanisms are now added.
Abstract:
Eukaryotic gene expression, reflected in the amount of steady-state mRNA, is regulated at the post-transcriptional level. The 5'-untranslated regions (5'-UTRs) of some transcripts contain cis-acting elements, including upstream open reading frames (uORFs), that have been identified as being fundamental in modulating translation efficiency and mRNA stability. Previously, we demonstrated that uORFs present in the 5'-UTR of cystic fibrosis transmembrane conductance regulator (CFTR) transcripts expressed in the heart were able to modulate translation efficiency of the main CFTR ORF. Here, we show that the same 5'-UTR elements are associated with the differential stability of the 5'-UTR compared to the main coding region of CFTR transcripts. Furthermore, these post-transcriptional mechanisms are important factors governing regulated CFTR expression in the heart, in response to developmental and pathophysiological stimuli. (C) 2004 Elsevier Inc. All rights reserved.
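As a rough illustration of what an upstream open reading frame is, the sketch below scans a 5'-UTR sequence for ATG start codons followed in-frame by a stop codon. It is not the authors' method: the function name `find_uorfs` and the example sequence are hypothetical, the input is assumed to be DNA written 5' to 3', and only uORFs that terminate inside the UTR are reported (uORFs overlapping the main ORF are ignored).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_uorfs(utr):
    """Return (start, end) index pairs of ATG...stop ORFs fully inside `utr`.

    `utr` is assumed to be a DNA string read 5'->3'; `end` is exclusive.
    """
    utr = utr.upper()
    orfs = []
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                orfs.append((start, pos + 3))
                break
    return orfs

# Made-up toy sequence, not a real CFTR 5'-UTR.
example_utr = "CCATGAAATAGCC"
print(find_uorfs(example_utr))  # [(2, 11)]
```

The single hit covers `ATGAAATAG`, a start codon, one sense codon, and a stop codon.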
Abstract:
The PotE protein is a putrescine-ornithine antiporter found in many gram-negative bacteria. It is a member of the APA family of transporters and has 12 predicted alpha-helical transmembrane spanning segments (TMS). While the substrate binding site has previously been mapped to a region near the surface of the cytoplasmic lipid layer, no structural feature within the periplasmic domains of PotE has been shown to be important for function. We examined the role of the only large outer loop, situated between transmembrane spanning segments 7 and 8, in putrescine uptake. Deletion of the highly conserved amino acids in the region closest to transmembrane spanning segment 7 produced a protein with little activity. Glycine-scanning mutagenesis of this region showed that Val(249) and Leu(254) were required for optimal transporter function. The V249G mutant transported putrescine at a lower maximal rate compared to wild-type (WT) but with the same substrate binding affinity. In contrast, the L254G mutant had a higher substrate affinity. A series of Val(249) mutants indicated that the hydrophobicity of this residue, which is located at or near the membrane surface, is important for PotE function. Secondary structure predictions of the large outer loop indicated the presence of a hydrophobic alpha-helix in the centre with a hydrophobic region at each end suggesting that the loop was not entirely exposed to the aqueous periplasmic space. The study shows that loop 7-8 is important for PotE function, possibly by forming a re-entrant loop in the channel of the transporter. (C) 2003 Elsevier Ltd. All rights reserved.
Abstract:
Background: Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of C-beta atoms in other residues within a sphere around the C-beta atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results: We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly with the correlation coefficient reaching 0.73. If residues are classified as being either contacted or non-contacted, the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion: The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary sequence and higher order consecutive protein structural and functional properties.
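The contact number defined in this abstract (the count of C-beta atoms of other residues inside a sphere centred on a residue's own C-beta atom) can be sketched in a few lines. This is only an illustration of the definition, not the prediction method: the function name `contact_numbers`, the 10 Å default radius, and the toy coordinates are assumptions, since the abstract does not state the sphere radius.

```python
import math

def contact_numbers(cb_coords, radius=10.0):
    """For each residue, count C-beta atoms of other residues within `radius`.

    `cb_coords` is a list of (x, y, z) C-beta positions in angstroms.
    The default radius is an assumed value, not taken from the abstract.
    """
    counts = []
    for i, a in enumerate(cb_coords):
        n = 0
        for j, b in enumerate(cb_coords):
            # Skip the residue itself; count neighbours inside the sphere.
            if i != j and math.dist(a, b) <= radius:
                n += 1
        counts.append(n)
    return counts

# Made-up coordinates: three residues close together, one far away.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (9.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(contact_numbers(coords))  # [2, 2, 2, 0]
```

The double loop is quadratic in the number of residues, which is acceptable for a single protein chain; a spatial index would be the usual choice for larger systems.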
Abstract:
To better understand the evolution of mitochondrial (mt) genomes in the Acari (mites and ticks), we sequenced the mt genome of the chigger mite, Leptotrombidium pallidum (Arthropoda: Acari: Acariformes). This genome is highly rearranged relative to that of the hypothetical ancestor of the arthropods and the other species of Acari studied. The mt genome of L. pallidum has two genes for large subunit rRNA, a pseudogene for small subunit rRNA, and four nearly identical large noncoding regions. Nineteen of the 22 tRNAs encoded by this genome apparently lack either a T-arm or a D-arm. Further, the mt genome of L. pallidum has two distantly separated sections with identical sequences but opposite orientations of transcription. This arrangement cannot be accounted for by homologous recombination or by previously known mechanisms of mt gene rearrangement. The most plausible explanation for the origin of this arrangement is illegitimate inter-mtDNA recombination, which has not been reported previously in animals. In light of the evidence from previous experiments on recombination in nuclear and mt genomes of animals, we propose a model of illegitimate inter-mtDNA recombination to account for the novel gene content and gene arrangement in the mt genome of L. pallidum.
Abstract:
In humans, a polymorphic gene encodes the drug-metabolizing enzyme NAT1 (arylamine N-acetyltransferase Type 1), which is widely expressed throughout the body. While the protein-coding region of NAT1 is contained within a single exon, examination of the human EST (expressed sequence tag) database at the NCBI revealed the presence of nine separate exons, eight of which were located in the 5' non-coding region of NAT1. Differential splicing produced at least eight unique mRNA isoforms that could be grouped according to the location of the first exon, which suggested that NAT1 expression occurs from three alternative promoters. Using RT (reverse transcriptase)-PCR, we identified one major transcript in various epithelial cells derived from different tissues. In contrast, multiple transcripts were observed in blood-derived cell lines (CEM, THP-1 and Jurkat), with a novel variant, not identified in the EST database, found in CEM cells only. The major splice variant increased gene expression 9-11-fold in a luciferase reporter assay, while the other isoforms were similar or slightly greater than the control. We examined the upstream region of the most active splice variant in a promoter-reporter assay, and isolated a 257 bp sequence that produced maximal promoter activity. This sequence lacked a TATA box, but contained a consensus Sp1 site and a CAAT box, as well as several other putative transcription-factor-binding sites. Cell-specific expression of the different NAT1 transcripts may contribute to the variation in NAT1 activity in vivo.
Abstract:
Motivation: Targeting peptides direct nascent proteins to their specific subcellular compartment. Knowledge of targeting signals enables informed drug design and reliable annotation of gene products. However, due to the low similarity of such sequences and the dynamical nature of the sorting process, the computational prediction of subcellular localization of proteins is challenging. Results: We contrast the use of feed forward models as employed by the popular TargetP/SignalP predictors with a sequence-biased recurrent network model. The models are evaluated in terms of performance at the residue level and at the sequence level, and demonstrate that recurrent networks improve the overall prediction performance. Compared to the original results reported for TargetP, an ensemble of the tested models increases the accuracy by 6 and 5% on non-plant and plant data, respectively.
Abstract:
Selection of machine learning techniques requires a certain sensitivity to the requirements of the problem. In particular, the problem can be made more tractable by deliberately using algorithms that are biased toward solutions of the requisite kind. In this paper, we argue that recurrent neural networks have a natural bias toward a problem domain of which biological sequence analysis tasks are a subset. We use experiments with synthetic data to illustrate this bias. We then demonstrate that this bias can be exploited using a data set of protein sequences containing several classes of subcellular localization targeting peptides. The results show that, compared with feed-forward networks, recurrent neural networks will generally perform better on sequence analysis tasks. Furthermore, as the patterns within the sequence become more ambiguous, the choice of specific recurrent architecture becomes more critical.
Abstract:
Recently, we identified a large number of ultraconserved (uc) sequences in noncoding regions of human, mouse, and rat genomes that appear to be essential for vertebrate and amniote ontogeny. Here, we used similar methods to identify ultraconserved genomic regions between the insect species Drosophila melanogaster and Drosophila pseudoobscura, as well as the more distantly related Anopheles gambiae. As with vertebrates, ultraconserved sequences in insects appear to occur primarily in intergenic and intronic sequences, and at intron-exon junctions. The sequences are significantly associated with genes encoding developmental regulators and transcription factors, but are less frequent and are smaller in size than in vertebrates. The longest identical, nongapped orthologous match between the three genomes was found within the homothorax (hth) gene. This sequence spans an internal exon-intron junction, with the majority located within the intron, and is predicted to form a highly stable stem-loop RNA structure. Real-time quantitative PCR analysis of different hth splice isoforms and Northern blotting showed that the conserved element is associated with a high incidence of intron retention in hth pre-mRNA, suggesting that the conserved intronic element is critically important in the post-transcriptional regulation of hth expression in Diptera.
Abstract:
Insoluble expression of heterologous proteins in Escherichia coli is a major bottleneck of many structural genomics and high-throughput protein biochemistry projects. Many of these proteins may be amenable to refolding, but their identification is hampered by a lack of high-throughput methods. We have developed a matrix-assisted refolding approach in which correctly folded proteins are distinguished from misfolded proteins by their elution from affinity resin under nondenaturing conditions. Misfolded proteins remain adhered to the resin, presumably via hydrophobic interactions. The assay can be applied to insoluble proteins on an individual basis but is particularly well suited for high-throughput applications because it is rapid, automatable and has no rigorous sample preparation requirements. The efficacy of the screen is demonstrated on small-scale expression samples for 15 proteins. Refolding is then validated by large-scale expressions using SEC and circular dichroism.
Abstract:
Alfuy virus (ALFV) is classified as a subtype of the flavivirus Murray Valley encephalitis virus (MVEV); however, despite preliminary reports of antigenic and ecological similarities with MVEV, ALFV has not been associated with human disease. Here, it was shown that ALFV is at least 10^4-fold less neuroinvasive than MVEV after peripheral inoculation of 3-week-old Swiss outbred mice, but ALFV demonstrates similar neurovirulence. In addition, it was shown that ALFV is partially attenuated in mice that are deficient in alpha/beta interferon responses, in contrast to MVEV which is uniformly lethal in these mice. To assess the antigenic relationship between these viruses, a panel of monoclonal antibodies was tested for the ability to bind to ALFV and MVEV in ELISA. Although the majority of monoclonal antibodies recognized both viruses, confirming their antigenic similarity, several discriminating antibodies were identified. Finally, the entire genome of the prototype strain of ALFV (MRM3929) was sequenced and phylogenetically analysed. Nucleotide (73%) and amino acid sequence (83%) identity between ALFV and MVEV confirmed previous reports of their close relationship. Several nucleotide and amino acid deletions and/or substitutions with putative functional significance were identified in ALFV, including the abolition of a conserved glycosylation site in the envelope protein and the deletion of the terminal dinucleotide 5'-CUOH-3' found in all other members of the genus. These findings confirm previous reports that ALFV is closely related to MVEV, but also highlight significant antigenic, genetic and phenotypic divergence from MVEV. Accordingly, the data suggest that ALFV is a distinct species within the serogroup Japanese encephalitis virus.
Abstract:
High-quality data about protein structures and their gene sequences are essential to the understanding of the relationship between protein folding and protein coding sequences. Firstly we constructed the EcoPDB database, which is a high-quality database of Escherichia coli genes and their corresponding PDB structures. Based on EcoPDB, we presented a novel approach based on information theory to investigate the correlation between cysteine synonymous codon usages and local amino acids flanking cysteines, the correlation between cysteine synonymous codon usages and synonymous codon usages of local amino acids flanking cysteines, as well as the correlation between cysteine synonymous codon usages and the disulfide bonding states of cysteines in the E. coli genome. The results indicate that the nearest neighboring residues and their synonymous codons of the C-terminus have the greatest influence on the usages of the synonymous codons of cysteines and the usage of the synonymous codons has a specific correlation with the disulfide bond formation of cysteines in proteins. The correlations may result from the regulation mechanism of protein structures at gene sequence level and reflect the biological function restriction that cysteines pair to form disulfide bonds. The results may also be helpful in identifying residues that are important for synonymous codon selection of cysteines to introduce disulfide bridges in protein engineering and molecular biology. The approach presented in this paper can also be utilized as a complementary computational method and be applicable to analyse the synonymous codon usages in other model organisms. (c) 2005 Elsevier Ltd. All rights reserved.
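The abstract describes an information-theoretic measure of correlation between cysteine codon choice and neighbouring residues but does not say which measure was used. One common choice for this kind of question is mutual information, sketched below as an assumption rather than as the authors' method. The function name `mutual_information` and the observation pairs are hypothetical; each pair is taken to be (cysteine codon, flanking amino acid).

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Mutual information in bits between the two members of each (x, y) pair."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y).
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

# Made-up observations, not data from EcoPDB.
dependent = [("TGT", "A"), ("TGC", "G")] * 2
independent = [("TGT", "A"), ("TGT", "G"), ("TGC", "A"), ("TGC", "G")]
print(mutual_information(dependent))    # 1.0
print(mutual_information(independent))  # 0.0
```

In the first toy set the codon fully determines the neighbour, giving one bit; in the second the two are unrelated, giving zero. Real counts from a finite sample would need a correction for sampling bias before values are compared.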
Abstract:
Conotoxins are small conformationally constrained peptides found in the venom of marine snails of the genus Conus. They are usually cysteine rich and frequently contain a high degree of post-translational modifications such as C-terminal amidation, hydroxylation, carboxylation, bromination, epimerisation and glycosylation. Here we review the role of NMR in determining the three-dimensional structures of conotoxins and also provide a compilation and analysis of ¹H and ¹³C chemical shifts of post-translationally modified amino acids and compare them with data from common amino acids. This analysis provides a reference source for chemical shifts of post-translationally modified amino acids. Copyright (C) 2006 John Wiley & Sons, Ltd.
Abstract:
Purple acid phosphatases are a family of binuclear metallohydrolases that have been identified in plants, animals and fungi. Only one isoform of ~35 kDa has been isolated from animals, where it is associated with bone resorption and microbial killing through its phosphatase activity, and hydroxyl radical production, respectively. Using the sensitive PSI-BLAST search method, sequences representing new purple acid phosphatase-like proteins have been identified in mammals, insects and nematodes. These new putative isoforms are closely related to the ~55 kDa purple acid phosphatase characterized from plants. Secondary structure prediction of the new human isoform further confirms its similarity to a purple acid phosphatase from the red kidney bean. A structural model for the human enzyme was constructed based on the red kidney bean purple acid phosphatase structure. This model shows that the catalytic centre observed in other purple acid phosphatases is also present in this new isoform. These observations suggest that the sequences identified in this study represent a novel subfamily of plant-like purple acid phosphatases in animals and humans. (c) 2006 Elsevier B.V. All rights reserved.
Abstract:
Hydrophobins are small (~100 aa) proteins that have an important role in the growth and development of mycelial fungi. They are surface active and, after secretion by the fungi, self-assemble into amphipathic membranes at hydrophobic/hydrophilic interfaces, reversing the hydrophobicity of the surface. In this study, molecular dynamics simulation techniques have been used to model the process by which a specific class I hydrophobin, SC3, binds to a range of hydrophobic/hydrophilic interfaces. The structure of SC3 used in this investigation was modeled based on the crystal structure of the class II hydrophobin HFBII using the assumption that the disulfide pairings of the eight conserved cysteine residues are maintained. The proposed model for SC3 in aqueous solution is compact and globular, containing primarily β-strand and coil structures. The behavior of this model of SC3 was investigated at an air/water, an oil/water, and a hydrophobic solid/water interface. It was found that SC3 preferentially binds to the interfaces via the loop region between the third and fourth cysteine residues and that binding is associated with an increase in α-helix formation, in qualitative agreement with experiment. Based on a combination of the available experimental data and the current simulation studies, we propose a possible model for SC3 self-assembly on a hydrophobic solid/water interface.