927 results for secondary structure detection
Abstract:
Cystic fibrosis is caused by mutations in the cystic fibrosis transmembrane conductance regulator (CFTR) gene, which encodes a chloride channel present in many cells. In cardiomyocytes, we report that multiple exon 1 usage and alternative splicing produce four CFTR transcripts, with different 5'-untranslated regions, CFTRTRAD-139, CFTR-1C/-1A, CFTR-1C, and CFTR-1B. CFTR transcripts containing the novel upstream exons (exons -1C, -1B, and -1A) represent more than 90% of cardiac expressed CFTR mRNA. Regulation of cardiac CFTR expression, in response to developmental and pathological stimuli, is exclusively due to the modulation of CFTR-1C and CFTR-1C/-1A expression. Upstream open reading frames have been identified in the 5'-untranslated regions of all CFTR transcripts that, in conjunction with adjacent stem-loop structures, modulate the efficiency of translation initiation at the AUG codon of the main CFTR coding region in CFTRTRAD-139 and CFTR-1C/-1A transcripts. Exon(-1A), only present in CFTR-1C/-1A transcripts, encodes an AUG codon that is in-frame with the main CFTR open reading frame, the efficient translation of which produces a novel CFTR protein isoform with a curtailed amino terminus. As the expression of this CFTR transcript parallels the spatial and temporal distribution of the cAMP-activated whole-cell current density in normal and diseased hearts, we suggest that CFTR-1C/-1A provides the molecular basis for the cardiac cAMP-activated chloride channel. Our findings provide further insight into the complex nature of in vivo CFTR expression, to which multiple mRNA transcripts, protein isoforms, and post-transcriptional regulatory mechanisms are now added.
Abstract:
Eukaryotic gene expression, reflected in the amount of steady-state mRNA, is regulated at the post-transcriptional level. The 5'-untranslated regions (5'-UTRs) of some transcripts contain cis-acting elements, including upstream open reading frames (uORFs), that have been identified as being fundamental in modulating translation efficiency and mRNA stability. Previously, we demonstrated that uORFs present in the 5'-UTR of cystic fibrosis transmembrane conductance regulator (CFTR) transcripts expressed in the heart were able to modulate translation efficiency of the main CFTR ORF. Here, we show that the same 5'-UTR elements are associated with the differential stability of the 5'-UTR compared to the main coding region of CFTR transcripts. Furthermore, these post-transcriptional mechanisms are important factors governing regulated CFTR expression in the heart, in response to developmental and pathophysiological stimuli. (C) 2004 Elsevier Inc. All rights reserved.
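As an illustration of what a uORF is at the sequence level, the following minimal sketch scans a 5'-UTR for start codons followed by an in-frame stop codon inside the UTR. It is not the authors' method; the function name `find_uorfs` and the toy sequences are hypothetical, and uORFs that run past the end of the UTR into the main ORF are deliberately not reported.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_uorfs(utr):
    """Return (start, end) index pairs of ATG...stop ORFs lying wholly within utr.

    Indices are 0-based; `end` is exclusive and includes the stop codon.
    RNA input is accepted: U is converted to T before scanning.
    """
    utr = utr.upper().replace("U", "T")
    orfs = []
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "ATG":
            continue
        # Walk codon by codon, in frame with this ATG, until a stop is found.
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                orfs.append((start, pos + 3))
                break
    return orfs

# Toy sequence (hypothetical, not a CFTR sequence)
print(find_uorfs("CCAUGAAAUAGCC"))
```

Every ATG is treated as a potential start here; a real analysis would also weigh the sequence context of each start codon and nearby secondary structure, which this sketch ignores.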
Abstract:
The PotE protein is a putrescine-ornithine antiporter found in many gram-negative bacteria. It is a member of the APA family of transporters and has 12 predicted alpha-helical transmembrane spanning segments (TMS). While the substrate binding site has previously been mapped to a region near the surface of the cytoplasmic lipid layer, no structural feature within the periplasmic domains of PotE has been shown to be important for function. We examined the role of the only large outer loop, situated between transmembrane spanning segments 7 and 8, in putrescine uptake. Deletion of the highly conserved amino acids in the region closest to transmembrane spanning segment 7 produced a protein with little activity. Glycine-scanning mutagenesis of this region showed that Val(249) and Leu(254) were required for optimal transporter function. The V249G mutant transported putrescine at a lower maximal rate compared to wild-type (WT) but with the same substrate binding affinity. In contrast, the L254G mutant had a higher substrate affinity. A series of Val(249) mutants indicated that the hydrophobicity of this residue, which is located at or near the membrane surface, is important for PotE function. Secondary structure predictions of the large outer loop indicated the presence of a hydrophobic alpha-helix in the centre with a hydrophobic region at each end, suggesting that the loop was not entirely exposed to the aqueous periplasmic space. The study shows that loop 7-8 is important for PotE function, possibly by forming a re-entrant loop in the channel of the transporter. (C) 2003 Elsevier Ltd. All rights reserved.
Abstract:
Background: Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of C-beta atoms in other residues within a sphere around the C-beta atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results: We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly with the correlation coefficient reaching 0.73. If residues are classified as being either contacted or non-contacted, the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion: The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary sequence and higher order consecutive protein structural and functional properties.
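The contact number definition given in the Background can be made concrete with a short sketch that counts, for each residue, the C-beta atoms of other residues falling inside a sphere centred on its own C-beta atom. The abstract does not state the sphere radius, so the `radius` parameter and its default are assumptions, and the coordinates are toy values rather than a real structure.

```python
import math

def contact_numbers(cb_coords, radius=10.0):
    """Contact number per residue from C-beta coordinates (in angstroms).

    For each residue, count the other residues whose C-beta atom lies
    within `radius` of its own C-beta atom. The default radius is an
    assumed value, not one taken from the abstract.
    """
    counts = []
    for i, a in enumerate(cb_coords):
        n = 0
        for j, b in enumerate(cb_coords):
            if i != j and math.dist(a, b) <= radius:
                n += 1
        counts.append(n)
    return counts

# Toy coordinates (hypothetical): three residues in a row and one far away
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(contact_numbers(coords, radius=5.0))  # -> [1, 2, 1, 0]
```

Values computed this way from a solved structure are the observed contact numbers against which sequence-based predictions, such as those from support vector regression, are compared.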
Abstract:
To better understand the evolution of mitochondrial (mt) genomes in the Acari (mites and ticks), we sequenced the mt genome of the chigger mite, Leptotrombidium pallidum (Arthropoda: Acari: Acariformes). This genome is highly rearranged relative to that of the hypothetical ancestor of the arthropods and the other species of Acari studied. The mt genome of L. pallidum has two genes for large subunit rRNA, a pseudogene for small subunit rRNA, and four nearly identical large noncoding regions. Nineteen of the 22 tRNAs encoded by this genome apparently lack either a T-arm or a D-arm. Further, the mt genome of L. pallidum has two distantly separated sections with identical sequences but opposite orientations of transcription. This arrangement cannot be accounted for by homologous recombination or by previously known mechanisms of mt gene rearrangement. The most plausible explanation for the origin of this arrangement is illegitimate inter-mtDNA recombination, which has not been reported previously in animals. In light of the evidence from previous experiments on recombination in nuclear and mt genomes of animals, we propose a model of illegitimate inter-mtDNA recombination to account for the novel gene content and gene arrangement in the mt genome of L. pallidum.
Abstract:
In humans, a polymorphic gene encodes the drug-metabolizing enzyme NAT1 (arylamine N-acetyltransferase Type 1), which is widely expressed throughout the body. While the protein-coding region of NAT1 is contained within a single exon, examination of the human EST (expressed sequence tag) database at the NCBI revealed the presence of nine separate exons, eight of which were located in the 5' non-coding region of NAT1. Differential splicing produced at least eight unique mRNA isoforms that could be grouped according to the location of the first exon, which suggested that NAT1 expression occurs from three alternative promoters. Using RT (reverse transcriptase)-PCR, we identified one major transcript in various epithelial cells derived from different tissues. In contrast, multiple transcripts were observed in blood-derived cell lines (CEM, THP-1 and Jurkat), with a novel variant, not identified in the EST database, found in CEM cells only. The major splice variant increased gene expression 9-11-fold in a luciferase reporter assay, while the other isoforms were similar or slightly greater than the control. We examined the upstream region of the most active splice variant in a promoter-reporter assay, and isolated a 257 bp sequence that produced maximal promoter activity. This sequence lacked a TATA box, but contained a consensus Sp1 site and a CAAT box, as well as several other putative transcription-factor-binding sites. Cell-specific expression of the different NAT1 transcripts may contribute to the variation in NAT1 activity in vivo.
Abstract:
Cyclic pentapeptides are not known to exist in alpha-helical conformations. CD and NMR spectra show that specific 20-membered cyclic pentapeptides, Ac-(cyclo-1,5) [KxxxD]-NH2 and Ac-(cyclo-2,6)R[KxxxD]-NH2, are highly alpha-helical structures in water and independent of concentration, TFE, denaturants, and proteases. These are the smallest alpha-helical peptides in water.
Abstract:
Motivation: Targeting peptides direct nascent proteins to their specific subcellular compartment. Knowledge of targeting signals enables informed drug design and reliable annotation of gene products. However, due to the low similarity of such sequences and the dynamical nature of the sorting process, the computational prediction of subcellular localization of proteins is challenging. Results: We contrast the use of feed forward models as employed by the popular TargetP/SignalP predictors with a sequence-biased recurrent network model. The models are evaluated in terms of performance at the residue level and at the sequence level, and demonstrate that recurrent networks improve the overall prediction performance. Compared to the original results reported for TargetP, an ensemble of the tested models increases the accuracy by 6 and 5% on non-plant and plant data, respectively.
Abstract:
Selection of machine learning techniques requires a certain sensitivity to the requirements of the problem. In particular, the problem can be made more tractable by deliberately using algorithms that are biased toward solutions of the requisite kind. In this paper, we argue that recurrent neural networks have a natural bias toward a problem domain of which biological sequence analysis tasks are a subset. We use experiments with synthetic data to illustrate this bias. We then demonstrate that this bias can be exploited using a data set of protein sequences containing several classes of subcellular localization targeting peptides. The results show that, compared with feed-forward networks, recurrent neural networks will generally perform better on sequence analysis tasks. Furthermore, as the patterns within the sequence become more ambiguous, the choice of specific recurrent architecture becomes more critical.
Abstract:
Recently, we identified a large number of ultraconserved (uc) sequences in noncoding regions of human, mouse, and rat genomes that appear to be essential for vertebrate and amniote ontogeny. Here, we used similar methods to identify ultraconserved genomic regions between the insect species Drosophila melanogaster and Drosophila pseudoobscura, as well as the more distantly related Anopheles gambiae. As with vertebrates, ultraconserved sequences in insects appear to occur primarily in intergenic and intronic sequences, and at intron-exon junctions. The sequences are significantly associated with genes encoding developmental regulators and transcription factors, but are less frequent and are smaller in size than in vertebrates. The longest identical, nongapped orthologous match between the three genomes was found within the homothorax (hth) gene. This sequence spans an internal exon-intron junction, with the majority located within the intron, and is predicted to form a highly stable stem-loop RNA structure. Real-time quantitative PCR analysis of different hth splice isoforms and Northern blotting showed that the conserved element is associated with a high incidence of intron retention in hth pre-mRNA, suggesting that the conserved intronic element is critically important in the post-transcriptional regulation of hth expression in Diptera.
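The notion of a "longest identical, nongapped match" can be illustrated with a longest-common-substring search between two sequences. This is only a sketch of the idea on toy strings, not the genome-scale procedure used in the study; the function name `longest_identical_match` is hypothetical, and extending it to three genomes would mean intersecting pairwise results or using a suffix-based index.

```python
def longest_identical_match(a, b):
    """Longest contiguous substring shared by a and b (no gaps, no mismatches).

    Dynamic programming over match lengths ending at each pair of
    positions; only the previous row is kept to limit memory use.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]

# Toy sequences (hypothetical, not taken from any genome)
print(longest_identical_match("TTACGTACGGA", "GGACGTACCTT"))  # -> ACGTAC
```

The quadratic running time is acceptable for short sequences only; whole-genome comparisons rely on indexed alignment tools instead.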
Abstract:
Insoluble expression of heterologous proteins in Escherichia coli is a major bottleneck of many structural genomics and high-throughput protein biochemistry projects. Many of these proteins may be amenable to refolding, but their identification is hampered by a lack of high-throughput methods. We have developed a matrix-assisted refolding approach in which correctly folded proteins are distinguished from misfolded proteins by their elution from affinity resin under nondenaturing conditions. Misfolded proteins remain adhered to the resin, presumably via hydrophobic interactions. The assay can be applied to insoluble proteins on an individual basis but is particularly well suited for high-throughput applications because it is rapid, automatable and has no rigorous sample preparation requirements. The efficacy of the screen is demonstrated on small-scale expression samples for 15 proteins. Refolding is then validated by large-scale expressions using SEC and circular dichroism.
Abstract:
Alfuy virus (ALFV) is classified as a subtype of the flavivirus Murray Valley encephalitis virus (MVEV); however, despite preliminary reports of antigenic and ecological similarities with MVEV, ALFV has not been associated with human disease. Here, it was shown that ALFV is at least 10^4-fold less neuroinvasive than MVEV after peripheral inoculation of 3-week-old Swiss outbred mice, but ALFV demonstrates similar neurovirulence. In addition, it was shown that ALFV is partially attenuated in mice that are deficient in alpha/beta interferon responses, in contrast to MVEV which is uniformly lethal in these mice. To assess the antigenic relationship between these viruses, a panel of monoclonal antibodies was tested for the ability to bind to ALFV and MVEV in ELISA. Although the majority of monoclonal antibodies recognized both viruses, confirming their antigenic similarity, several discriminating antibodies were identified. Finally, the entire genome of the prototype strain of ALFV (MRM3929) was sequenced and phylogenetically analysed. Nucleotide (73%) and amino acid sequence (83%) identity between ALFV and MVEV confirmed previous reports of their close relationship. Several nucleotide and amino acid deletions and/or substitutions with putative functional significance were identified in ALFV, including the abolition of a conserved glycosylation site in the envelope protein and the deletion of the terminal dinucleotide 5'-CUOH-3' found in all other members of the genus. These findings confirm previous reports that ALFV is closely related to MVEV, but also highlight significant antigenic, genetic and phenotypic divergence from MVEV. Accordingly, the data suggest that ALFV is a distinct species within the serogroup Japanese encephalitis virus.
Abstract:
High-quality data about protein structures and their gene sequences are essential to the understanding of the relationship between protein folding and protein coding sequences. Firstly we constructed the EcoPDB database, which is a high-quality database of Escherichia coli genes and their corresponding PDB structures. Based on EcoPDB, we presented a novel approach based on information theory to investigate the correlation between cysteine synonymous codon usages and local amino acids flanking cysteines, the correlation between cysteine synonymous codon usages and synonymous codon usages of local amino acids flanking cysteines, as well as the correlation between cysteine synonymous codon usages and the disulfide bonding states of cysteines in the E. coli genome. The results indicate that the nearest neighboring residues and their synonymous codons of the C-terminus have the greatest influence on the usages of the synonymous codons of cysteines and the usage of the synonymous codons has a specific correlation with the disulfide bond formation of cysteines in proteins. The correlations may result from the regulation mechanism of protein structures at gene sequence level and reflect the biological function restriction that cysteines pair to form disulfide bonds. The results may also be helpful in identifying residues that are important for synonymous codon selection of cysteines to introduce disulfide bridges in protein engineering and molecular biology. The approach presented in this paper can also be utilized as a complementary computational method and be applicable to analyse the synonymous codon usages in other model organisms. (c) 2005 Elsevier Ltd. All rights reserved.
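The abstract does not say which information-theoretic measure was used. One common choice for quantifying association between two discrete variables, such as the cysteine codon (TGT or TGC) and the disulfide bonding state, is mutual information, sketched below under that assumption. The function name, the state labels, and the observations are all hypothetical.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Mutual information (in bits) between the two members of each pair.

    `pairs` is a list of (x, y) observations, e.g. (codon, bonding_state).
    Probabilities are estimated from raw counts, with no small-sample
    correction.
    """
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p_xy / (p_x * p_y), written in terms of counts
        mi += p_xy * math.log2(p_xy * n * n / (px[x] * py[y]))
    return mi

# Toy observations (hypothetical): codon choice fully determines the state
dependent = [("TGT", "bonded"), ("TGC", "free")] * 2
print(mutual_information(dependent))  # -> 1.0
```

A value near zero indicates that codon choice carries no information about the second variable. With real data, finite-sample bias should be assessed, for example by comparison against shuffled labels.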
Abstract:
A 13-residue peptide sequence from a respiratory syncytial virus fusion protein was constrained in an alpha-helical conformation by fusing two back-to-back cyclic alpha-turn mimetics. The resulting peptide, Ac-(3 -> 7; 8 -> 12)-bicyclo-FP[KDEFD][KSIRD]V-NH2, was highly alpha-helical in water by CD and NMR spectroscopy, correctly positioning crucial binding residues (F488, I491, V493) on one face of the helix and side chain-side chain linkers on a noninteracting face of the helix. This compound displayed potent activity in both a recombinant fusion assay and an RSV antiviral assay (IC50 = 36 nM) and demonstrates for the first time that back-to-back modular alpha-helix mimetics can produce functional antagonists of important protein-protein interactions.