923 results for Secondary Structure
Abstract:
In humans, a polymorphic gene encodes the drug-metabolizing enzyme NAT1 (arylamine N-acetyltransferase Type 1), which is widely expressed throughout the body. While the protein-coding region of NAT1 is contained within a single exon, examination of the human EST (expressed sequence tag) database at the NCBI revealed the presence of nine separate exons, eight of which were located in the 5' non-coding region of NAT1. Differential splicing produced at least eight unique mRNA isoforms that could be grouped according to the location of the first exon, which suggested that NAT1 expression occurs from three alternative promoters. Using RT (reverse transcriptase)-PCR, we identified one major transcript in various epithelial cells derived from different tissues. In contrast, multiple transcripts were observed in blood-derived cell lines (CEM, THP-1 and Jurkat), with a novel variant, not identified in the EST database, found in CEM cells only. The major splice variant increased gene expression 9-11-fold in a luciferase reporter assay, while the other isoforms gave expression similar to or slightly greater than the control. We examined the upstream region of the most active splice variant in a promoter-reporter assay, and isolated a 257 bp sequence that produced maximal promoter activity. This sequence lacked a TATA box, but contained a consensus Sp1 site and a CAAT box, as well as several other putative transcription-factor-binding sites. Cell-specific expression of the different NAT1 transcripts may contribute to the variation in NAT1 activity in vivo.
Abstract:
Cyclic pentapeptides are not known to exist in alpha-helical conformations. CD and NMR spectra show that specific 20-membered cyclic pentapeptides, Ac-(cyclo-1,5) [KxxxD]-NH2 and Ac-(cyclo-2,6)R[KxxxD]-NH2, are highly alpha-helical structures in water and independent of concentration, TFE, denaturants, and proteases. These are the smallest alpha-helical peptides in water.
Abstract:
Motivation: Targeting peptides direct nascent proteins to their specific subcellular compartment. Knowledge of targeting signals enables informed drug design and reliable annotation of gene products. However, due to the low similarity of such sequences and the dynamical nature of the sorting process, the computational prediction of subcellular localization of proteins is challenging. Results: We contrast the use of feed-forward models as employed by the popular TargetP/SignalP predictors with a sequence-biased recurrent network model. The models are evaluated in terms of performance at the residue level and at the sequence level, and the results demonstrate that recurrent networks improve the overall prediction performance. Compared to the original results reported for TargetP, an ensemble of the tested models increases the accuracy by 6% and 5% on non-plant and plant data, respectively.
Abstract:
Selection of machine learning techniques requires a certain sensitivity to the requirements of the problem. In particular, the problem can be made more tractable by deliberately using algorithms that are biased toward solutions of the requisite kind. In this paper, we argue that recurrent neural networks have a natural bias toward a problem domain of which biological sequence analysis tasks are a subset. We use experiments with synthetic data to illustrate this bias. We then demonstrate that this bias can be exploited using a data set of protein sequences containing several classes of subcellular localization targeting peptides. The results show that, compared with feed-forward networks, recurrent neural networks will generally perform better on sequence analysis tasks. Furthermore, as the patterns within the sequence become more ambiguous, the choice of specific recurrent architecture becomes more critical.
Abstract:
Recently, we identified a large number of ultraconserved (uc) sequences in noncoding regions of human, mouse, and rat genomes that appear to be essential for vertebrate and amniote ontogeny. Here, we used similar methods to identify ultraconserved genomic regions between the insect species Drosophila melanogaster and Drosophila pseudoobscura, as well as the more distantly related Anopheles gambiae. As with vertebrates, ultraconserved sequences in insects appear to occur primarily in intergenic and intronic sequences, and at intron-exon junctions. The sequences are significantly associated with genes encoding developmental regulators and transcription factors, but are less frequent and are smaller in size than in vertebrates. The longest identical, nongapped orthologous match between the three genomes was found within the homothorax (hth) gene. This sequence spans an internal exon-intron junction, with the majority located within the intron, and is predicted to form a highly stable stem-loop RNA structure. Real-time quantitative PCR analysis of different hth splice isoforms and Northern blotting showed that the conserved element is associated with a high incidence of intron retention in hth pre-mRNA, suggesting that the conserved intronic element is critically important in the post-transcriptional regulation of hth expression in Diptera.
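The idea of a "longest identical, nongapped match" shared by three genomes can be illustrated with a toy sketch. The abstract does not describe the search method, so the brute-force approach, the function name `longest_shared_substring`, and the short example sequences below are all assumptions made for illustration; a genome-scale search would need alignment tools or suffix-based indexes and would also consider the reverse strand.

```python
def longest_shared_substring(seqs):
    """Return the longest exact (ungapped) substring present in every sequence.

    Brute force over substrings of the shortest sequence, longest first.
    Only suitable for short toy sequences (hypothetical helper).
    """
    shortest = min(seqs, key=len)
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            # An ungapped identical match must appear verbatim in all sequences.
            if all(candidate in s for s in seqs):
                return candidate
    return ""


# Made-up sequences standing in for three orthologous regions.
toy_regions = ["AAGCTTGACCA", "TTGCTTGACGG", "CCGCTTGACTT"]
shared = longest_shared_substring(toy_regions)
print(shared, len(shared))
```

The sketch returns the first longest match it finds; ties between equally long matches are not reported.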
Abstract:
Insoluble expression of heterologous proteins in Escherichia coli is a major bottleneck of many structural genomics and high-throughput protein biochemistry projects. Many of these proteins may be amenable to refolding, but their identification is hampered by a lack of high-throughput methods. We have developed a matrix-assisted refolding approach in which correctly folded proteins are distinguished from misfolded proteins by their elution from affinity resin under nondenaturing conditions. Misfolded proteins remain adhered to the resin, presumably via hydrophobic interactions. The assay can be applied to insoluble proteins on an individual basis but is particularly well suited for high-throughput applications because it is rapid, automatable and has no rigorous sample preparation requirements. The efficacy of the screen is demonstrated on small-scale expression samples for 15 proteins. Refolding is then validated by large-scale expressions using SEC and circular dichroism.
Abstract:
Alfuy virus (ALFV) is classified as a subtype of the flavivirus Murray Valley encephalitis virus (MVEV); however, despite preliminary reports of antigenic and ecological similarities with MVEV, ALFV has not been associated with human disease. Here, it was shown that ALFV is at least 10^4-fold less neuroinvasive than MVEV after peripheral inoculation of 3-week-old Swiss outbred mice, but ALFV demonstrates similar neurovirulence. In addition, it was shown that ALFV is partially attenuated in mice that are deficient in alpha/beta interferon responses, in contrast to MVEV which is uniformly lethal in these mice. To assess the antigenic relationship between these viruses, a panel of monoclonal antibodies was tested for the ability to bind to ALFV and MVEV in ELISA. Although the majority of monoclonal antibodies recognized both viruses, confirming their antigenic similarity, several discriminating antibodies were identified. Finally, the entire genome of the prototype strain of ALFV (MRM3929) was sequenced and phylogenetically analysed. Nucleotide (73%) and amino acid sequence (83%) identity between ALFV and MVEV confirmed previous reports of their close relationship. Several nucleotide and amino acid deletions and/or substitutions with putative functional significance were identified in ALFV, including the abolition of a conserved glycosylation site in the envelope protein and the deletion of the terminal dinucleotide 5'-CUOH-3' found in all other members of the genus. These findings confirm previous reports that ALFV is closely related to MVEV, but also highlight significant antigenic, genetic and phenotypic divergence from MVEV. Accordingly, the data suggest that ALFV is a distinct species within the serogroup Japanese encephalitis virus.
Abstract:
High-quality data about protein structures and their gene sequences are essential to the understanding of the relationship between protein folding and protein coding sequences. Firstly we constructed the EcoPDB database, which is a high-quality database of Escherichia coli genes and their corresponding PDB structures. Based on EcoPDB, we presented a novel approach based on information theory to investigate the correlation between cysteine synonymous codon usages and local amino acids flanking cysteines, the correlation between cysteine synonymous codon usages and synonymous codon usages of local amino acids flanking cysteines, as well as the correlation between cysteine synonymous codon usages and the disulfide bonding states of cysteines in the E. coli genome. The results indicate that the nearest neighboring residues and their synonymous codons of the C-terminus have the greatest influence on the usages of the synonymous codons of cysteines and the usage of the synonymous codons has a specific correlation with the disulfide bond formation of cysteines in proteins. The correlations may result from the regulation mechanism of protein structures at gene sequence level and reflect the biological function restriction that cysteines pair to form disulfide bonds. The results may also be helpful in identifying residues that are important for synonymous codon selection of cysteines to introduce disulfide bridges in protein engineering and molecular biology. The approach presented in this paper can also be utilized as a complementary computational method and be applicable to analyse the synonymous codon usages in other model organisms. (c) 2005 Elsevier Ltd. All rights reserved.
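One way to picture an information-theoretic correlation between cysteine codon choice and a flanking residue is mutual information. The abstract does not say which measure the authors used, so mutual information, the function name `mutual_information`, and the toy observations below are assumptions for illustration only.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Mutual information I(X;Y) in bits from a list of (x, y) observations.

    Here x could be the cysteine codon (TGT or TGC) and y the residue at a
    fixed flanking position (hypothetical encoding, not the authors' exact one).
    """
    n = len(pairs)
    joint = Counter(pairs)
    count_x = Counter(x for x, _ in pairs)
    count_y = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p_xy / (p_x * p_y) simplifies to c * n / (count_x * count_y).
        mi += p_xy * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy data: codon choice fully determined by the neighbouring residue.
dependent = [("TGT", "A"), ("TGT", "A"), ("TGC", "G"), ("TGC", "G")]
# Toy data: codon choice independent of the neighbouring residue.
independent = [("TGT", "A"), ("TGT", "G"), ("TGC", "A"), ("TGC", "G")]
print(mutual_information(dependent), mutual_information(independent))
```

With real data, small sample counts inflate mutual information, so a comparison against shuffled pairs would be a sensible control.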
Abstract:
A 13-residue peptide sequence from a respiratory syncytial virus fusion protein was constrained in an alpha-helical conformation by fusing two back-to-back cyclic alpha-turn mimetics. The resulting peptide, Ac-(3 -> 7; 8 -> 12)-bicyclo-FP[KDEFD][KSIRD]V-NH2, was highly alpha-helical in water by CD and NMR spectroscopy, correctly positioning crucial binding residues (F488, I491, V493) on one face of the helix and side chain-side chain linkers on a noninteracting face of the helix. This compound displayed potent activity in both a recombinant fusion assay and an RSV antiviral assay (IC50 = 36 nM) and demonstrates for the first time that back-to-back modular alpha-helix mimetics can produce functional antagonists of important protein-protein interactions.
Abstract:
Conotoxins are small conformationally constrained peptides found in the venom of marine snails of the genus Conus. They are usually cysteine rich and frequently contain a high degree of post-translational modifications such as C-terminal amidation, hydroxylation, carboxylation, bromination, epimerisation and glycosylation. Here we review the role of NMR in determining the three-dimensional structures of conotoxins and also provide a compilation and analysis of 1H and 13C chemical shifts of post-translationally modified amino acids and compare them with data from common amino acids. This analysis provides a reference source for chemical shifts of post-translationally modified amino acids. Copyright (C) 2006 John Wiley & Sons, Ltd.
Abstract:
Purple acid phosphatases are a family of binuclear metallohydrolases that have been identified in plants, animals and fungi. Only one isoform of ~35 kDa has been isolated from animals, where it is associated with bone resorption and microbial killing through its phosphatase activity, and hydroxyl radical production, respectively. Using the sensitive PSI-BLAST search method, sequences representing new purple acid phosphatase-like proteins have been identified in mammals, insects and nematodes. These new putative isoforms are closely related to the ~55 kDa purple acid phosphatase characterized from plants. Secondary structure prediction of the new human isoform further confirms its similarity to a purple acid phosphatase from the red kidney bean. A structural model for the human enzyme was constructed based on the red kidney bean purple acid phosphatase structure. This model shows that the catalytic centre observed in other purple acid phosphatases is also present in this new isoform. These observations suggest that the sequences identified in this study represent a novel subfamily of plant-like purple acid phosphatases in animals and humans. (c) 2006 Elsevier B.V. All rights reserved.
Abstract:
Hydrophobins are small (~100 aa) proteins that have an important role in the growth and development of mycelial fungi. They are surface active and, after secretion by the fungi, self-assemble into amphipathic membranes at hydrophobic/hydrophilic interfaces, reversing the hydrophobicity of the surface. In this study, molecular dynamics simulation techniques have been used to model the process by which a specific class I hydrophobin, SC3, binds to a range of hydrophobic/hydrophilic interfaces. The structure of SC3 used in this investigation was modeled based on the crystal structure of the class II hydrophobin HFBII using the assumption that the disulfide pairings of the eight conserved cysteine residues are maintained. The proposed model for SC3 in aqueous solution is compact and globular, containing primarily beta-strand and coil structures. The behavior of this model of SC3 was investigated at an air/water, an oil/water, and a hydrophobic solid/water interface. It was found that SC3 preferentially binds to the interfaces via the loop region between the third and fourth cysteine residues and that binding is associated with an increase in alpha-helix formation, in qualitative agreement with experiment. Based on a combination of the available experimental data and the current simulation studies, we propose a possible model for SC3 self-assembly on a hydrophobic solid/water interface.
Abstract:
Heterogeneous nuclear ribonucleoprotein (hnRNP) A2 is a multitasking protein involved in RNA packaging, alternative splicing of pre-mRNA, telomere maintenance, cytoplasmic RNA trafficking, and translation. It binds short segments of single-stranded nucleic acids, including the A2RE11 RNA element that is necessary and sufficient for cytoplasmic transport of a subset of mRNAs in oligodendrocytes and neurons. We have explored the structures of hnRNP A2, its RNA recognition motifs (RRMs) and Gly-rich module, and the RRM complexes with A2RE11. Circular dichroism spectroscopy showed that the secondary structure of the first 189 residues of hnRNP A2 parallels that of the tandem beta alpha beta beta alpha beta RRMs of its paralogue, hnRNP A1, previously deduced from X-ray diffraction studies. The unusual Gly-rich domain (GRD) was shown to have substantial beta-sheet and beta-turn structure. Sedimentation equilibrium and circular dichroism results were consistent with the tandem RRM region being monomeric and supported earlier evidence for the binding of two A2RE11 oligoribonucleotides to this domain, in contrast to the protein dimer formed by the complex of hnRNP A1 with the telomeric ssDNA repeat. A three-dimensional structure for the N-terminal, two-RRM-containing segment of hnRNP A2 was derived by homology modeling. This structure was used to derive a model for the complex with A2RE11 using the previously described interaction of pairs of stacked nucleotides with aromatic residues on the RRM beta-sheet platforms, conserved in other RRM-RNA complexes, together with biochemical data and molecular dynamics-based observations of inter-RRM mobility.
Abstract:
Kunjin virus is a member of the Flavivirus genus and is an Australian variant of West Nile virus. The C-terminal domain of the Kunjin virus NS3 protein displays helicase activity. The protein is thought to separate daughter and template RNA strands, assisting the initiation of replication by unwinding RNA secondary structure in the 3' nontranslated region. Expression, purification and preliminary crystallographic characterization of the NS3 helicase domain are reported. It is shown that Kunjin virus helicase may adopt a dimeric assembly in the absence of nucleic acids, oligomerization being a means to provide the helicases with multiple nucleic acid-binding capability, facilitating translocation along the RNA strands. The Kunjin virus NS3 helicase domain is an attractive model for studying the molecular mechanisms of flavivirus replication, while simultaneously providing a new basis for the rational development of anti-flaviviral compounds.
Abstract:
Background: The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. Results: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences.
Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance, raising the CC to 0.57 and lowering the RMSE to 0.79. In addition, combining the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and yielded the best prediction accuracy, with a CC of 0.60 and RMSE of 0.78, which is at least comparable to the performance of other existing methods. Conclusion: The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
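The quantities named in this abstract can be sketched in a few lines. The code below computes a raw residue-wise contact order as the sum of sequence separations to contacting residues, plus the Pearson CC and RMSE used to score predictions. The 8.0 distance cutoff, the one-point-per-residue coordinates, the lack of normalization, and all function names are assumptions for illustration; the abstract does not give the authors' exact contact definition or scaling.

```python
import math


def rwco(coords, cutoff=8.0, min_sep=1):
    """Raw residue-wise contact order for each residue.

    coords: one (x, y, z) point per residue (assumed representation).
    A pair (i, j) counts as a contact when its distance is within `cutoff`;
    each contact contributes its sequence separation |i - j|.
    """
    n = len(coords)
    values = []
    for i in range(n):
        total = 0
        for j in range(n):
            sep = abs(i - j)
            if sep >= min_sep and math.dist(coords[i], coords[j]) <= cutoff:
                total += sep
        values.append(total)
    return values


def pearson_cc(pred, obs):
    """Pearson correlation coefficient between predicted and observed values."""
    mp, mo = sum(pred) / len(pred), sum(obs) / len(obs)
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)


def rmse(pred, obs):
    """Root mean square error between predicted and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))


# Toy chain of four residues on a line; the last one is far from the rest.
toy_coords = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (20, 0, 0)]
observed = rwco(toy_coords)
print(observed)
```

In a prediction setting, `observed` would come from a solved structure and the predicted profile from a regression model; `pearson_cc` and `rmse` then compare the two profiles residue by residue.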