28 results for protein sequence classification

in CentAUR: Central Archive University of Reading - UK


Relevance: 100.00%

Abstract:

The elucidation of the domain content of a given protein sequence in the absence of determined structure or significant sequence homology to known domains is an important problem in structural biology. Here we address how successfully the delineation of continuous domains can be accomplished in the absence of sequence homology using simple baseline methods, an existing prediction algorithm (Domain Guess by Size), and a newly developed method (DomSSEA). The study was undertaken with a view to measuring the usefulness of these prediction methods in terms of their application to fully automatic domain assignment. Thus, the sensitivity of each domain assignment method was measured by calculating the number of correctly assigned top-scoring predictions. We have implemented a new continuous domain identification method using the alignment of predicted secondary structures of target sequences against observed secondary structures of chains with known domain boundaries as assigned by Class Architecture Topology Homology (CATH). Taking top predictions only, the success rate of the method in correctly assigning domain number to the representative chain set is 73.3%. The top prediction for domain number and location of domain boundaries was correct for 24% of the multidomain set (±20 residues). These results have been put into context in relation to the results obtained from the other prediction methods assessed.
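The ±20 residue criterion mentioned in the abstract above can be illustrated with a minimal sketch. The function name `boundaries_match`, its signature, and the pairing of boundaries in sorted order are assumptions made for illustration; the abstract does not describe how the authors implemented the comparison.

```python
from typing import Sequence


def boundaries_match(
    predicted: Sequence[int],
    observed: Sequence[int],
    tolerance: int = 20,
) -> bool:
    """Hypothetical check: same number of domain boundaries, and each
    predicted boundary lies within `tolerance` residues of its observed
    counterpart (boundaries are paired in sorted order)."""
    # For continuous domains, the domain count is the boundary count plus
    # one, so equal boundary counts mean equal domain numbers.
    if len(predicted) != len(observed):
        return False
    pairs = zip(sorted(predicted), sorted(observed))
    return all(abs(p - o) <= tolerance for p, o in pairs)
```

Under these assumptions, a two-domain chain with an observed boundary at residue 130 would count as correctly predicted for any single predicted boundary between residues 110 and 150.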

Relevance: 90.00%

Abstract:

Background: The amino terminal half of the cellular prion protein PrPc is implicated in both the binding of copper ions and the conformational changes that lead to disease but has no defined structure. However, as some structure is likely to exist we have investigated the use of an established protein refolding technology, fusion to green fluorescence protein (GFP), as a method to examine the refolding of the amino terminal domain of mouse prion protein. Results: Fusion proteins of PrPc and GFP were expressed at high level in E. coli and could be purified to near homogeneity as insoluble inclusion bodies. Following denaturation, proteins were diluted into a refolding buffer whereupon GFP fluorescence recovered with time. Using several truncations of PrPc the rate of refolding was shown to depend on the prion sequence expressed. In a variation of the format, direct observation in E. coli, mutations introduced randomly in the PrPc protein sequence that affected folding could be selected directly by recovery of GFP fluorescence. Conclusion: Use of GFP as a measure of refolding of PrPc fusion proteins in vitro and in vivo proved informative. Refolding in vitro suggested a local structure within the amino terminal domain while direct selection via fluorescence showed that as little as one amino acid change could significantly alter folding. These assay formats, not previously used to study PrP folding, may be generally useful for investigating PrPc structure and PrPc-ligand interaction.

Relevance: 90.00%

Abstract:

We show that most isolates of influenza A induce filamentous changes in infected cells, in contrast to the A/WSN/33 and A/PR8/34 strains, which have undergone extensive laboratory passage and are mouse-adapted. Using reverse genetics, we created recombinant viruses in the naturally filamentous genetic background of A/Victoria/3/75 and established that this property is regulated by the M1 protein sequence, but that the phenotype is complex and several residues are involved. The filamentous phenotype was lost when the amino acid at position 41 was switched from A to V; at the same time, this recombinant virus also became insensitive to the antibody 14C2. On the other hand, the filamentous phenotype could be fully transferred to a virus containing RNA segment 7 of the A/WSN/33 virus by a combination of three mutations in both the amino and carboxy regions of the M1 protein. This observation suggests that an interaction among these regions of M1 may occur during assembly. (C) 2004 Elsevier Inc. All rights reserved.

Relevance: 90.00%

Abstract:

Dynamically disordered regions appear to be relatively abundant in eukaryotic proteomes. The DISOPRED server allows users to submit a protein sequence, and returns a probability estimate of each residue in the sequence being disordered. The results are sent in both plain text and graphical formats, and the server can also supply predictions of secondary structure to provide further structural information.

Relevance: 90.00%

Abstract:

The PSIPRED protein structure prediction server allows users to submit a protein sequence, perform a prediction of their choice and receive the results of the prediction both textually via e-mail and graphically via the web. The user may select one of three prediction methods to apply to their sequence: PSIPRED, a highly accurate secondary structure prediction method; MEMSAT 2, a new version of a widely used transmembrane topology prediction method; or GenTHREADER, a sequence profile based fold recognition method.

Relevance: 80.00%

Abstract:

Background: Eicosanoids are biologically active, oxygenated metabolites of three C20 polyunsaturated fatty acids. They act as signalling molecules within the autocrine or paracrine system in both vertebrates and invertebrates mainly functioning as important mediators in reproduction, the immune system and ion transport. The biosynthesis of eicosanoids has been intensively studied in mammals and it is known that they are synthesised from the fatty acid, arachidonic acid, through either the cyclooxygenase (COX) pathway; the lipoxygenase (LOX) pathway; or the cytochrome P450 epoxygenase pathway. However, little is still known about the synthesis and structure of the pathway in invertebrates. Results: Here, we show transcriptomic evidence from Daphnia magna (Crustacea: Branchiopoda) together with a bioinformatic analysis of the D. pulex genome providing insight on the role of eicosanoids in these crustaceans as well as outlining a putative pathway of eicosanoid biosynthesis. Daphnia appear only to have one copy of the gene encoding the key enzyme COX, and phylogenetic analysis reveals that the predicted protein sequence of Daphnia COX clusters with other invertebrates. There is no current evidence of an epoxygenase pathway in Daphnia; however, LOX products are most certainly synthesised in daphnids. Conclusion: We have outlined the structure of eicosanoid biosynthesis in Daphnia, a key genus in freshwater ecosystems. Improved knowledge of the function and synthesis of eicosanoids in Daphnia and other invertebrates could have important implications for several areas within ecology. This provisional overview of daphnid eicosanoid biosynthesis provides a guide on where to focus future research activities in this area.

Relevance: 30.00%

Abstract:

The recently described cupin superfamily of proteins includes the germin and germin-like proteins, of which the cereal oxalate oxidase is the best characterized. This superfamily also includes seed storage proteins, in addition to several microbial enzymes and proteins with unknown function. All these proteins are characterized by the conservation of two central motifs, usually containing two or three histidine residues presumed to be involved with metal binding in the catalytic active site. The present study on the coding regions of Synechocystis PCC6803 identifies a previously unknown group of 12 related cupins, each containing the characteristic two-motif signature. This group comprises 11 single-domain proteins, ranging in length from 104 to 289 residues, and includes two phosphomannose isomerases and two epimerases involved in cell wall synthesis, a member of the pirin group of nuclear proteins, a possible transcriptional regulator, and a close relative of a cytochrome c551 from Rhodococcus. Additionally, there is a duplicated, two-domain protein that has close similarity to an oxalate decarboxylase from the fungus Collybia velutipes and that is a putative progenitor of the storage proteins of land plants.

Relevance: 30.00%

Abstract:

Genetic polymorphisms in deoxyribonucleic acid coding regions may have a phenotypic effect on the carrier, e.g. by influencing susceptibility to disease. Detection of deleterious mutations via association studies is hampered by the large number of candidate sites; therefore methods are needed to narrow down the search to the most promising sites. For this, a possible approach is to use structural and sequence-based information of the encoded protein to predict whether a mutation at a particular site is likely to disrupt the functionality of the protein itself. We propose a hierarchical Bayesian multivariate adaptive regression spline (BMARS) model for supervised learning in this context and assess its predictive performance by using data from mutagenesis experiments on lac repressor and lysozyme proteins. In these experiments, about 12 amino-acid substitutions were performed at each native amino-acid position and the effect on protein functionality was assessed. The training data thus consist of repeated observations at each position, which the hierarchical framework is needed to account for. The model is trained on the lac repressor data and tested on the lysozyme mutations and vice versa. In particular, we show that the hierarchical BMARS model, by allowing for the clustered nature of the data, yields lower out-of-sample misclassification rates compared with both a BMARS and a frequentist MARS model, a support vector machine classifier and an optimally pruned classification tree.

Relevance: 30.00%

Abstract:

Motivation: Intrinsic protein disorder is functionally implicated in numerous biological roles and is, therefore, ubiquitous in proteins from all three kingdoms of life. Determining the disordered regions in proteins presents a challenge for experimental methods and so recently there has been much focus on the development of improved predictive methods. In this article, a novel technique for disorder prediction, called DISOclust, is described, which is based on the analysis of multiple protein fold recognition models. The DISOclust method is rigorously benchmarked against the top five methods from the CASP7 experiment. In addition, the optimal consensus of the tested methods is determined and the added value from each method is quantified. Results: The DISOclust method is shown to add the most value to a simple consensus of methods, even in the absence of target sequence homology to known structures. A simple consensus of methods that includes DISOclust can significantly outperform all of the previous individual methods tested.
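One way to picture a "simple consensus of methods" is an unweighted per-residue average of the disorder probabilities returned by each predictor. The sketch below assumes exactly that; the function name `consensus_disorder`, the unweighted mean, and the 0.5 threshold are illustrative assumptions and are not taken from the abstract, which does not specify how the consensus was formed.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def consensus_disorder(
    predictions: Sequence[Sequence[float]],
    threshold: float = 0.5,
) -> Tuple[List[float], List[bool]]:
    """Hypothetical consensus: average per-residue disorder probabilities
    across methods, then flag residues at or above `threshold`."""
    # Every method must score the same sequence, residue by residue.
    lengths = {len(scores) for scores in predictions}
    if len(lengths) != 1:
        raise ValueError("expected one or more score lists of equal length")
    # zip(*predictions) yields one tuple of method scores per residue.
    averaged = [mean(column) for column in zip(*predictions)]
    flags = [score >= threshold for score in averaged]
    return averaged, flags
```

For example, two methods scoring a three-residue sequence as `[1.0, 0.25, 0.5]` and `[0.5, 0.25, 0.0]` would give consensus scores of 0.75, 0.25 and 0.25, so only the first residue would be flagged as disordered at the assumed threshold.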

Relevance: 30.00%

Abstract:

Background: The hepatitis C virus (HCV) non-structural 5A protein (NS5A) contains a highly conserved C-terminal polyproline motif with the consensus sequence Pro-X-X-Pro-X-Arg that is able to interact with the Src-homology 3 (SH3) domains of a variety of cellular proteins. Results: To understand this interaction in more detail we have expressed two N-terminally truncated forms of NS5A in E. coli and examined their interactions with the SH3 domain of the Src-family tyrosine kinase, Fyn. Surface plasmon resonance analysis revealed that NS5A binds to the Fyn SH3 domain with what can be considered a high affinity SH3 domain-ligand interaction (629 nM), and this binding did not require the presence of domain I of NS5A (amino acid residues 32-250). Mutagenic analysis of the Fyn SH3 domain demonstrated the requirement for an acidic cluster at the C-terminus of the RT-Src loop of the SH3 domain, as well as several highly conserved residues previously shown to participate in SH3 domain peptide binding. Conclusion: We conclude that the NS5A:Fyn SH3 domain interaction occurs via a canonical SH3 domain binding site and the high affinity of the interaction suggests that NS5A would be able to compete with cognate Fyn ligands within the infected cell.

Relevance: 30.00%

Abstract:

Proteins are commonly identified through enzymatic digestion and generation of short sequence tags or fingerprints of peptide masses by mass spectrometry. Separation methods, such as liquid chromatography and electrophoresis, are often used to fractionate complex protein or peptide mixtures and these separations also provide information on the different species, such as molecular weight and isoelectric point from electrophoresis and hydrophobicity in reversed-phase chromatography. These are also properties that can be predicted from amino acid sequences derived from genomic sequences and used in protein identification. This chapter reviews recently introduced methods based on retention time prediction to extract information from chromatographic separations and the applications to protein identification in organisms with small and large genomes. Novel data on retention time prediction of posttranslationally modified peptides is also presented.

Relevance: 30.00%

Abstract:

We describe a general likelihood-based 'mixture model' for inferring phylogenetic trees from gene-sequence or other character-state data. The model accommodates cases in which different sites in the alignment evolve in qualitatively distinct ways, but does not require prior knowledge of these patterns or partitioning of the data. We call this qualitative variability in the pattern of evolution across sites "pattern-heterogeneity" to distinguish it from both a homogeneous process of evolution and from one characterized principally by differences in rates of evolution. We present studies to show that the model correctly retrieves the signals of pattern-heterogeneity from simulated gene-sequence data, and we apply the method to protein-coding genes and to a ribosomal 12S data set. The mixture model outperforms conventional partitioning in both these data sets. We implement the mixture model such that it can simultaneously detect rate- and pattern-heterogeneity. The model simplifies to a homogeneous model or a rate-variability model as special cases, and therefore always performs at least as well as these two approaches, and often considerably improves upon them. We make the model available within a Bayesian Markov-chain Monte Carlo framework for phylogenetic inference, as an easy-to-use computer program.
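The general shape of a site-wise mixture likelihood can be written down as a sketch. The notation here is assumed for illustration and is not given in the abstract: D_i is the data at alignment site i of n, T is the tree, Q_1 to Q_J are J candidate models of the evolutionary pattern, and w_j are mixture weights.

```latex
% Assumed notation: each site's likelihood is a weighted sum over the
% J pattern models, so no site has to be assigned to a partition in advance.
L(T, Q_1, \dots, Q_J, w) \;=\; \prod_{i=1}^{n} \sum_{j=1}^{J} w_j \, P(D_i \mid T, Q_j),
\qquad w_j \ge 0, \quad \sum_{j=1}^{J} w_j = 1 .
```

In this form, setting J = 1 recovers a homogeneous model, which is consistent with the abstract's statement that the homogeneous model is a special case.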

Relevance: 30.00%

Abstract:

Severe acute respiratory syndrome (SARS) coronavirus (SCoV) spike (S) protein is the major surface antigen of the virus and is responsible for receptor binding and the generation of neutralizing antibody. To investigate SCoV S protein, full-length and individual domains of S protein were expressed on the surface of insect cells and were characterized for cleavability and reactivity with serum samples obtained from patients during the convalescent phase of SARS. S protein could be cleaved by exogenous trypsin but not by coexpressed furin, suggesting that the protein is not normally processed during infection. Reactivity was evident by both flow cytometry and Western blot assays, but the pattern of reactivity varied according to assay and sequence of the antigen. The antibody response to SCoV S protein involves antibodies to both linear and conformational epitopes, with linear epitopes associated with the carboxyl domain and conformational epitopes associated with the amino terminal domain. Recombinant SCoV S protein appears to be a suitable antigen for the development of an efficient and sensitive diagnostic test for SARS, but our data suggest that assay format and choice of S antigen are important considerations.

Relevance: 30.00%

Abstract:

G protein-coupled receptor kinases (GRKs) are regulatory enzymes involved in the modulation of seven-transmembrane-helix receptors. In order to develop specific inhibitors for these kinases, we synthesized and investigated peptide inhibitors derived from the sequence of the first intracellular loop of the β2-adrenergic receptor. Introduction of changes in the sequence and truncation of N- and C-terminal amino acids increased the inhibitory potency by a factor of 40. These inhibitors not only inhibited the prototypical GRK2 but also GRK3 and GRK5. In contrast, there was no inhibition of protein kinase C and protein kinase A even at the highest concentration tested. The peptide with the sequence AKFERLQTVTNYFITSE inhibited GRK2 with an IC50 of 0.6 µM, GRK3 with 2.6 µM and GRK5 with 1.6 µM. The peptide inhibitors were non-competitive for receptor and ATP. These findings demonstrate that specific peptides can inhibit GRKs in the submicromolar range and suggest that a further decrease in size is possible without losing the inhibitory potency. (c) 2005 Published by Elsevier Inc.

Relevance: 30.00%

Abstract:

Unlike nuclear localization signals, there is no obvious consensus sequence for the targeting of proteins to the nucleolus. The nucleolus is a dynamic subnuclear structure which is crucial to the normal operation of the eukaryotic cell. Studying nucleolar trafficking signals is problematic as many nucleolar retention signals (NoRSs) are part of classical nuclear localization signals (NLSs). In addition, there is no known consensus signal with which to inform a study. The nucleocapsid (N) protein of the avian coronavirus infectious bronchitis virus (IBV) localizes to the cytoplasm and the nucleolus. Mutagenesis was used to delineate a novel eight amino acid motif that was necessary and sufficient for nucleolar retention of N protein and for its colocalization with nucleolin and fibrillarin. Additionally, a classical nuclear export signal (NES) functioned to direct N protein to the cytoplasm. Comparison of the coronavirus NoRSs with known cellular and other viral NoRSs revealed that these motifs have conserved arginine residues. Molecular modelling, using the solution structure of severe acute respiratory syndrome (SARS) coronavirus N-protein, revealed that this motif is available for interaction with cellular factors which may mediate nucleolar localization. We hypothesise that the N-protein uses these signals to traffic to and from the nucleolus and the cytoplasm.