112 resultados para Predicting Signal Peptides

em University of Queensland eSpace - Australia


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Signal peptides and transmembrane helices both contain a stretch of hydrophobic amino acids. This common feature makes it difficult for signal peptide and transmembrane helix predictors to correctly assign identity to stretches of hydrophobic residues near the N-terminal methionine of a protein sequence. The inability to reliably distinguish between N-terminal transmembrane helix and signal peptide is an error with serious consequences for the prediction of protein secretory status or transmembrane topology. In this study, we report a new method for differentiating protein N-terminal signal peptides and transmembrane helices. Based on the sequence features extracted from hydrophobic regions (amino acid frequency, hydrophobicity, and the start position), we set up discriminant functions and examined them on non-redundant datasets with jackknife tests. This method can incorporate other signal peptide prediction methods and achieve higher prediction accuracy. For Gram-negative bacterial proteins, 95.7% of N-terminal signal peptides and transmembrane helices can be correctly predicted (coefficient 0.90). Given a sensitivity of 90%, transmembrane helices can be identified from signal peptides with a precision of 99% (coefficient 0.92). For eukaryotic proteins, 94.2% of N-terminal signal peptides and transmembrane helices can be correctly predicted with coefficient 0.83. Given a sensitivity of 90%, transmembrane helices can be identified from signal peptides with a precision of 87% (coefficient 0.85). The method can be used to complement current transmembrane protein prediction and signal peptide prediction methods to improve their prediction accuracies. (C) 2003 Elsevier Inc. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Proteins secreted by and anchored on the surfaces of parasites are in intimate contact with host tissues. The transcriptome of infective cercariae of the blood fluke, Schistosoma mansoni, was screened using signal sequence trap to isolate cDNAs encoding predicted proteins with an N-terminal signal peptide. Twenty cDNA fragments were identified, most of which contained predicted signal peptides or transmembrane regions, including a novel putative seven-transmembrane receptor and a membrane-associated mitogen-activated protein kinase. The developmental expression pattern within different life-cycle stages ranged from ubiquitous to a transcript that was highly upregulated in the cercaria. A bioinformatics-based comparison of 100 signal peptides from each of schistosomes, humans, a parasitic nematode and Escherichia coli showed that differences in the sequence composition of signal peptides, notably the residues flanking the predicted cleavage site, might account for the negative bias exhibited in the processing of schistosome signal peptides in mammalian cells. (c) 2005 Federation of European Microbiological Societies. Published by Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Membrane organization describes the orientation of a protein with respect to the membrane and can be determined by the presence, or absence, and organization within the protein sequence of two features: endoplasmic reticulum signal peptides and alpha-helical transmembrane domains. These features allow protein sequences to be classified into one of five membrane organization categories: soluble intracellular proteins, soluble secreted proteins, type I membrane proteins, type II membrane proteins, and multi- spanning membrane proteins. Generation of protein isoforms with variable membrane organizations can change a protein's subcellular localization or association with the membrane. Application of MemO, a membrane organization annotation pipeline, to the FANTOM3 Isoform Protein Sequence mouse protein set revealed that within the 8,032 transcriptional units ( TUs) with multiple protein isoforms, 573 had variation in their use of signal peptides, 1,527 had variation in their use of transmembrane domains, and 615 generated protein isoforms from distinct membrane organization classes. The mechanisms underlying these transcript variations were analyzed. While TUs were identified encoding all pairwise combinations of membrane organization categories, the most common was conversion of membrane proteins to soluble proteins. Observed within our highconfidence set were 156 TUs predicted to generate both extracellular soluble and membrane proteins, and 217 TUs generating both intracellular soluble and membrane proteins. The differential use of endoplasmic reticulum signal peptides and transmembrane domains is a common occurrence within the variable protein output of TUs. The generation of protein isoforms that are targeted to multiple subcellular locations represents a major functional consequence of transcript variation within the mouse transcriptome.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Motivation: A major issue in cell biology today is how distinct intracellular regions of the cell, like the Golgi Apparatus, maintain their unique composition of proteins and lipids. The cell differentially separates Golgi resident proteins from proteins that move through the organelle to other subcellular destinations. We set out to determine if we could distinguish these two types of transmembrane proteins using computational approaches. Results: A new method has been developed to predict Golgi membrane proteins based on their transmembrane domains. To establish the prediction procedure, we took the hydrophobicity values and frequencies of different residues within the transmembrane domains into consideration. A simple linear discriminant function was developed with a small number of parameters derived from a dataset of Type II transmembrane proteins of known localization. This can discriminate between proteins destined for Golgi apparatus or other locations (post-Golgi) with a success rate of 89.3% or 85.2%, respectively on our redundancy-reduced data sets.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A new method has been developed for prediction of transmembrane helices using support vector machines. Different coding schemes of protein sequences were explored, and their performances were assessed by crossvalidation tests. The best performance method can predict the transmembrane helices with sensitivity of 93.4% and precision of 92.0%. For each predicted transmembrane segment, a score is given to show the strength of transmembrane signal and the prediction reliability. In particular, this method can distinguish transmembrane proteins from soluble proteins with an accuracy of similar to99%. This method can be used to complement current transmembrane helix prediction methods and can be Used for consensus analysis of entire proteomes . The predictor is located at http://genet.imb.uq.edu.au/predictors/ SVMtm. (C) 2004 Wiley Periodicals, Inc.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Motivation: Targeting peptides direct nascent proteins to their specific subcellular compartment. Knowledge of targeting signals enables informed drug design and reliable annotation of gene products. However, due to the low similarity of such sequences and the dynamical nature of the sorting process, the computational prediction of subcellular localization of proteins is challenging. Results: We contrast the use of feed forward models as employed by the popular TargetP/SignalP predictors with a sequence-biased recurrent network model. The models are evaluated in terms of performance at the residue level and at the sequence level, and demonstrate that recurrent networks improve the overall prediction performance. Compared to the original results reported for TargetP, an ensemble of the tested models increases the accuracy by 6 and 5% on non-plant and plant data, respectively.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The gastrointestinal tracts of multi-cellular blood-feeding parasites are targets for vaccines and drugs. Recently, recombinant vaccines that interrupt the digestion of blood in the hookworm gut have shown efficacy, so we explored the intestinal transcriptomes of the human and canine hookworms, Necator americanus and Ancylostoma caninum, respectively. We used Laser Microdissection Microscopy to dissect gut tissue from the parasites, extracted the RNA and generated cDNA libraries. A total of 480 expressed sequence tags were sequenced from each library and assembled into contigs, accounting for 268 N. americanus genes and 276 A. caninum genes. Only 17% of N. americanus and 36% of A. caninum contigs were assigned Gene Ontology classifications. Twenty-six (9.8%) N. americanus and 18 (6.5%) A. caninum contigs did not have homologues in any databases including dbEST-of these novel clones, seven N. americanus and three A. caninum contigs had Open Reading Frames with predicted secretory signal peptides. The most abundant transcripts corresponded to mRNAs encoding cholesterol-and fatty acid-binding proteins, C-type lectins, Activation-Associated Secretory Proteins, and proteases of different mechanistic classes, particularly astacin-like metallopeptidases. Expressed sequence tags corresponding to known and potential recombinant vaccines were identified and these included homologues of proteases, anti-clotting factors, defensins and integral membrane proteins involved in cell adhesion. (c) 2006 Australian Society for Parasitology Inc Published by Elsevier Ltd. All fights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Application of a computational membrane organization prediction pipeline, MemO, identified putative type II membrane proteins as proteins predicted to encode a single alpha-helical transmembrane domain (TMD) and no signal peptides. MemO was applied to RIKEN's mouse isoform protein set to identify 1436 non-overlapping genomic regions or transcriptional units (TUs), which encode exclusively type II membrane proteins. Proteins with overlapping predicted InterPro and TMDs were reviewed to discard false positive predictions resulting in a dataset comprised of 1831 transcripts in 1408 TUs. This dataset was used to develop a systematic protocol to document subcellular localization of type II membrane proteins. This approach combines mining of published literature to identify subcellular localization data and a high-throughput, polymerase chain reaction (PCR)-based approach to experimentally characterize subcellular localization. These approaches have provided localization data for 244 and 169 proteins. Type II membrane proteins are localized to all major organelle compartments; however, some biases were observed towards the early secretory pathway and punctate structures. Collectively, this study reports the subcellular localization of 26% of the defined dataset. All reported localization data are presented in the LOCATE database (http://www.locate.imb.uq.edu.au).

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents a composite multi-layer classifier system for predicting the subcellular localization of proteins based on their amino acid sequence. The work is an extension of our previous predictor PProwler v1.1 which is itself built upon the series of predictors SignalP and TargetP. In this study we outline experiments conducted to improve the classifier design. The major improvement came from using Support Vector machines as a "smart gate" sorting the outputs of several different targeting peptide detection networks. Our final model (PProwler v1.2) gives MCC values of 0.873 for non-plant and 0.849 for plant proteins. The model improves upon the accuracy of our previous subcellular localization predictor (PProwler v1.1) by 2% for plant data (which represents 7.5% improvement upon TargetP).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

T cells recognize peptide epitopes bound to major histocompatibility complex molecules. Human T-cell epitopes have diagnostic and therapeutic applications in autoimmune diseases. However, their accurate definition within an autoantigen by T-cell bioassay, usually proliferation, involves many costly peptides and a large amount of blood, We have therefore developed a strategy to predict T-cell epitopes and applied it to tyrosine phosphatase IA-2, an autoantigen in IDDM, and HLA-DR4(*0401). First, the binding of synthetic overlapping peptides encompassing IA-2 was measured directly to purified DR4. Secondly, a large amount of HLA-DR4 binding data were analysed by alignment using a genetic algorithm and were used to train an artificial neural network to predict the affinity of binding. This bioinformatic prediction method was then validated experimentally and used to predict DR4 binding peptides in IA-2. The binding set encompassed 85% of experimentally determined T-cell epitopes. Both the experimental and bioinformatic methods had high negative predictive values, 92% and 95%, indicating that this strategy of combining experimental results with computer modelling should lead to a significant reduction in the amount of blood and the number of peptides required to define T-cell epitopes in humans.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Computer models can be combined with laboratory experiments for the efficient determination of (i) peptides that bind MHC molecules and (ii) T-cell epitopes. For maximum benefit, the use of computer models must be treated as experiments analogous to standard laboratory procedures. This requires the definition of standards and experimental protocols for model application. We describe the requirements for validation and assessment of computer models. The utility of combining accurate predictions with a limited number of laboratory experiments is illustrated by practical examples. These include the identification of T-cell epitopes from IDDM-, melanoma- and malaria-related antigens by combining computational and conventional laboratory assays. The success rate in determining antigenic peptides, each in the context of a specific HLA molecule, ranged from 27 to 71%, while the natural prevalence of MHC-binding peptides is 0.1-5%.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Machine learning techniques have been recognized as powerful tools for learning from data. One of the most popular learning techniques, the Back-Propagation (BP) Artificial Neural Networks, can be used as a computer model to predict peptides binding to the Human Leukocyte Antigens (HLA). The major advantage of computational screening is that it reduces the number of wet-lab experiments that need to be performed, significantly reducing the cost and time. A recently developed method, Extreme Learning Machine (ELM), which has superior properties over BP has been investigated to accomplish such tasks. In our work, we found that the ELM is as good as, if not better than, the BP in term of time complexity, accuracy deviations across experiments, and most importantly - prevention from over-fitting for prediction of peptide binding to HLA.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The signal sequence trap technique was applied to identify genes coding for secreted and membrane bound proteins from Echinococcus granulosus, the etiologic agent of cystic hydatid disease. An E. granulosus protoscolex cDNA library was constructed in the AP-PST vector such that randomly primed cDNAs were fused with a placental alkaline phosphatase reporter gene lacking its endogenous signal peptide. E. granulosus cDNAs encoding a functional signal peptide were selected by their ability to rescue secretion of alkaline phosphatase by COS-7 cells that had been transfected with the cDNA library. Eighteen positive clones were identified and sequenced. Their deduced amino acid sequences showed significant similarity with amino acid transporters, Krebs cycle intermediates transporters, presenilins and vacuolar protein sorter proteins. Other cDNAs encoded secreted proteins without homologues. Three sequences were transcribed antisense to E. granulosus expressed sequence tags. All the mRNAs were expressed in protoscoleces and adult worms, but some of them were not found in oncospheres. The putative E. granulosus secreted and membrane bound proteins identified are likely to play important roles in the metabolism, development and survival in the host and represent potential targets for diagnosis, drugs and vaccines against E. granulosus. (c) 2005 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Motivation: While processing of MHC class II antigens for presentation to helper T-cells is essential for normal immune response, it is also implicated in the pathogenesis of autoimmune disorders and hypersensitivity reactions. Sequence-based computational techniques for predicting HLA-DQ binding peptides have encountered limited success, with few prediction techniques developed using three-dimensional models. Methods: We describe a structure-based prediction model for modeling peptide-DQ3.2 beta complexes. We have developed a rapid and accurate protocol for docking candidate peptides into the DQ3.2 beta receptor and a scoring function to discriminate binders from the background. The scoring function was rigorously trained, tested and validated using experimentally verified DQ3.2 beta binding and non-binding peptides obtained from biochemical and functional studies. Results: Our model predicts DQ3.2 beta binding peptides with high accuracy [area under the receiver operating characteristic (ROC) curve A(ROC) > 0.90], compared with experimental data. We investigated the binding patterns of DQ3.2 beta peptides and illustrate that several registers exist within a candidate binding peptide. Further analysis reveals that peptides with multiple registers occur predominantly for high-affinity binders.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a new relative measure of signal complexity, referred to here as relative structural complexity, which is based on the matching pursuit (MP) decomposition. By relative, we refer to the fact that this new measure is highly dependent on the decomposition dictionary used by MP. The structural part of the definition points to the fact that this new measure is related to the structure, or composition, of the signal under analysis. After a formal definition, the proposed relative structural complexity measure is used in the analysis of newborn EEG. To do this, firstly, a time-frequency (TF) decomposition dictionary is specifically designed to compactly represent the newborn EEG seizure state using MP. We then show, through the analysis of synthetic and real newborn EEG data, that the relative structural complexity measure can indicate changes in EEG structure as it transitions between the two EEG states; namely seizure and background (non-seizure).