120 results for Subcellular localization prediction
in University of Queensland eSpace - Australia
Abstract:
Background: Determination of the subcellular location of a protein is essential to understanding its biochemical function. This information can provide insight into the function of hypothetical or novel proteins. These data are difficult to obtain experimentally but have become especially important since many whole genome sequencing projects have been finished and many resulting protein sequences are still lacking detailed functional information. In order to address this paucity of data, many computational prediction methods have been developed. However, these methods have varying levels of accuracy and perform differently based on the sequences that are presented to the underlying algorithm. It is therefore useful to compare these methods and monitor their performance. Results: In order to perform a comprehensive survey of prediction methods, we selected only methods that accepted large batches of protein sequences, were publicly available, and were able to predict localization to at least nine of the major subcellular locations (nucleus, cytosol, mitochondrion, extracellular region, plasma membrane, Golgi apparatus, endoplasmic reticulum (ER), peroxisome, and lysosome). The selected methods were CELLO, MultiLoc, Proteome Analyst, pTarget and WoLF PSORT. These methods were evaluated using 3763 mouse proteins from SwissProt that represent the source of the training sets used in development of the individual methods. In addition, an independent evaluation set of 2145 mouse proteins from LOCATE with a bias towards the subcellular localizations underrepresented in SwissProt was used. The sensitivity and specificity were calculated for each method and compared to a theoretical value based on what might be observed by random chance. Conclusion: No individual method had a sufficient level of sensitivity across both evaluation sets that would enable reliable application to hypothetical proteins.
All methods showed lower performance on the LOCATE dataset, and variable performance on individual subcellular localizations was observed. Proteins localized to the secretory pathway were the most difficult to predict, while nuclear and extracellular proteins were predicted with the highest sensitivity.
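The per-location sensitivity and specificity mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard one-vs-rest definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)); the abstract does not state which definitions the authors used, and some localization studies use "specificity" for TP / (TP + FP) instead. The function name and the toy annotations below are hypothetical and are not data from the study.

```python
def per_class_sensitivity_specificity(true_labels, predicted_labels):
    """Return {label: (sensitivity, specificity)} from one-vs-rest counts.

    Assumes standard definitions; the surveyed study may define them differently.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(true_labels, predicted_labels))
    total = len(pairs)
    results = {}
    for label in sorted(set(true_labels) | set(predicted_labels)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        tn = total - tp - fn - fp
        # NaN marks a class with no positives (or no negatives) in the set.
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        results[label] = (sensitivity, specificity)
    return results


# Toy, made-up annotations purely for illustration.
true_locs = ["nucleus", "nucleus", "cytosol", "Golgi apparatus", "cytosol", "nucleus"]
pred_locs = ["nucleus", "cytosol", "cytosol", "nucleus", "cytosol", "nucleus"]
scores = per_class_sensitivity_specificity(true_locs, pred_locs)
for location, (sens, spec) in scores.items():
    print(f"{location}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Reporting both values per location, rather than a single overall accuracy, is what makes it possible to see that some compartments are predicted far less reliably than others.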
Abstract:
Motivation: Targeting peptides direct nascent proteins to their specific subcellular compartment. Knowledge of targeting signals enables informed drug design and reliable annotation of gene products. However, due to the low similarity of such sequences and the dynamic nature of the sorting process, the computational prediction of subcellular localization of proteins is challenging. Results: We contrast the use of feed-forward models, as employed by the popular TargetP/SignalP predictors, with a sequence-biased recurrent network model. The models are evaluated in terms of performance at the residue level and at the sequence level, and the evaluation demonstrates that recurrent networks improve the overall prediction performance. Compared to the original results reported for TargetP, an ensemble of the tested models increases the accuracy by 6% and 5% on non-plant and plant data, respectively.
Abstract:
Application of a computational membrane organization prediction pipeline, MemO, identified putative type II membrane proteins as proteins predicted to contain a single alpha-helical transmembrane domain (TMD) and no signal peptide. MemO was applied to RIKEN's mouse isoform protein set to identify 1436 non-overlapping genomic regions or transcriptional units (TUs), which encode exclusively type II membrane proteins. Proteins with overlapping predicted InterPro domains and TMDs were reviewed to discard false positive predictions, resulting in a dataset comprising 1831 transcripts in 1408 TUs. This dataset was used to develop a systematic protocol to document subcellular localization of type II membrane proteins. This approach combines mining of published literature to identify subcellular localization data and a high-throughput, polymerase chain reaction (PCR)-based approach to experimentally characterize subcellular localization. These approaches have provided localization data for 244 and 169 proteins, respectively. Type II membrane proteins are localized to all major organelle compartments; however, some biases were observed towards the early secretory pathway and punctate structures. Collectively, this study reports the subcellular localization of 26% of the defined dataset. All reported localization data are presented in the LOCATE database (http://www.locate.imb.uq.edu.au).
Abstract:
Skeletal muscle differentiation and the activation of muscle-specific gene expression are dependent on the concerted action of the MyoD family and the MADS protein, MEF2, which function in a cooperative manner. The steroid receptor coactivator SRC-2/GRIP-1/TIF-2 is necessary for skeletal muscle differentiation, and functions as a cofactor for the transcription factor, MEF2. SRC-2 belongs to the SRC family of transcriptional coactivators/cofactors that also includes SRC-1 and SRC-3/RAC-3/ACTR/AIB-1. In this study we demonstrate that SRC-2 is essentially localized in the nucleus of proliferating myoblasts; however, weak (but notable) expression is observed in the cytoplasm. Differentiation induces a predominant localization of SRC-2 to the nucleus; furthermore, the nuclear staining is progressively more localized to dot-like structures or nuclear bodies. MEF2 is primarily expressed in the nucleus, although we observed a mosaic or variegated expression pattern in myoblasts; however, in myotubes all nuclei express MEF2. GRIP-1 and MEF2 are coexpressed in the nucleus during skeletal muscle differentiation, consistent with the direct interaction of these proteins. Rhabdomyosarcoma (RMS) cells derived from malignant skeletal muscle tumors have been proposed to be deficient in cofactors. Alveolar RMS cells very weakly express the steroid receptor coactivator, SRC-2, in a diffuse nucleocytoplasmic staining pattern. MEF2 and the cofactors, SRC-1 and SRC-3, are abundantly expressed in alveolar and embryonal RMS cells; however, the staining is not localized to the nucleus. Furthermore, the subcellular localization and transcriptional activity of MEF2C and a MEF2-dependent reporter are compromised in alveolar RMS cells. In contrast, embryonal RMS cells express SRC-2 in the nucleus, and MEF2 shuttles from the cytoplasm to the nucleus after serum withdrawal.
In conclusion, this study suggests that the steroid receptor coactivator SRC-2 and MEF2 are localized to the nucleus during the differentiation process. In contrast, RMS cells display aberrant transcription factor SRC localization and expression, which may underlie certain features of the RMS phenotype.
Abstract:
The majority of GLUT4 is sequestered in unique intracellular vesicles in the absence of insulin. Upon insulin stimulation GLUT4 vesicles translocate to, and fuse with, the plasma membrane. To determine the effect of GLUT4 content on the distribution and subcellular trafficking of GLUT4 and other vesicle proteins, adipocytes of adipose-specific, GLUT4-deficient (aP2-GLUT4-/-) mice and adipose-specific, GLUT4-overexpressing (aP2-GLUT4-Tg) mice were studied. GLUT4 amount was reduced by 80-95% in aP2-GLUT4-/- adipocytes and increased ~10-fold in aP2-GLUT4-Tg adipocytes compared with controls. Insulin-responsive aminopeptidase (IRAP) protein amount was decreased 35% in aP2-GLUT4-/- adipocytes and increased 45% in aP2-GLUT4-Tg adipocytes. VAMP2 protein was also decreased by 60% in aP2-GLUT4-/- adipocytes and increased 2-fold in aP2-GLUT4-Tg adipocytes. IRAP and VAMP2 mRNA levels were unaffected in aP2-GLUT4-Tg, suggesting that overexpression of GLUT4 affects IRAP and VAMP2 protein stability. The amount and subcellular distribution of syntaxin4, SNAP23, Munc-18c, and GLUT1 were unchanged in either aP2-GLUT4-/- or aP2-GLUT4-Tg adipocytes, but transferrin receptor was partially redistributed to the plasma membrane in aP2-GLUT4-Tg adipocytes. Immunogold electron microscopy revealed that overexpression of GLUT4 in adipocytes increased the number of GLUT4 molecules per vesicle nearly 2-fold and the number of GLUT4 and IRAP-containing vesicles per cell 3-fold. In addition, the proportion of cellular GLUT4 and IRAP at the plasma membrane in unstimulated aP2-GLUT4-Tg adipocytes was increased 4- and 2-fold, respectively, suggesting that sequestration of GLUT4 and IRAP is saturable. Our results show that GLUT4 overexpression or deficiency affects the amount of other GLUT4-vesicle proteins including IRAP and VAMP2 and that GLUT4 sequestration is saturable.
Abstract:
The Raf-MEK-ERK MAP kinase cascade transmits signals from activated receptors into the cell to regulate proliferation and differentiation. The cascade is controlled by the Ras GTPase, which recruits Raf from the cytosol to the plasma membrane for activation. In turn, MEK, ERK, and scaffold proteins translocate to the plasma membrane for activation. Here, we examine the input-output properties of the Raf-MEK-ERK MAP kinase module in mammalian cells activated in different cellular contexts. We show that the MAP kinase module operates as a molecular switch in vivo but that the input sensitivity of the module is determined by subcellular location. Signal output from the module is sensitive to low-level input only when it is activated at the plasma membrane. This is because the threshold for activation is low at the plasma membrane, whereas the threshold for activation is high in the cytosol. Thus, the circuit configuration of the module at the plasma membrane generates maximal outputs from low-level analog inputs, allowing cells to process and respond appropriately to physiological stimuli. These results reveal the engineering logic behind the recruitment of elements of the module from the cytosol to the membrane for activation.
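The location-dependent threshold described in this abstract can be pictured with a toy switch model. The sketch below is not the authors' model: it swaps in a generic Hill function as a stand-in for switch-like behavior, and the threshold values, the Hill coefficient, and all names are assumptions chosen only to show how a lower threshold makes the same low-level input produce a near-maximal output.

```python
def hill_response(signal, threshold, hill_coefficient=4.0):
    """Fraction of maximal output for a given input under a Hill-type switch.

    Generic illustrative model; not taken from the study.
    """
    if signal < 0 or threshold <= 0:
        raise ValueError("signal must be >= 0 and threshold must be > 0")
    s = signal ** hill_coefficient
    return s / (threshold ** hill_coefficient + s)


# Assumed, arbitrary-unit thresholds: low at the membrane, high in the cytosol.
MEMBRANE_THRESHOLD = 0.1
CYTOSOL_THRESHOLD = 1.0

low_input = 0.2  # a hypothetical low-level input
membrane_output = hill_response(low_input, MEMBRANE_THRESHOLD)
cytosol_output = hill_response(low_input, CYTOSOL_THRESHOLD)
print(f"membrane: {membrane_output:.3f}, cytosol: {cytosol_output:.3f}")
```

Under these assumed parameters the same input sits well above the membrane threshold but far below the cytosolic one, which is the qualitative contrast the abstract reports.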
Abstract:
Background: The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. Results: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences.
Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance, raising the CC to 0.57 and lowering the RMSE to 0.79. In addition, incorporating the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and yielded the best prediction accuracy, with a CC of 0.60 and RMSE of 0.78, which is at least comparable to the other existing methods. Conclusion: The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
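The two evaluation measures quoted in this abstract, the Pearson correlation coefficient (CC) and the root mean square error (RMSE), can be computed as in the minimal sketch below. It uses the textbook definitions; the abstract does not say whether RWCO values were normalized before scoring, so no normalization is applied here. The function names and the toy values are hypothetical and are not data from the study.

```python
import math


def pearson_cc(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Toy, made-up per-residue values purely for illustration.
observed_rwco = [1.0, 2.0, 3.0, 4.0]
predicted_rwco = [1.5, 1.5, 3.5, 3.5]
cc_value = pearson_cc(observed_rwco, predicted_rwco)
rmse_value = rmse(observed_rwco, predicted_rwco)
print(f"CC={cc_value:.3f}, RMSE={rmse_value:.3f}")
```

The two measures are complementary: CC captures whether predictions follow the trend of the observed profile, while RMSE captures how far the predicted values are from the observed ones in absolute terms.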
Abstract:
Selection of machine learning techniques requires a certain sensitivity to the requirements of the problem. In particular, the problem can be made more tractable by deliberately using algorithms that are biased toward solutions of the requisite kind. In this paper, we argue that recurrent neural networks have a natural bias toward a problem domain of which biological sequence analysis tasks are a subset. We use experiments with synthetic data to illustrate this bias. We then demonstrate that this bias can be exploited using a data set of protein sequences containing several classes of subcellular localization targeting peptides. The results show that, compared with feed-forward networks, recurrent neural networks will generally perform better on sequence analysis tasks. Furthermore, as the patterns within the sequence become more ambiguous, the choice of specific recurrent architecture becomes more critical.
Abstract:
The effect of replacing a single codon in the N-terminus of human aryl sulfotransferase (HAST) 1 and 3 with one that is more commonly found in E. coli genes was assessed. The pKK233-2 E. coli expression vector was employed and the polymerase chain reaction (PCR) was used to introduce the 5' nucleotide substitution while maintaining the fidelity of the amino acid sequence. The data indicate that this change had a minimal effect on protein production, subcellular localization or, in the case of HAST3, catalytic activity. In general, the pKK233-2 E. coli vector has been less than optimal for expressing human sulfotransferase cDNAs. (C) 1998 Elsevier Science Ireland Ltd. All rights reserved.
Abstract:
The compact myelin sheath represents one of the largest expanses of membrane-membrane contact in the body and, in the central nervous system, requires the myelin proteolipid protein (PLP) for assembly. To determine whether the molecular properties of PLP promote membrane adhesion and direct its subcellular localization in the absence of oligodendrocyte-specific targeting mechanisms, PLP was expressed in COS-1 fibroblasts. Immunofluorescence staining indicated that PLP was translated effectively, transited the rough endoplasmic reticulum and Golgi apparatus, was delivered to the cell surface, and was endocytosed. In the plasma membrane, the PLP distribution was patchy and only sporadically coincided with sites of membrane-membrane contact between PLP-expressing cells. PLP was not randomly distributed, however, but correlated closely with microfilament locations in leading edge membranes and microvilli, as demonstrated by phalloidin double labeling. Our results indicate that even in non-myelinating cells, PLP can be concentrated in membranes associated with movement and growth, and suggest possible roles for the actin cytoskeleton in PLP localization. As PLP, DM20, and the DM20-like M6 protein all associate with actin-enriched membranes, this may be a common feature of PLP/DM20 gene family members. (C) 1997 Wiley-Liss, Inc.
Abstract:
Hsp10 (10-kDa heat shock protein, also known as chaperonin 10 or Cpn10) is a co-chaperone for Hsp60 in the protein folding process. This protein has also been shown to be identical to the early pregnancy factor, which is an immunosuppressive growth factor found in maternal serum. In this study we have used immunogold electron microscopy to study the subcellular localization of Hsp10 in rat tissue sections embedded in LR Gold resin, employing polyclonal antibodies raised against different regions of human Hsp10. In all rat tissues examined, including liver, heart, pancreas, kidney, anterior pituitary, salivary gland, thyroid, and adrenal gland, antibodies to Hsp10 showed strong labeling of mitochondria. However, in a number of tissues, in addition to the mitochondrial labeling, strong and highly specific labeling with the Hsp10 antibodies was also observed in several extramitochondrial compartments. These sites included zymogen granules in pancreatic acinar cells, growth hormone granules in anterior pituitary, and secretory granules in PP pancreatic islet cells. Additionally, mature red blood cells, which lack mitochondria, also showed strong reactivity with the Hsp10 antibodies. The observed labeling with the Hsp10 antibodies, both within mitochondria and in other compartments/cells, was abolished upon omission of the primary antibodies or upon preadsorption of the primary antibodies with the purified recombinant human Hsp10. These results provide evidence that, similar to a number of other recently described mitochondrial proteins (viz., Hsp60, tumor necrosis factor receptor-associated protein-1, P32 (gC1q-R) protein, and cytochrome c), Hsp10 is also found at a variety of specific extramitochondrial sites in normal rat tissue. These results raise important questions as to how these mitochondrial proteins are translocated to other compartments and their possible function(s) at these sites.
The presence of these proteins at extramitochondrial sites in normal tissues has important implications concerning the role of mitochondria in apoptosis and genetic diseases.