381 results for Protein structures
in Queensland University of Technology - ePrints Archive
Clustering of Protein Structures Using Hydrophobic Free Energy And Solvent Accessibility of Proteins
Abstract:
Background: Predicting protein subnuclear localization is a challenging problem. Some previous works based on non-sequence information, including Gene Ontology annotations and kernel fusion, each have limitations. The aim of this work is twofold: first, to propose a novel individual feature extraction method; second, to develop an ensemble method that improves prediction performance using comprehensive information represented as a high-dimensional feature vector obtained by 11 feature extraction methods. Methodology/Principal Findings: A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It considers only feature extraction methods based on amino acid classifications and physicochemical properties. To speed up the system, an automatic search method for the kernel parameter is used. The prediction performance of the method is evaluated on four datasets: the Lei dataset, the multi-localization dataset, the SNL9 dataset and a new independent dataset. In leave-one-out cross-validation, the overall prediction accuracy is 75.2% for 6 localizations on the Lei dataset and 72.1% for 9 localizations on the SNL9 dataset; it is 71.7% for the multi-localization dataset and 69.8% for the new independent dataset. Comparisons with existing methods show that the method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of the classification model are further confirmed by permutation analysis. Conclusions: The method is effective and valuable for predicting protein subnuclear localizations. A web server has been designed to implement the proposed method. It is freely available at http://bioinformatics.awowshop.com/snlpred_page.php.
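The leave-one-out protocol mentioned in the abstract can be sketched as follows. This is a minimal illustration only: the 1-nearest-neighbour classifier is a stand-in for the two-stage SVM, and the function names and toy feature vectors are hypothetical, not taken from the paper.

```python
def nearest_label(train, query):
    """Return the label of the training sample closest to `query`.

    Assumption: a 1-nearest-neighbour rule (squared Euclidean distance)
    replaces the paper's SVM so the sketch needs no external libraries.
    """
    closest = min(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    return closest[1]


def leave_one_out_accuracy(samples):
    """Hold out each (features, label) pair once and predict it from the rest."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        hits += nearest_label(train, features) == label
    return hits / len(samples)


# Hypothetical toy data: two well-separated classes in a 2-D feature space.
toy_samples = [
    ((0.0, 0.0), "nucleolus"),
    ((0.1, 0.0), "nucleolus"),
    ((1.0, 1.0), "chromatin"),
    ((0.9, 1.0), "chromatin"),
]
toy_accuracy = leave_one_out_accuracy(toy_samples)
```

Each sample is held out once, predicted from the remaining samples, and the fraction of correct predictions is reported as the overall accuracy.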
Abstract:
Membrane proteins play important roles in many biochemical processes and are also attractive targets of drug discovery for various diseases. The elucidation of membrane protein types provides clues for understanding the structure and function of proteins. Recently we developed a novel system for predicting protein subnuclear localizations. In this paper, we propose a simplified version of our system for predicting membrane protein types directly from primary protein structures, which incorporates amino acid classifications and physicochemical properties into a general form of pseudo-amino acid composition. In this simplified system, we design a two-stage multi-class support vector machine combined with a two-step optimal feature selection process, which proves very effective in our experiments. The performance of the present method is evaluated on two benchmark datasets consisting of five types of membrane proteins. The overall accuracies of prediction for the five types are 93.25% and 96.61% via the jackknife test and the independent dataset test, respectively. These results indicate that our method is effective and valuable for predicting membrane protein types. A web server for the proposed method is available at http://www.juemengt.com/jcc/memty_page.php
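To make the idea of pseudo-amino acid composition concrete, here is a minimal sketch of the classic single-property form (20 composition terms plus a few sequence-order correlation terms). It is not the general form used by the authors: the choice of the Kyte-Doolittle hydropathy scale, the parameter names `lam` and `w`, and the example sequence are all assumptions made for illustration.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumed property; the paper may use others).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pse_aac(seq, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional pseudo-amino acid composition vector.

    `seq` must be longer than `lam` and contain only the 20 standard residues.
    """
    values = list(KD.values())
    mu, sd = mean(values), pstdev(values)
    # Standardise the property over the 20 amino acids.
    h = [(KD[a] - mu) / sd for a in seq]
    n = len(seq)
    # Conventional amino acid composition (frequencies sum to 1).
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]
    # Sequence-order correlation factor for each gap k = 1..lam.
    thetas = [
        mean([(h[i] - h[i + k]) ** 2 for i in range(n - k)])
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


# Hypothetical example sequence.
example_vector = pse_aac("MKTAYIAKQR", lam=3)
```

The resulting vector is normalised to sum to 1, so it can be fed to a classifier in place of the plain 20-dimensional composition.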
Abstract:
One of the next great challenges of cell biology is the determination of the enormous number of protein structures encoded in genomes. In recent years, advances in electron cryo-microscopy and high-resolution single particle analysis have developed to the point where they now provide a methodology for high-resolution structure determination. Using this approach, images of randomly oriented single particles are aligned computationally to reconstruct 3-D structures of proteins and even whole viruses. One of the limiting factors in obtaining high-resolution reconstructions is obtaining a large enough representative dataset (>100,000 particles). Traditionally, particles have been picked manually, which is an extremely labour-intensive process. The problem is made especially difficult by the low signal-to-noise ratio of the images. This paper describes the development of automatic particle picking software, which has been tested with both negatively stained and cryo-electron micrographs. This algorithm has been shown to be capable of selecting most of the particles, with few false positives. Further work will involve extending the software to detect differently shaped and oriented particles.
Abstract:
Infection of plant cells by potyviruses induces the formation of cytoplasmic inclusions ranging in size from 200 to 1000 nm. To determine if the ability to form these ordered, insoluble structures is intrinsic to the potyviral cytoplasmic inclusion protein, we have expressed the cytoplasmic inclusion protein from Potato virus Y in tobacco under the control of the chrysanthemum ribulose-1,5-bisphosphate carboxylase small subunit promoter, a highly active, green tissue promoter. No cytoplasmic inclusions were observed in the leaves of transgenic tobacco using transmission electron microscopy, despite being able to clearly visualize these inclusions in Potato virus Y infected tobacco leaves under the same conditions. However, we did observe a wide range of tissue and sub-cellular abnormalities associated with the expression of the Potato virus Y cytoplasmic inclusion protein. These changes included the disruption of normal cell morphology and organization in leaves, mitochondrial and chloroplast internal reorganization, and the formation of atypical lipid accumulations. Despite these significant structural changes, however, transgenic tobacco plants were viable and the results are discussed in the context of potyviral cytoplasmic inclusion protein function.
Abstract:
Genomic and proteomic analyses have attracted a great deal of interest in biological research in recent years. Many methods have been applied to discover useful information contained in the enormous databases of genomic sequences and amino acid sequences. The results of these investigations in turn inspire further research in biology. These biological sequences, which may be considered as multiscale sequences, have some specific features which need further effort to characterise using more refined methods. This project aims to study some of these biological challenges with multiscale analysis methods and a stochastic modelling approach. The first part of the thesis aims to cluster some unknown proteins, and classify their families as well as their structural classes. A development in proteomic analysis is concerned with the determination of protein functions. The first step in this development is to classify proteins and predict their families. This motivates us to study some unknown proteins from specific families, and to cluster them into families and structural classes. We select a large number of proteins from the same families or superfamilies, and link them to simulate some unknown large proteins from these families. We use multifractal analysis and the wavelet method to capture the characteristics of these linked proteins. The simulation results show that the method is valid for the classification of large proteins. The second part of the thesis aims to explore the relationship of proteins based on a layered comparison with their components. Many methods are based on homology of proteins because resemblance at the protein sequence level normally indicates similarity of functions and structures. However, some proteins may have similar functions with low sequence identity. We consider protein sequences at a detailed level to investigate the problem of comparison of proteins. The comparison is based on the empirical mode decomposition (EMD), and protein sequences are detected with the intrinsic mode functions. A measure of similarity is introduced with a new cross-correlation formula. The similarity results show that the EMD is useful for detection of functional relationships of proteins. The third part of the thesis aims to investigate the transcriptional regulatory network of the yeast cell cycle via stochastic differential equations. As the investigation of genome-wide gene expression has become a focus in genomic analysis, researchers have tried to understand the mechanisms of the yeast genome for many years. How cells control gene expression still needs further investigation. We use a stochastic differential equation to model the expression profile of a target gene. We modify the model with a Gaussian membership function. For each target gene, a transcriptional rate is obtained, and the estimated transcriptional rate is also calculated with the information from five possible transcriptional regulators. Some regulators of these target genes are verified with the related references. With these results, we construct a transcriptional regulatory network for the genes from the yeast Saccharomyces cerevisiae. The construction of the transcriptional regulatory network is useful for detecting more mechanisms of the yeast cell cycle.
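As a rough illustration of modelling an expression profile with a stochastic differential equation, the sketch below integrates a simple production-decay model with the Euler-Maruyama scheme. The drift term, the parameter names (`production`, `decay`, `sigma`) and the numerical values are assumptions for illustration; the thesis does not state its model in this abstract, and the Gaussian membership function is not reproduced here.

```python
import math
import random


def simulate_expression(x0, production, decay, sigma, dt, steps, seed=0):
    """Euler-Maruyama integration of dx = (production - decay * x) dt + sigma dW.

    Assumed model: constant transcriptional rate, first-order decay,
    additive noise. Returns the simulated expression path as a list.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    x = x0
    path = [x]
    for _ in range(steps):
        # Brownian increment over one step: normal with variance dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + (production - decay * x) * dt + sigma * dw
        path.append(x)
    return path


# Noise-free run: the path relaxes towards production / decay.
deterministic_path = simulate_expression(0.0, 2.0, 1.0, 0.0, 0.01, 2000)
# Noisy run with the same drift (hypothetical noise level).
noisy_path = simulate_expression(0.0, 2.0, 1.0, 0.3, 0.01, 2000, seed=1)
```

With `sigma` set to zero the scheme reduces to the ordinary Euler method, which gives a quick sanity check before fitting rates to measured profiles.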
Abstract:
Using six lattice types (4×4, 5×5, and 6×6 square lattices; a 3×3×3 cubic lattice; and 2+3+4+3+2 and 4+5+6+5+4 triangular lattices), three alphabet sizes (HP, HNUP, and 20 letters), and two energy functions, the designability of protein structures is calculated based on random sampling of structures and common biased sampling (CBS) of protein sequence space. Then three quantities, namely the stability (average energy gap), foldability, and partnum of the structure, which are defined to elucidate the designability, are calculated. The authors find that whatever the type of lattice, alphabet size, and energy function used, highly designable (preferred) structures emerge. For all cases considered, the local interactions reduce degeneracy and make the designability higher. The designability is sensitive to the lattice type, alphabet size, energy function, and sampling method of the sequence space. Compared with the random sampling method, both the CBS and the Metropolis Monte Carlo sampling methods make the designability higher. The correlation coefficients between the designability, stability, and foldability are mostly larger than 0.5, which demonstrates that they are strongly correlated. The correlation between the designability and the partnum is weaker, however, because the partnum is independent of the energy. The results are useful for practical applications of the designability principle, such as predicting protein tertiary structure.
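Designability, the number of sequences that have a given structure as their unique lowest-energy state, can be sketched on a toy scale. The example below assumes a 3×3 square lattice, a two-letter HP alphabet, an energy of -1 per non-bonded H-H contact, and exhaustive enumeration instead of the sampling schemes used in the paper; all function names are hypothetical.

```python
from itertools import product


def compact_walks(n=3):
    """Enumerate all self-avoiding walks that fill an n x n square lattice."""
    walks = []

    def extend(path, visited):
        if len(path) == n * n:
            walks.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if 0 <= nxt[0] < n and 0 <= nxt[1] < n and nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                extend(path, visited)
                path.pop()
                visited.remove(nxt)

    for start in product(range(n), repeat=2):
        extend([start], {start})
    return walks


def contact_map(walk):
    """Return the set of residue pairs adjacent on the lattice but not bonded."""
    index = {pos: i for i, pos in enumerate(walk)}
    contacts = set()
    for (x, y), i in index.items():
        for nxt in ((x + 1, y), (x, y + 1)):
            j = index.get(nxt)
            if j is not None and abs(i - j) > 1:
                contacts.add((min(i, j), max(i, j)))
    return frozenset(contacts)


def designability(n=3):
    """Count, per structure, the HP sequences with that unique ground state."""
    # Walks related by rotation or reflection share a contact map, so
    # structures are identified by their contact maps.
    structures = sorted({contact_map(w) for w in compact_walks(n)}, key=sorted)
    counts = {s: 0 for s in structures}
    for seq in product("HP", repeat=n * n):
        # Assumed energy: -1 for every non-bonded H-H contact.
        energies = [
            -sum(seq[i] == "H" and seq[j] == "H" for i, j in s)
            for s in structures
        ]
        best = min(energies)
        if energies.count(best) == 1:  # non-degenerate ground state only
            counts[structures[energies.index(best)]] += 1
    return counts


lattice_counts = designability(3)
```

Sequences whose lowest energy is shared by several structures are discarded, so the counts sum to fewer than the 2^9 possible sequences.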
Abstract:
Epidermal growth factor (EGF) activation of the EGF receptor (EGFR) is an important mediator of cell migration, and aberrant signaling via this system promotes a number of malignancies including ovarian cancer. We have identified the cell surface glycoprotein CDCP1 as a key regulator of EGF/EGFR-induced cell migration. We show that signaling via EGF/EGFR induces migration of ovarian cancer Caov3 and OVCA420 cells with concomitant up-regulation of CDCP1 mRNA and protein. Consistent with a role in cell migration CDCP1 relocates from cell-cell junctions to punctate structures on filopodia after activation of EGFR. Significantly, disruption of CDCP1 either by silencing or the use of a function blocking antibody efficiently reduces EGF/EGFR-induced cell migration of Caov3 and OVCA420 cells. We also show that up-regulation of CDCP1 is inhibited by pharmacological agents blocking ERK but not Src signaling, indicating that the RAS/RAF/MEK/ERK pathway is required downstream of EGF/EGFR to induce increased expression of CDCP1. Our immunohistochemical analysis of benign, primary, and metastatic serous epithelial ovarian tumors demonstrates that CDCP1 is expressed during progression of this cancer. These data highlight a novel role for CDCP1 in EGF/EGFR-induced cell migration and indicate that targeting of CDCP1 may be a rational approach to inhibit progression of cancers driven by EGFR signaling including those resistant to anti-EGFR drugs because of activating mutations in the RAS/RAF/MEK/ERK pathway.
Abstract:
Ubiquitination involves the attachment of ubiquitin (Ub) to lysine residues on substrate proteins or itself, which can result in protein monoubiquitination or polyubiquitination. Polyubiquitination through different lysines (seven) or the N-terminus of Ub can generate different protein-Ub structures. These include monoubiquitinated proteins, polyubiquitinated proteins with homotypic chains through a particular lysine on Ub or mixed polyubiquitin chains generated by polymerization through different Ub lysines. The ability of the ubiquitination pathway to generate different protein-Ub structures gives this pathway the versatility to target proteins to different fates. Protein ubiquitination is catalyzed by Ub-conjugating and Ub-ligase enzymes, with different combinations of these enzymes specifying the type of Ub modification on protein substrates. How Ub-conjugating and Ub-ligase enzymes generate this structural diversity is not clearly understood. In the current review, we discuss mechanisms utilized by the Ub-conjugating and Ub-ligase enzymes to generate structural diversity during protein ubiquitination, with a focus on recent mechanistic insights into protein monoubiquitination and polyubiquitination.
Abstract:
Transport between compartments of eukaryotic cells is mediated by coated vesicles. The archetypal protein coats COPI, COPII, and clathrin are conserved from yeast to human. Structural studies of COPII and clathrin coats assembled in vitro without membranes suggest that coat components assemble regular cages with the same set of interactions between components. Detailed three-dimensional structures of coated membrane vesicles have not been obtained. Here, we solved the structures of individual COPI-coated membrane vesicles by cryoelectron tomography and subtomogram averaging of in vitro reconstituted budding reactions. The coat protein complex, coatomer, was observed to adopt alternative conformations to change the number of other coatomers with which it interacts and to form vesicles with variable sizes and shapes. This represents a fundamentally different basis for vesicle coat assembly.
Abstract:
The detailed characterization of protein N-glycosylation is very demanding given the many different glycoforms and structural isomers that can exist on glycoproteins. Here we report a fast and sensitive method for the extensive structure elucidation of reducing-end labeled N-glycan mixtures using a combination of capillary normal-phase HPLC coupled off-line to matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) and TOF/TOF-MS/MS. Using this method, isobaric N-glycans released from honey bee phospholipase A2 and Arabidopsis thaliana glycoproteins were separated by normal-phase chromatography and subsequently identified by key fragment ions in the MALDI-TOF/TOF tandem mass spectra. In addition, linkage and branching information were provided by abundant cross-ring and "elimination" fragment ions in the MALDI-CID spectra that gave extensive structural information. Furthermore, the fragmentation characteristics of N-glycans reductively aminated with 2-aminobenzoic acid and 2-aminobenzamide were compared. The identification of N-glycans containing 3-linked core fucose was facilitated by distinctive ions present only in the MALDI-CID spectra of 2-aminobenzoic acid-labeled oligosaccharides. To our knowledge, this is the first MS/MS-based technique that allows confident identification of N-glycans containing 3-linked core fucose, which is a major allergenic determinant on insect and plant glycoproteins.
Abstract:
Potato leafroll virus (PLRV) is a positive-strand RNA virus that generates subgenomic RNAs (sgRNA) for expression of 3' proximal genes. Small RNA (sRNA) sequencing and mapping of the PLRV-derived sRNAs revealed coverage of the entire viral genome with the exception of four distinctive gaps. Remarkably, these gaps mapped to areas of the PLRV genome with extensive secondary structures, such as the internal ribosome entry site and the 5' transcriptional start sites of sgRNA1 and sgRNA2. The last gap mapped to ~500 nt from the 3' terminus of the PLRV genome and suggested the possible presence of an additional sgRNA for PLRV. Quantitative real-time PCR and northern blot analysis confirmed the expression of sgRNA3, and subsequent analyses placed its 5' transcriptional start site at position 5347 of the PLRV genome. A regulatory role is proposed for the PLRV sgRNA3 as it encodes an RNA-binding protein with specificity to the 5' end of PLRV genomic RNA. © 2013.
Abstract:
Aberrant DNA replication is a primary cause of mutations that are associated with pathological disorders including cancer. During DNA metabolism, the primary causes of replication fork stalling include secondary DNA structures, highly transcribed regions and damaged DNA. The restart of stalled replication forks is critical for the timely progression of the cell cycle and ultimately for the maintenance of genomic stability. Our previous work has implicated the single-stranded DNA binding protein, hSSB1/NABP2, in the repair of DNA double-strand breaks via homologous recombination. Here, we demonstrate that hSSB1 relocates to hydroxyurea (HU)-damaged replication forks where it is required for ATR and Chk1 activation and recruitment of Mre11 and Rad51. Consequently, hSSB1-depleted cells fail to repair and restart stalled replication forks. hSSB1 deficiency causes accumulation of DNA strand breaks and results in chromosome aberrations observed in mitosis, ultimately resulting in hSSB1 being required for survival to HU and camptothecin. Overall, our findings demonstrate the importance of hSSB1 in maintaining and repairing DNA replication forks and for overall genomic stability.
Abstract:
A precise representation of the spatial distribution of hydrophobicity, hydrophilicity and charges on the molecular surface of proteins is critical for understanding their interaction with small molecules and larger systems. The representation of hydrophobicity is rarely done at atom level, as this property is generally assigned to residues. A new methodology for the derivation of atomic hydrophobicity from any amino acid-based hydrophobicity scale was used to derive 8 sets of atomic hydrophobicities, one of which was used to generate the molecular surfaces for 35 proteins with convex structures, 5 of which, i.e., lysozyme, ribonuclease, hemoglobin, albumin and IgG, have been analyzed in more detail. Sets of the molecular surfaces of the model proteins have been constructed using spherical probes with increasingly large radii, from 1.4 to 20 Å, followed by the quantification of (i) the surface hydrophobicity; (ii) their respective molecular surface areas, i.e., total, hydrophilic and hydrophobic area; and (iii) their relative densities, i.e., divided by the total molecular area; or specific densities, i.e., divided by property-specific area. Compared with the amino acid-based formalism, the atom-level description reveals molecular surfaces which present (i) approximately two times more hydrophilic area; with (ii) less extended, but 2 to 5 times more intense, hydrophilic patches; and (iii) 3 to 20 times more extended hydrophobic areas. The hydrophobic areas are also approximately 2 times more hydrophobicity-intense. This more pronounced "leopard skin"-like design of the protein molecular surface has been confirmed by comparing the results for a restricted set of homologous proteins, i.e., hemoglobins diverging by only one residue (Trp37). These results suggest that the representation of hydrophobicity on the protein molecular surfaces at atom-level resolution, coupled with the probing of the molecular surface at different geometric resolutions, can capture processes that are otherwise obscured in the amino acid-based formalism.
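The two density definitions in the abstract (relative: divided by total molecular area; specific: divided by property-specific area) amount to simple ratios. The sketch below assumes the surface property has already been summed over the relevant patches; the function name, argument names and numbers are hypothetical and carry no data from the paper.

```python
def surface_densities(property_sum, property_area, total_area):
    """Return relative and specific densities of a surface property.

    property_sum  : summed hydrophobicity (or hydrophilicity), arbitrary units
    property_area : area carrying that property, in square angstroms
    total_area    : total molecular surface area, in square angstroms
    """
    return {
        "relative": property_sum / total_area,     # per unit of total area
        "specific": property_sum / property_area,  # per unit of its own area
    }


# Hypothetical values chosen only to show the arithmetic.
example_densities = surface_densities(
    property_sum=120.0, property_area=300.0, total_area=1000.0
)
```

Because the property-specific area can never exceed the total area, the specific density is always at least as large in magnitude as the relative density.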