945 results for Structure Prediction Servers
Abstract:
This work investigated signaling pathways involved in the migration of embryonic peripheral glial cells (ePG). The focus was on Myoblast city (Mbc). First, different mbc mutants that show strong glial migration defects were analyzed. To analyze the resulting phenotypes quantitatively, a method was developed to determine the position of the pioneer glia ePG9. This makes it possible to detect even very subtle glial migration phenotypes. Knock-down experiments showed that Mbc plays a cell-autonomous role in glial migration. Of particular interest is the fact that migration of the ePG requires an alternatively spliced isoform that has hardly been studied so far. Structure predictions showed that the segment in which the two isoforms differ lies in a region that folds into HEAT repeats. Mbc-PB thus appears to have a region that, compared with Mbc-PA, permits additional interactions. In addition, there appear to be several phosphorylation sites that are necessary for the inactivation of Mbc-PB. The kinase Wallenda was identified as a candidate responsible for the phosphorylation of Mbc-PB. Further experiments showed a cell-autonomous influence of Mbc-PB on ePG7, which indirectly affects the migration of the pioneer glia ePG9.
Abstract:
The 5' cap structure of trypanosomatid mRNAs, denoted cap 4, is a complex structure that contains unusual modifications on the first four nucleotides. We examined the four eukaryotic initiation factor 4E (eIF4E) homologues found in the Leishmania genome database. These proteins, denoted LeishIF4E-1 to LeishIF4E-4, are located in the cytoplasm. They show only a limited degree of sequence homology with known eIF4E isoforms and among themselves. However, computerized structure prediction suggests that the cap-binding pocket is conserved in each of the homologues, as confirmed by binding assays to m(7)GTP, cap 4, and its intermediates. LeishIF4E-1 and LeishIF4E-4 each bind m(7)GTP and cap 4 comparably well, and only these two proteins could interact with the mammalian eIF4E binding protein 4EBP1, though with different efficiencies. 4EBP1 is a translation repressor that competes with eIF4G for the same residues on eIF4E; thus, LeishIF4E-1 and LeishIF4E-4 are reasonable candidates for serving as translation factors. LeishIF4E-1 is more abundant in amastigotes and also contains a typical 3' untranslated region element that is found in amastigote-specific genes. LeishIF4E-2 bound mainly to cap 4 and comigrated with polysomal fractions on sucrose gradients. Since the consensus eIF4E is usually found in 48S complexes, LeishIF4E-2 could possibly be associated with the stabilization of trypanosomatid polysomes. LeishIF4E-3 bound mainly m(7)GTP, excluding its involvement in the translation of cap 4-protected mRNAs. It comigrates with 80S complexes which are resistant to micrococcal nuclease, but its function is yet unknown. None of the isoforms can functionally complement the Saccharomyces cerevisiae eIF4E, indicating that despite their structural conservation, they are considerably diverged.
Abstract:
Amyloids and prion proteins are clinically and biologically important beta-structures, whose supersecondary structures are difficult to determine by standard experimental or computational means. In addition, significant conformational heterogeneity is known or suspected to exist in many amyloid fibrils. Recent work has indicated the utility of pairwise probabilistic statistics in beta-structure prediction. We develop here a new strategy for beta-structure prediction, emphasizing the determination of beta-strands and pairs of beta-strands as fundamental units of beta-structure. Our program, BETASCAN, calculates likelihood scores for potential beta-strands and strand-pairs based on correlations observed in parallel beta-sheets. The program then determines the strands and pairs with the greatest local likelihood for all of the sequence's potential beta-structures. BETASCAN suggests multiple alternate folding patterns and assigns relative a priori probabilities based solely on amino acid sequence, probability tables, and pre-chosen parameters. The algorithm compares favorably with the results of previous algorithms (BETAPRO, PASTA, SALSA, TANGO, and Zyggregator) in beta-structure prediction and amyloid propensity prediction. Accurate prediction is demonstrated for experimentally determined amyloid beta-structures, for a set of known beta-aggregates, and for the parallel beta-strands of beta-helices, amyloid-like globular proteins. BETASCAN is able both to detect beta-strands with higher sensitivity and to detect the edges of beta-strands in a richly beta-like sequence. For two proteins (Abeta and Het-s), there exist multiple sets of experimental data implying contradictory structures; BETASCAN is able to detect each competing structure as a potential structure variant. The ability to correlate multiple alternate beta-structures to experiment opens the possibility of computational investigation of prion strains and structural heterogeneity of amyloid. 
BETASCAN is publicly accessible on the Web at http://betascan.csail.mit.edu.
Abstract:
Essential biological processes are governed by organized, dynamic interactions between multiple biomolecular systems. Complexes are thus formed to enable the biological function and are disassembled once the process is completed. Examples of such processes include the translation of messenger RNA into protein by the ribosome, the folding of proteins by chaperonins, and the entry of viruses into host cells. Understanding these fundamental processes by characterizing the molecular mechanisms that enable them would allow the (better) design of therapies and drugs. Such molecular mechanisms may be revealed through the structural elucidation of the biomolecular assemblies at the core of these processes. Various experimental techniques may be applied to investigate the molecular architecture of biomolecular assemblies. High-resolution techniques, such as X-ray crystallography, may solve the atomic structure of the system, but are typically constrained to biomolecules of reduced flexibility and dimensions. In particular, X-ray crystallography requires the sample to form a three-dimensional (3D) crystal lattice, which is technically difficult, if not impossible, to obtain, especially for large, dynamic systems. Often these techniques solve the structure of the different constituent components within the assembly, but encounter difficulties when investigating the entire system. On the other hand, imaging techniques, such as cryo-electron microscopy (cryo-EM), are able to depict large systems in a near-native environment, without requiring the formation of crystals. The structures solved by cryo-EM cover a wide range of resolutions, from a very low level of detail, where only the overall shape of the system is visible, to high resolutions that approach, but do not yet reach, an atomic level of detail.
In this dissertation, several modeling methods are introduced either to integrate cryo-EM datasets with structural data from X-ray crystallography or to interpret the cryo-EM reconstruction directly. These computational techniques were developed with the goal of creating an atomic model for the cryo-EM data. Low-resolution reconstructions lack the level of detail needed for a direct atomic interpretation, i.e. one cannot reliably locate the atoms or amino-acid residues within the structure obtained by cryo-EM. One therefore needs to consider additional information, for example structural data from other sources such as X-ray crystallography, to enable such a high-resolution interpretation. Modeling techniques are thus developed to integrate the structural data from the different biophysical sources; examples include the work described in manuscripts I and II of this dissertation. At intermediate and high resolution, cryo-EM reconstructions depict consistent 3D folds such as tubular features, which in general correspond to alpha-helices. Such features can be annotated and later used to build the atomic model of the system; manuscript III describes this alternative. Three manuscripts are presented as part of the PhD dissertation, each introducing a computational technique that facilitates the interpretation of cryo-EM reconstructions. The first manuscript is an application paper that describes a heuristic to generate the atomic model for the protein envelope of the Rift Valley fever virus. The second manuscript introduces the evolutionary tabu search strategies to enable the integration of multiple component atomic structures with the cryo-EM map of their assembly. Finally, the third manuscript further develops the latter technique and applies it to annotate consistent 3D patterns in intermediate-resolution cryo-EM reconstructions.
The first manuscript, titled "An assembly model for Rift Valley fever virus", was submitted for publication in the Journal of Molecular Biology. The cryo-EM structure of the Rift Valley fever virus was previously solved at 27 Å resolution by Dr. Freiberg and collaborators. This reconstruction shows the overall shape of the virus envelope, yet the reduced level of detail prevents a direct atomic interpretation. High-resolution structures are not yet available for the entire virus, nor for the two different component glycoproteins that form its envelope. However, homology models may be generated for these glycoproteins based on similar structures that are available at atomic resolution. The manuscript presents the steps required to identify an atomic model of the entire virus envelope, based on the low-resolution cryo-EM map of the envelope and the homology models of the two glycoproteins. Starting with the results of an exhaustive search to place the two glycoproteins, the model is built iteratively by running multiple multi-body refinements to hierarchically generate models for the different regions of the envelope. The generated atomic model is supported by prior knowledge regarding virus biology and contains valuable information about the molecular architecture of the system. It provides the basis for further investigations seeking to reveal different processes in which the virus is involved, such as assembly or fusion. The second manuscript was recently published in the Journal of Structural Biology (doi:10.1016/j.jsb.2009.12.028) under the title "Evolutionary tabu search strategies for the simultaneous registration of multiple atomic structures in cryo-EM reconstructions". This manuscript introduces the evolutionary tabu search strategies applied to enable a multi-body registration. This technique is a hybrid approach that combines a genetic algorithm with a tabu search strategy to promote the proper exploration of the high-dimensional search space.
As with the Rift Valley fever virus, it is common that the structure of a large multi-component assembly is available at low resolution from cryo-EM, while high-resolution structures are solved for the different components but are lacking for the entire system. Evolutionary tabu search strategies enable the building of an atomic model for the entire system by considering the different components simultaneously. Such registration indirectly introduces spatial constraints, as all components need to be placed within the assembly, enabling proper docking in the low-resolution map of the entire assembly. Along with the method description, the manuscript covers the validation, presenting the benefit of the technique in both synthetic and experimental test cases. The approach successfully docked multiple components at resolutions as low as 40 Å. The third manuscript is entitled "Evolutionary Bidirectional Expansion for the Annotation of Alpha Helices in Electron Cryo-Microscopy Reconstructions" and was submitted for publication in the Journal of Structural Biology. The modeling approach described in this manuscript applies the evolutionary tabu search strategies in combination with the bidirectional expansion to annotate secondary structure elements in intermediate-resolution cryo-EM reconstructions. In particular, secondary structure elements such as alpha helices show consistent patterns in cryo-EM data, and are visible as rod-like patterns of high density. The evolutionary tabu search strategy is applied to identify the placement of the different alpha helices, while the bidirectional expansion characterizes their length and curvature. The manuscript presents the validation of the approach at resolutions ranging between 6 and 14 Å, a level of detail where alpha helices are visible. Up to a resolution of 12 Å, the method measures sensitivities between 70 and 100% as estimated in experimental test cases, i.e. 70-100% of the alpha-helices in the experimental data were correctly predicted in an automatic manner.
The three manuscripts presented in this PhD dissertation cover different computational methods for the integration and interpretation of cryo-EM reconstructions. The methods were developed in the molecular modeling software Sculptor (http://sculptor.biomachina.org) and are available to the scientific community interested in the multi-resolution modeling of cryo-EM data. The work spans a wide range of resolutions, covering multi-body refinement and registration at low resolution along with annotation of consistent patterns at high resolution. Such methods are essential for the modeling of cryo-EM data, and may be applied in other fields where similar spatial problems are encountered, such as medical imaging.
Abstract:
The hierarchical properties of potential energy landscapes have been used to gain insight into thermodynamic and kinetic properties of protein ensembles. It also may be possible to use them to direct computational searches for thermodynamically stable macroscopic states, i.e., computational protein folding. To this end, we have developed a top-down search procedure in which conformation space is recursively dissected according to the intrinsic hierarchical structure of a landscape's effective-energy barriers. This procedure generates an inverted tree similar to the disconnectivity graphs generated by local minima-clustering methods, but it fundamentally differs in the manner in which the portion of the tree that is to be computationally explored is selected. A key ingredient is a branch-selection algorithm that takes advantage of statistically predictive properties of the landscape to guide searches down the tree branches that are most likely to lead to the physically relevant macroscopic states. Using the computational folding of a β-hairpin-forming peptide as an example, we show that such predictive properties indeed exist and can be used for structure prediction by free-energy global minimization.
Abstract:
Computational Biology has developed algorithms applied to relevant problems in Biology. One of these problems is Protein Structure Prediction (PSP). Several methods have been developed in the literature to address this problem. However, reproducing results and comparing them has not been an easy task. In this regard, the Critical Assessment of protein Structure Prediction (CASP) counts carrying out such comparisons among its goals. In addition, the systems developed for this problem generally lack a user-friendly interface, which discourages use by those who are not computing specialists. To reduce these difficulties, this work proposes Koala, a web-based system that integrates several methods for protein structure prediction and analysis, enabling complex experiments to be run through workflows. The available prediction methods can be integrated to analyze results using the RMSD, GDT-TS, or TM-Score metrics. In addition, the Sort by front dominance method (based on the Pareto optimality criterion), proposed in this work, can evaluate predictions without a reference structure. The results obtained, using target proteins from recent articles and from CASP11, indicate that Koala can carry out a relatively large set of structured experiments, benefiting the determination of better protein structures as well as the development of new approaches to prediction and analysis through workflows.
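The idea of ranking predictions by Pareto dominance, without a reference structure, can be illustrated with a generic non-dominated sort. This is only a minimal sketch and not Koala's implementation: the function names, the use of two energy-like scores per model, and the convention that lower is better on every objective are all assumptions made for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (assumption: lower values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def sort_by_fronts(scores: List[Sequence[float]]) -> List[List[int]]:
    """Group model indices into successive Pareto fronts.

    Front 0 holds the models that no other model dominates; front 1 holds
    the models that are non-dominated once front 0 is removed, and so on.
    """
    remaining = set(range(len(scores)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical (score 1, score 2) pairs for four predicted models.
example_scores = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (4.0, 6.0)]
example_fronts = sort_by_fronts(example_scores)
```

With these made-up scores, models 0 and 1 form the first front because neither is beaten on both objectives at once; model 2 is dominated by model 1, and model 3 is dominated by every other model.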
Abstract:
The bacterial chaperonin GroEL, together with its co-chaperonin GroES, facilitates the folding of a variety of polypeptides. Experiments suggest that GroEL stimulates protein folding by multiple cycles of binding and release. Misfolded proteins first bind to an exposed hydrophobic surface on GroEL. GroES then encapsulates the substrate and triggers its release into the central cavity of the GroEL/ES complex for folding. In this work, we investigate the possibility of facilitating protein folding in molecular dynamics simulations by mimicking the effects of GroEL/ES, namely repeated binding and release together with spatial confinement. During the binding stage, the (metastable) partially folded proteins are allowed to attach spontaneously to a hydrophobic surface within the simulation box. This destabilizes the structures, which are then transferred into a spatially confined cavity for folding. The approach has been tested by attempting to refine protein structural models generated using the ROSETTA procedure for ab initio structure prediction. Dramatic improvements in the deviation of protein models from the corresponding experimental structures were observed. The results suggest that the primary effects of the GroEL/ES system can be mimicked in a simple coarse-grained manner and used to facilitate protein folding in molecular dynamics simulations. Furthermore, the results support the assumption that the spatial confinement in GroEL/ES assists the folding of encapsulated proteins.
Abstract:
Purple acid phosphatases are a family of binuclear metallohydrolases that have been identified in plants, animals and fungi. Only one isoform of ~35 kDa has been isolated from animals, where it is associated with bone resorption and microbial killing through its phosphatase activity and hydroxyl radical production, respectively. Using the sensitive PSI-BLAST search method, sequences representing new purple acid phosphatase-like proteins have been identified in mammals, insects and nematodes. These new putative isoforms are closely related to the ~55 kDa purple acid phosphatase characterized from plants. Secondary structure prediction of the new human isoform further confirms its similarity to a purple acid phosphatase from the red kidney bean. A structural model for the human enzyme was constructed based on the red kidney bean purple acid phosphatase structure. This model shows that the catalytic centre observed in other purple acid phosphatases is also present in this new isoform. These observations suggest that the sequences identified in this study represent a novel subfamily of plant-like purple acid phosphatases in animals and humans. (c) 2006 Elsevier B.V. All rights reserved.
Abstract:
Hydrophobins are small (~100 aa) proteins that have an important role in the growth and development of mycelial fungi. They are surface active and, after secretion by the fungi, self-assemble into amphipathic membranes at hydrophobic/hydrophilic interfaces, reversing the hydrophobicity of the surface. In this study, molecular dynamics simulation techniques have been used to model the process by which a specific class I hydrophobin, SC3, binds to a range of hydrophobic/hydrophilic interfaces. The structure of SC3 used in this investigation was modeled based on the crystal structure of the class II hydrophobin HFBII using the assumption that the disulfide pairings of the eight conserved cysteine residues are maintained. The proposed model for SC3 in aqueous solution is compact and globular, containing primarily β-strand and coil structures. The behavior of this model of SC3 was investigated at an air/water, an oil/water, and a hydrophobic solid/water interface. It was found that SC3 preferentially binds to the interfaces via the loop region between the third and fourth cysteine residues and that binding is associated with an increase in α-helix formation, in qualitative agreement with experiment. Based on a combination of the available experimental data and the current simulation studies, we propose a possible model for SC3 self-assembly on a hydrophobic solid/water interface.
Abstract:
Background: The residue-wise contact order (RWCO) describes the sequence separations between a residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. Results: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and a root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences.
Moreover, by incorporating global features such as molecular weight and amino acid composition, we could further improve the prediction performance, raising the CC to 0.57 and lowering the RMSE to 0.79. In addition, incorporating the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and yielded the best prediction accuracy, with a CC of 0.60 and an RMSE of 0.78, which is at least comparable to the performance of other existing methods. Conclusion: The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the view that support vector regression is a powerful tool for extracting protein sequence-structure relationships and for estimating protein structural profiles from amino acid sequences.
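The quantity being predicted, and the two evaluation metrics quoted above, can be sketched in a few lines. This is an illustrative sketch only: the 8 Å contact cutoff, the minimum sequence separation of 3, the normalization by chain length, and all function names are assumptions, and the abstract does not state which conventions the authors used.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def residue_wise_contact_order(
    coords: Sequence[Coord],
    cutoff: float = 8.0,      # assumed contact distance, in angstroms
    min_separation: int = 3,  # assumed: skip residues close in sequence
) -> List[float]:
    """For each residue i, sum |i - j| over its contacting residues j,
    then divide by the chain length (assumed normalization)."""
    n = len(coords)
    rwco = []
    for i in range(n):
        total = 0
        for j in range(n):
            separation = abs(i - j)
            if separation >= min_separation and math.dist(coords[i], coords[j]) < cutoff:
                total += separation
        rwco.append(total / n)
    return rwco


def pearson_cc(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between predicted and observed values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rmse(x: Sequence[float], y: Sequence[float]) -> float:
    """Root mean square error between predicted and observed values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Toy chain: four residues on the corners of a 4 Å square, so only the
# first and last residues are both in contact and 3 apart in sequence.
toy_coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (4.0, 4.0, 0.0), (0.0, 4.0, 0.0)]
toy_rwco = residue_wise_contact_order(toy_coords, cutoff=5.0)
```

In the toy chain, residues 0 and 3 each receive 3/4 = 0.75 and the two middle residues receive 0, since their only contacts are too close in sequence to count.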
Abstract:
In this study, we propose a novel method to predict the solvent accessible surface areas of transmembrane residues. For both transmembrane alpha-helix and beta-barrel residues, the correlation coefficients between the predicted and observed accessible surface areas are around 0.65. On the basis of predicted accessible surface areas, residues exposed to the lipid environment or buried inside a protein can be identified by using certain cutoff thresholds. We have extensively examined our approach based on different definitions of accessible surface areas and a variety of sets of control parameters. Given that experimentally determining the structures of membrane proteins is very difficult and membrane proteins are actually abundant in nature, our approach is useful for theoretically modeling membrane protein tertiary structures, particularly for modeling the assembly of transmembrane domains. This approach can be used to annotate the membrane proteins in proteomes to provide extra structural and functional information.
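The thresholding step mentioned above, where predicted accessibilities are turned into exposed or buried labels, might look like the following. This is a hedged sketch: the use of relative accessibility on a 0 to 1 scale, the 0.25 cutoff, and the function name are assumptions for illustration, since the abstract does not give the actual thresholds.

```python
from typing import List, Sequence


def label_burial(predicted_rsa: Sequence[float], cutoff: float = 0.25) -> List[str]:
    """Label a residue 'exposed' if its predicted relative accessible
    surface area is at or above the cutoff, otherwise 'buried'."""
    return ["exposed" if value >= cutoff else "buried" for value in predicted_rsa]


# Hypothetical predicted relative accessibilities for four transmembrane residues
# (0 = fully buried, 1 = fully exposed).
example_labels = label_burial([0.05, 0.40, 0.25, 0.10])
```

Varying the cutoff trades off how many residues are called lipid-exposed against how many are called buried, which is why the choice of threshold matters when the labels feed into models of transmembrane domain assembly.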
Abstract:
Background: Designing novel proteins with site-directed recombination has enormous prospects. By locating effective recombination sites for swapping sequence parts, the probability that hybrid sequences have the desired properties is increased dramatically. The prohibitive requirements for applying current tools led us to investigate machine learning to assist in finding useful recombination sites from amino acid sequence alone. Results: We present STAR, Site Targeted Amino acid Recombination predictor, which produces a score indicating the structural disruption caused by recombination for each position in an amino acid sequence. Example predictions, contrasted with those of alternative tools, illustrate STAR's utility in determining useful recombination sites. Overall, the correlation coefficient between the output of the experimentally validated protein design algorithm SCHEMA and the prediction of STAR is very high (0.89). Conclusion: STAR allows the user to explore useful recombination sites in amino acid sequences with unknown structure and unknown evolutionary origin. The predictor service is available from http://pprowler.itee.uq.edu.au/star.
Abstract:
DNA methylation appears to be involved in the regulation of gene expression. Transcriptionally inactive (silenced) genes normally contain a high proportion of 5-methyl-2'-deoxycytosine residues, whereas transcriptionally active genes show much reduced levels. There appears to be good reason to believe that chemical agents capable of methylating 2'-deoxycytosine might affect gene expression and that, as a result of hypermethylating promoter regions of cytosine-guanine rich oncogenic sequences, cancer-related genes may be silenced. This thesis describes the synthesis of a number of 'electrophilic' S-methylsulphonium compounds, assesses their ability to methylate cytosine at position 5, and considers their potential as cytotoxic agents. DNA is methylated in vivo by DNA methyltransferase utilising S-adenosylmethionine as the methyl donor. This thesis addresses the theory that S-adenosylmethionine may be replaced as the methyl donor for DNA methyltransferase by other sulphonium compounds. S-[3H-methyl]methionine sulphonium iodide was synthesised, and experiments to assess the ability of this compound to transfer methyl groups to cytosine in the presence of DNA methyltransferase were unsuccessful. A proline residue adjacent to a cysteine residue has been identified as a highly conserved feature of the active site region of a large number of prokaryotic DNA methyltransferases. The thesis examines the possibility that short peptides containing the Pro-Cys fragment may be able to facilitate the alkylation of cytosine position 5 by sulphonium compounds. Peptides were synthesised up to 9 amino acids in length but none were shown to exhibit significant activity. Molecular modelling techniques, including Chem-X, Quanta, BIPED and protein structure prediction programs, were used to assess any structural similarities that may exist between short peptides containing a Pro-Cys fragment and similar sequences present in proteins.
A number of similar structural features were observed.
Abstract:
MOTIVATION: G protein-coupled receptors (GPCRs) play an important role in many physiological systems by transducing an extracellular signal into an intracellular response. Over 50% of all marketed drugs are targeted towards a GPCR. There is considerable interest in developing an algorithm that could effectively predict the function of a GPCR from its primary sequence. Such an algorithm is useful not only in identifying novel GPCR sequences but in characterizing the interrelationships between known GPCRs. RESULTS: An alignment-free approach to GPCR classification has been developed using techniques drawn from data mining and proteochemometrics. A dataset of over 8000 sequences was constructed to train the algorithm. This represents one of the largest GPCR datasets currently available. A predictive algorithm was developed based upon the simplest reasonable numerical representation of the protein's physicochemical properties. A selective top-down approach was developed, which used a hierarchical classifier to assign sequences to subdivisions within the GPCR hierarchy. The predictive performance of the algorithm was assessed against several standard data mining classifiers and further validated against Support Vector Machine-based GPCR prediction servers. The selective top-down approach achieves significantly higher accuracy than standard data mining methods in almost all cases.
Abstract:
Accurate protein structure prediction remains an active objective of research in bioinformatics. Membrane proteins comprise approximately 20% of most genomes. They are, however, poorly tractable targets of experimental structure determination. Their analysis using bioinformatics thus makes an important contribution to their on-going study. Using a method based on Bayesian Networks, which provides a flexible and powerful framework for statistical inference, we have addressed the alignment-free discrimination of membrane from non-membrane proteins. The method successfully identifies prokaryotic and eukaryotic α-helical membrane proteins at 94.4% accuracy, β-barrel proteins at 72.4% accuracy, and distinguishes assorted non-membranous proteins with 85.9% accuracy. The method here is an important potential advance in the computational analysis of membrane protein structure. It represents a useful tool for the characterisation of membrane proteins with a wide variety of potential applications.