946 resultados para WEB SERVER
Resumo:
Background Predicting protein subnuclear localization is a challenging problem. Some previous works based on non-sequence information including Gene Ontology annotations and kernel fusion have respective limitations. The aim of this work is twofold: one is to propose a novel individual feature extraction method; another is to develop an ensemble method to improve prediction performance using comprehensive information represented in the form of high dimensional feature vector obtained by 11 feature extraction methods. Methodology/Principal Findings A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It only considers those feature extraction methods based on amino acid classifications and physicochemical properties. In order to speed up our system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: Lei dataset, multi-localization dataset, SNL9 dataset and a new independent dataset. The overall accuracy of prediction for 6 localizations on Lei dataset is 75.2% and that for 9 localizations on SNL9 dataset is 72.1% in the leave-one-out cross validation, 71.7% for the multi-localization dataset and 69.8% for the new independent dataset, respectively. Comparisons with those existing methods show that our method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis. Conclusions It can be concluded that our method is effective and valuable for predicting protein subnuclear localizations. A web server has been designed to implement the proposed method. It is freely available at http://bioinformatics.awowshop.com/snlpred_page.php.
Resumo:
Lattice-based cryptographic primitives are believed to offer resilience against attacks by quantum computers. We demonstrate the practicality of post-quantum key exchange by constructing cipher suites for the Transport Layer Security (TLS) protocol that provide key exchange based on the ring learning with errors (R-LWE) problem, we accompany these cipher suites with a rigorous proof of security. Our approach ties lattice-based key exchange together with traditional authentication using RSA or elliptic curve digital signatures: the post-quantum key exchange provides forward secrecy against future quantum attackers, while authentication can be provided using RSA keys that are issued by today's commercial certificate authorities, smoothing the path to adoption. Our cryptographically secure implementation, aimed at the 128-bit security level, reveals that the performance price when switching from non-quantum-safe key exchange is not too high. With our R-LWE cipher suites integrated into the Open SSL library and using the Apache web server on a 2-core desktop computer, we could serve 506 RLWE-ECDSA-AES128-GCM-SHA256 HTTPS connections per second for a 10 KiB payload. Compared to elliptic curve Diffie-Hellman, this means an 8 KiB increased handshake size and a reduction in throughput of only 21%. This demonstrates that provably secure post-quantum key-exchange can already be considered practical.
Resumo:
Membrane proteins play important roles in many biochemical processes and are also attractive targets of drug discovery for various diseases. The elucidation of membrane protein types provides clues for understanding the structure and function of proteins. Recently we developed a novel system for predicting protein subnuclear localizations. In this paper, we propose a simplified version of our system for predicting membrane protein types directly from primary protein structures, which incorporates amino acid classifications and physicochemical properties into a general form of pseudo-amino acid composition. In this simplified system, we will design a two-stage multi-class support vector machine combined with a two-step optimal feature selection process, which proves very effective in our experiments. The performance of the present method is evaluated on two benchmark datasets consisting of five types of membrane proteins. The overall accuracies of prediction for five types are 93.25% and 96.61% via the jackknife test and independent dataset test, respectively. These results indicate that our method is effective and valuable for predicting membrane protein types. A web server for the proposed method is available at http://www.juemengt.com/jcc/memty_page.php
Resumo:
Repeats are two or more contiguous segments of amino acid residues that are believed to have arisen as a result of intragenic duplication, recombination and mutation events. These repeats can be utilized for protein structure prediction and can provide insights into the protein evolution and phylogenetic relationship. Therefore, to aid structural biologists and phylogeneticists in their research, a computing resource (a web server and a database), Repeats in Protein Sequences (RPS), has been created. Using RPS, users can obtain useful information regarding identical, similar and distant repeats (of varying lengths) in protein sequences. In addition, users can check the frequency of occurrence of the repeats in sequence databases such as the Genome Database, PIR and SWISS-PROT and among the protein sequences available in the Protein Data Bank archive. Furthermore, users can view the three-dimensional structure of the repeats using the Java visualization plug-in Jmol. The proposed computing resource can be accessed over the World Wide Web at http://bioserver1.physics.iisc.ernet.in/rps/.
Resumo:
With the immense growth in the number of available protein structures, fast and accurate structure comparison has been essential. We propose an efficient method for structure comparison, based on a structural alphabet. Protein Blocks (PBs) is a widely used structural alphabet with 16 pentapeptide conformations that can fairly approximate a complete protein chain. Thus a 3D structure can be translated into a 1D sequence of PBs. With a simple Needleman-Wunsch approach and a raw PB substitution matrix, PB-based structural alignments were better than many popular methods. iPBA web server presents an improved alignment approach using (i) specialized PB Substitution Matrices (SM) and (ii) anchor-based alignment methodology. With these developments, the quality of similar to 88% of alignments was improved. iPBA alignments were also better than DALI, MUSTANG and GANGSTA(+) in > 80% of the cases. The webserver is designed to for both pairwise comparisons and database searches. Outputs are given as sequence alignment and superposed 3D structures displayed using PyMol and Jmol. A local alignment option for detecting subs-structural similarity is also embedded. As a fast and efficient `sequence-based' structure comparison tool, we believe that it will be quite useful to the scientific community. iPBA can be accessed at http://www.dsimb.inserm.fr/dsimb_tools/ipba/.
Resumo:
Background: Sensitive remote homology detection and accurate alignments especially in the midnight zone of sequence similarity are needed for better function annotation and structural modeling of proteins. An algorithm, AlignHUSH for HMM-HMM alignment has been developed which is capable of recognizing distantly related domain families The method uses structural information, in the form of predicted secondary structure probabilities, and hydrophobicity of amino acids to align HMMs of two sets of aligned sequences. The effect of using adjoining column(s) information has also been investigated and is found to increase the sensitivity of HMM-HMM alignments and remote homology detection. Results: We have assessed the performance of AlignHUSH using known evolutionary relationships available in SCOP. AlignHUSH performs better than the best HMM-HMM alignment methods and is observed to be even more sensitive at higher error rates. Accuracy of the alignments obtained using AlignHUSH has been assessed using the structure-based alignments available in BaliBASE. The alignment length and the alignment quality are found to be appropriate for homology modeling and function annotation. The alignment accuracy is found to be comparable to existing methods for profile-profile alignments. Conclusions: A new method to align HMMs has been developed and is shown to have better sensitivity at error rates of 10% and above when compared to other available programs. The proposed method could effectively aid obtaining clues to functions of proteins of yet unknown function. A web-server incorporating the AlignHUSH method is available at http://crick.mbu.iisc.ernet.in/similar to alignhush/
Resumo:
A computational pipeline PocketAnnotate for functional annotation of proteins at the level of binding sites has been proposed in this study. The pipeline integrates three in-house algorithms for site-based function annotation: PocketDepth, for prediction of binding sites in protein structures; PocketMatch, for rapid comparison of binding sites and PocketAlign, to obtain detailed alignment between pair of binding sites. A novel scheme has been developed to rapidly generate a database of non-redundant binding sites. For a given input protein structure, putative ligand-binding sites are identified, matched in real time against the database and the query substructure aligned with the promising hits, to obtain a set of possible ligands that the given protein could bind to. The input can be either whole protein structures or merely the substructures corresponding to possible binding sites. Structure-based function annotation at the level of binding sites thus achieved could prove very useful for cases where no obvious functional inference can be obtained based purely on sequence or fold-level analyses. An attempt has also been made to analyse proteins of no known function from Protein Data Bank. PocketAnnotate would be a valuable tool for the scientific community and contribute towards structure-based functional inference. The web server can be freely accessed at http://proline.biochem.iisc.ernet.in/pocketannotate/.
Resumo:
Protein structure space is believed to consist of a finite set of discrete folds, unlike the protein sequence space which is astronomically large, indicating that proteins from the available sequence space are likely to adopt one of the many folds already observed. In spite of extensive sequence-structure correlation data, protein structure prediction still remains an open question with researchers having tried different approaches (experimental as well as computational). One of the challenges of protein structure prediction is to identify the native protein structures from a milieu of decoys/models. In this work, a rigorous investigation of Protein Structure Networks (PSNs) has been performed to detect native structures from decoys/ models. Ninety four parameters obtained from network studies have been optimally combined with Support Vector Machines (SVM) to derive a general metric to distinguish decoys/models from the native protein structures with an accuracy of 94.11%. Recently, for the first time in the literature we had shown that PSN has the capability to distinguish native proteins from decoys. A major difference between the present work and the previous study is to explore the transition profiles at different strengths of non-covalent interactions and SVM has indeed identified this as an important parameter. Additionally, the SVM trained algorithm is also applied to the recent CASP10 predicted models. The novelty of the network approach is that it is based on general network properties of native protein structures and that a given model can be assessed independent of any reference structure. Thus, the approach presented in this paper can be valuable in validating the predicted structures. A web-server has been developed for this purpose and is freely available at http://vishgraph.mbu.iisc.ernet.in/GraProStr/PSN-QA.html.
Resumo:
The increasing number of available protein structures requires efficient tools for multiple structure comparison. Indeed, multiple structural alignments are essential for the analysis of function, evolution and architecture of protein structures. For this purpose, we proposed a new web server called multiple Protein Block Alignment (mulPBA). This server implements a method based on a structural alphabet to describe the backbone conformation of a protein chain in terms of dihedral angles. This sequence-like' representation enables the use of powerful sequence alignment methods for primary structure comparison, followed by an iterative refinement of the structural superposition. This approach yields alignments superior to most of the rigid-body alignment methods and highly comparable with the flexible structure comparison approaches. We implement this method in a web server designed to do multiple structure superimpositions from a set of structures given by the user. Outputs are given as both sequence alignment and superposed 3D structures visualized directly by static images generated by PyMol or through a Jmol applet allowing dynamic interaction. Multiple global quality measures are given. Relatedness between structures is indicated by a distance dendogram. Superimposed structures in PDB format can be also downloaded, and the results are quickly obtained. mulPBA server can be accessed at www.dsimb.inserm.fr/dsimb_tools/mulpba/.
Resumo:
Elasticity in cloud systems provides the flexibility to acquire and relinquish computing resources on demand. However, in current virtualized systems resource allocation is mostly static. Resources are allocated during VM instantiation and any change in workload leading to significant increase or decrease in resources is handled by VM migration. Hence, cloud users tend to characterize their workloads at a coarse grained level which potentially leads to under-utilized VM resources or under performing application. A more flexible and adaptive resource allocation mechanism would benefit variable workloads, such as those characterized by web servers. In this paper, we present an elastic resources framework for IaaS cloud layer that addresses this need. The framework provisions for application workload forecasting engine, that predicts at run-time the expected demand, which is input to the resource manager to modulate resource allocation based on the predicted demand. Based on the prediction errors, resources can be over-allocated or under-allocated as compared to the actual demand made by the application. Over-allocation leads to unused resources and under allocation could cause under performance. To strike a good trade-off between over-allocation and under-performance we derive an excess cost model. In this model excess resources allocated are captured as over-allocation cost and under-allocation is captured as a penalty cost for violating application service level agreement (SLA). Confidence interval for predicted workload is used to minimize this excess cost with minimal effect on SLA violations. An example case-study for an academic institute web server workload is presented. Using the confidence interval to minimize excess cost, we achieve significant reduction in resource allocation requirement while restricting application SLA violations to below 2-3%.
Resumo:
Identification and analysis of nonbonded interactions within a molecule and with the surrounding molecules are an essential part of structural studies, given the importance of these interactions in defining the structure and function of any supramolecular entity. MolBridge is an easy to use algorithm based purely on geometric criteria that can identify all possible nonbonded interactions, such as hydrogen bond, halogen bond, cation-pi, pi-pi and van der Waals, in small molecules as well as biomolecules. The user can either upload three-dimensional coordinate files or enter the molecular ID corresponding to the relevant database. The program is available in a standalone form and as an interactive web server with Jmol and JME incorporated into it. The program is freely downloadable and the web server version is also available at http://nucleix.mbu.iisc.ernet.in/molbridge/index.php.
Resumo:
The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as Protein Blocks (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. It is used to encode 3D protein structures into 1D PB sequences and to capture sequence to structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods or explicit incorporation of hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales-up well and offers promising perspectives for structural annotations at genomic level. It has been implemented in the form of a web-server that is freely available at http://www.bo-protscience.fr/forsa.
Resumo:
Background: Understanding channel structures that lead to active sites or traverse the molecule is important in the study of molecular functions such as ion, ligand, and small molecule transport. Efficient methods for extracting, storing, and analyzing protein channels are required to support such studies. Further, there is a need for an integrated framework that supports computation of the channels, interactive exploration of their structure, and detailed visual analysis of their properties. Results: We describe a method for molecular channel extraction based on the alpha complex representation. The method computes geometrically feasible channels, stores both the volume occupied by the channel and its centerline in a unified representation, and reports significant channels. The representation also supports efficient computation of channel profiles that help understand channel properties. We describe methods for effective visualization of the channels and their profiles. These methods and the visual analysis framework are implemented in a software tool, CHEXVIS. We apply the method on a number of known channel containing proteins to extract pore features. Results from these experiments on several proteins show that CHEXVIS performance is comparable to, and in some cases, better than existing channel extraction techniques. Using several case studies, we demonstrate how CHEXVIS can be used to study channels, extract their properties and gain insights into molecular function. Conclusion: CHEXVIS supports the visual exploration of multiple channels together with their geometric and physico-chemical properties thereby enabling the understanding of the basic biology of transport through protein channels. The CHEXVIS web-server is freely available at http://vgl.serc.iisc.ernet.in/chexvis/. The web-server is supported on all modern browsers with latest Java plug-in.
Resumo:
Secondary-structure elements (SSEs) play an important role in the folding of proteins. Identification of SSEs in proteins is a common problem in structural biology. A new method, ASSP (Assignment of Secondary Structure in Proteins), using only the path traversed by the C atoms has been developed. The algorithm is based on the premise that the protein structure can be divided into continuous or uniform stretches, which can be defined in terms of helical parameters, and depending on their values the stretches can be classified into different SSEs, namely -helices, 3(10)-helices, -helices, extended -strands and polyproline II (PPII) and other left-handed helices. The methodology was validated using an unbiased clustering of these parameters for a protein data set consisting of 1008 protein chains, which suggested that there are seven well defined clusters associated with different SSEs. Apart from -helices and extended -strands, 3(10)-helices and -helices were also found to occur in substantial numbers. ASSP was able to discriminate non--helical segments from flanking -helices, which were often identified as part of -helices by other algorithms. ASSP can also lead to the identification of novel SSEs. It is believed that ASSP could provide a better understanding of the finer nuances of protein secondary structure and could make an important contribution to the better understanding of comparatively less frequently occurring structural motifs. At the same time, it can contribute to the identification of novel SSEs. A standalone version of the program for the Linux as well as the Windows operating systems is freely downloadable and a web-server version is also available at .
Resumo:
Nesta dissertação foi desenvolvido o sistema SAQUA (Sistema para Análise da Qualidade das Águas Fluviais), que permite o acompanhamento dos dados de séries históricas de parâmetros físico-químicos para análise da qualidade de águas fluviais. A alimentação do sistema SAQUA se dá a partir do arquivo tipo texto gerado no Hidroweb, sistema de banco de dados hidrológicos da ANA (Agência Nacional de Águas), disponibilizado na internet. O SAQUA constitui uma interface que permite a análise espaço-temporal de parâmetros de qualidade da água específicos definidos pelo usuário. A interface foi construída utilizando o servidor de mapas Mapserver, as linguagens HTML e PHP, além de consultas SQL e o uso do servidor Web Apache. A utilização de uma linguagem dinâmica como o PHP permitiu usar recursos internos do Mapserver por meio de funções que interagem de forma mais flexível com códigos presentes e futuros, além de interagir com o código HTML. O Sistema apresenta como resultado a representação gráfica da série histórica por parâmetro e, em mapa, a localização das estações em análise também definidas pelo usuário, geralmente associadas a uma determinada região hidrográfica. Tanto na representação gráfica da série temporal quanto em mapa, são destacados a partir de código de cores a estação de monitoramento e a observação em que os limites estabelecidos na resolução CONAMA 357/05 não foi atendido. A classe de uso da resolução CONAMA que será usada na análise também pode ser definida pelo usuário. Como caso de estudo e demonstração das funções do SAQUA foi escolhida a bacia hidrográfica do rio Paraíba do Sul, localizada na região hidrográfica Atlântico Sudeste do Brasil. A aplicação do sistema demonstrou ótimos resultados e o potencial da ferramenta computacional como apoio ao planejamento e à gestão dos recursos hídricos. Ressalta-se ainda, que todo o sistema foi desenvolvido a partir de softwares disponibilizados segundo a licença GPL de software livre, ou seja, sem custo na aquisição de licenças, demonstrando o potencial da aplicação destas ferramentas no campo dos recursos hídricos.