11 resultados para program structure
em Indian Institute of Science - Bangalore - Índia
Resumo:
As the gap between processor and memory continues to grow Memory performance becomes a key performance bottleneck for many applications. Compilers therefore increasingly seek to modify an application’s data layout to improve cache locality and cache reuse. Whole program Structure Layout [WPSL] transformations can significantly increase the spatial locality of data and reduce the runtime of programs that use link-based data structures, by increasing the cache line utilization. However, in production compilers WPSL transformations do not realize the entire performance potential possible due to a number of factors. Structure layout decisions made on the basis of whole program aggregated affinity/hotness of structure fields, can be sub optimal for local code regions. WPSL is also restricted in applicability in production compilers for type unsafe languages like C/C++ due to the extensive legality checks and field sensitive pointer analysis required over the entire application. In order to overcome the issues associated with WPSL, we propose Region Based Structure Layout (RBSL) optimization framework, using selective data copying. We describe our RBSL framework, implemented in the production compiler for C/C++ on HP-UX IA-64. We show that acting in complement to the existing and mature WPSL transformation framework in our compiler, RBSL improves application performance in pointer intensive SPEC benchmarks ranging from 3% to 28% over WPSL
Resumo:
Background: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graphs representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data. Results: The structure learning task is cast as a sparse linear regression problem which is then posed as a LASSO (l1-constrained fitting) problem and solved finally by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs is assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisae cell cycle transcript profiling data sets capture known regulatory associations. In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law suggesting that its degree distributions is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification. Conclusion: A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence integrated computational – experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.
Resumo:
Using the link-link incidence matrix to represent a simple-jointed kinematic chain algebraic procedures have been developed to determine its structural characteristics such as the type of freedom of the chain, the number of distinct mechanisms and driving mechanisms that can be derived from the chain. A computer program incorporating these graph theory based procedures has been applied successfully for the structural analysis of several typical chains.
Resumo:
The structure of bovine prothrombin fragment 1 has been refined at 2.25 Å resolution using high resolution measurements made with the synchrotron beam at CHESS. The synchrotron data were collected photographically by oscillation methods (R-merge = 0.08). These were combined with lower order diffractometer data for refinement purposes. The structure was refined using restrained least-squares methods with the program PROLSQ to a crystallographic R-value of 0.175. The structure includes 105 water molecules with occupancies of >0·6. The first 35 residues (Ala1-Leu35) of the N-terminal ?-carboxy glutamic acid-domain (Ala1-Cys48) of fragment 1 are disordered as are two carbohydrate chains of Mr ? 5000; the latter two combine to render 40% of the structure disordered. The folding of the kringle of fragment 1 is related to the close intramolecular contact between the inner loop disulfide groups. Half of the conserved sequence of the kringle forms an inner core surrounding these disulfide groups. The remainder of the sequence conservation is associated with the many turns of the main chain. The Pro95 residue of the kringle has a cis conformation and Tyr74 is ordered in fragment 1, although nuclear magnetic resonance studies indicate that the comparable residue of plasminogen kringle 4 has two positions. Surface accessibility calculations indicate that none of the disulfide groups of fragment 1 is accessible to solvent.
Resumo:
A detailed analysis of structural and position dependent characteristic features of helices will give a better understanding of the secondary structure formation in globular proteins. Here we describe an algorithm that quantifies the geometry of helices in proteins on the basis of their C-alpha atoms alone. The Fortran program HELANAL can extract the helices from the PDB files and then characterises the overall geometry of each helix as being linear, curved or kinked, in terms of its local structural features, viz. local helical twist and rise, virtual torsion angle, local helix origins and bending angles between successive local helix axes. Even helices with large radius of curvature are unambiguously identified as being linear or curved. The program can also be used to differentiate a kinked helix and other motifs, such as helix-loop-helix or a helix-turn-helix (with a single residue linker) with the help of local bending angles. In addition to these, the program can also be used to characterise the helix start and end as well as other types of secondary structures.
Resumo:
Structural alignments are the most widely used tools for comparing proteins with low sequence similarity. The main contribution of this paper is to derive various kernels on proteins from structural alignments, which do not use sequence information. Central to the kernels is a novel alignment algorithm which matches substructures of fixed size using spectral graph matching techniques. We derive positive semi-definite kernels which capture the notion of similarity between substructures. Using these as base more sophisticated kernels on protein structures are proposed. To empirically evaluate the kernels we used a 40% sequence non-redundant structures from 15 different SCOP superfamilies. The kernels when used with SVMs show competitive performance with CE, a state of the art structure comparison program.
Resumo:
The region spanning residues 95-146 of the rotavirus nonstructural protein NSP4 from the asymptomatic human strain ST3 has been purified and crystallized and diffraction data have been collected to a resolution of 2.6 angstrom. Several attempts to solve the structure by the molecular-replacement method using the available tetrameric structures of this domain were unsuccessful despite a sequence identity of 73% to the already known structures. A more systematic approach with a dimer as the search model led to an unexpected pentameric structure using the program Phaser. The various steps involved in arriving at this molecular-replacement solution, which unravelled a case of subtle variation between different oligomeric states unknown at the time of solving the structure, are presented in this paper.
Resumo:
The rapidly growing structure databases enhance the probability of finding identical sequences sharing structural similarity. Structure prediction methods are being used extensively to abridge the gap between known protein sequences and the solved structures which is essential to understand its specific biochemical and cellular functions. In this work, we plan to study the ambiguity between sequence-structure relationships and examine if sequentially identical peptide fragments adopt similar three-dimensional structures. Fragments of varying lengths (five to ten residues) were used to observe the behavior of sequence and its three-dimensional structures. The STAMP program was used to superpose the three-dimensional structures and the two parameters (Sequence Structure Similarity Score (Sc) and Root Mean Square Deviation value) were employed to classify them into three categories: similar, intermediate and dissimilar structures. Furthermore, the same approach was carried out on all the three-dimensional protein structures solved in the two organisms, Mycobacterium tuberculosis and Plasmodium falciparum to validate our results.
Resumo:
Network theory applied to protein structures provides insights into numerous problems of biological relevance. The explosion in structural data available from PDB and simulations establishes a need to introduce a standalone-efficient program that assembles network concepts/parameters under one hood in an automated manner. Herein, we discuss the development/application of an exhaustive, user-friendly, standalone program package named PSN-Ensemble, which can handle structural ensembles generated through molecular dynamics (MD) simulation/NMR studies or from multiple X-ray structures. The novelty in network construction lies in the explicit consideration of side-chain interactions among amino acids. The program evaluates network parameters dealing with topological organization and long-range allosteric communication. The introduction of a flexible weighing scheme in terms of residue pairwise cross-correlation/interaction energy in PSN-Ensemble brings in dynamical/chemical knowledge into the network representation. Also, the results are mapped on a graphical display of the structure, allowing an easy access of network analysis to a general biological community. The potential of PSN-Ensemble toward examining structural ensemble is exemplified using MD trajectories of an ubiquitin-conjugating enzyme (UbcH5b). Furthermore, insights derived from network parameters evaluated using PSN-Ensemble for single-static structures of active/inactive states of 2-adrenergic receptor and the ternary tRNA complexes of tyrosyl tRNA synthetases (from organisms across kingdoms) are discussed. PSN-Ensemble is freely available from http://vishgraph.mbu.iisc.ernet.in/PSN-Ensemble/psn_index.html.
Resumo:
Pyridoxal kinase (PdxK; EC 2.7.1.35) belongs to the phosphotransferase family of enzymes and catalyzes the conversion of the three active forms of vitamin B-6, pyridoxine, pyridoxal and pyridoxamine, to their phosphorylated forms and thereby plays a key role in pyridoxal 5 `-phosphate salvage. In the present study, pyridoxal kinase from Salmonella typhimurium was cloned and overexpressed in Escherichia coli, purified using Ni-NTA affinity chromatography and crystallized. X-ray diffraction data were collected to 2.6 angstrom resolution at 100 K. The crystal belonged to the primitive orthorhombic space group P2(1)2(1)2(1), with unitcell parameters a = 65.11, b = 72.89, c = 107.52 angstrom. The data quality obtained by routine processing was poor owing to the presence of strong diffraction rings caused by a polycrystalline material of an unknown small molecule in all oscillation images. Excluding the reflections close to powder/polycrystalline rings provided data of sufficient quality for structure determination. A preliminary structure solution has been obtained by molecular replacement with the Phaser program in the CCP4 suite using E. coli pyridoxal kinase (PDB entry 2ddm) as the phasing model. Further refinement and analysis of the structure are likely to provide valuable insights into catalysis by pyridoxal kinases.
Resumo:
Identification and analysis of nonbonded interactions within a molecule and with the surrounding molecules are an essential part of structural studies, given the importance of these interactions in defining the structure and function of any supramolecular entity. MolBridge is an easy to use algorithm based purely on geometric criteria that can identify all possible nonbonded interactions, such as hydrogen bond, halogen bond, cation-pi, pi-pi and van der Waals, in small molecules as well as biomolecules. The user can either upload three-dimensional coordinate files or enter the molecular ID corresponding to the relevant database. The program is available in a standalone form and as an interactive web server with Jmol and JME incorporated into it. The program is freely downloadable and the web server version is also available at http://nucleix.mbu.iisc.ernet.in/molbridge/index.php.