975 results for Structure comparison
Abstract:
XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval. Various algorithms for comparing hierarchically structured data, XML documents in particular, have been proposed in the literature. Most of them make use of techniques for finding the edit distance between tree structures, XML documents being commonly modeled as Ordered Labeled Trees. Yet, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficiently addressed while comparing XML documents. In this paper, we provide an integrated and fine-grained comparison framework to deal with both structural and semantic similarities in XML documents (detecting the occurrences and repetitions of structurally and semantically similar sub-trees), and to allow the end-user to adjust the comparison process according to her requirements. Our framework consists of four main modules for (i) discovering the structural commonalities between sub-trees, (ii) identifying sub-tree semantic resemblances, (iii) computing tree-based edit operations costs, and (iv) computing tree edit distance. Experimental results demonstrate higher comparison accuracy with respect to alternative methods, while timing experiments reflect the impact of semantic similarity on overall system performance.
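As a rough illustration of the tree edit distance idea mentioned in this abstract, the following is a minimal sketch of a top-down (Selkow-style) edit distance between two XML documents treated as ordered labeled trees. It is not the framework described in the paper: the cost model (relabel costs 1, inserting or deleting a subtree costs its node count), the function names, and the two sample documents are all assumptions made for illustration, and no semantic similarity is taken into account.

```python
import xml.etree.ElementTree as ET


def subtree_size(node):
    """Number of nodes in the subtree rooted at `node`."""
    return 1 + sum(subtree_size(child) for child in node)


def tree_edit_distance(a, b):
    """Top-down (Selkow-style) edit distance between two ordered labeled trees.

    Assumed costs: relabelling a node costs 1; inserting or deleting
    a whole subtree costs the number of nodes it contains.
    """
    relabel = 0 if a.tag == b.tag else 1
    kids_a, kids_b = list(a), list(b)
    m, n = len(kids_a), len(kids_b)
    # dist[i][j]: cost of turning the first i children of a into the first j children of b
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = dist[i - 1][0] + subtree_size(kids_a[i - 1])
    for j in range(1, n + 1):
        dist[0][j] = dist[0][j - 1] + subtree_size(kids_b[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dist[i][j] = min(
                dist[i - 1][j] + subtree_size(kids_a[i - 1]),   # delete subtree of a
                dist[i][j - 1] + subtree_size(kids_b[j - 1]),   # insert subtree of b
                dist[i - 1][j - 1]
                + tree_edit_distance(kids_a[i - 1], kids_b[j - 1]),  # match children
            )
    return relabel + dist[m][n]


# Hypothetical sample documents, used only to exercise the function.
doc_a = ET.fromstring("<paper><title/><author/><abstract/></paper>")
doc_b = ET.fromstring("<paper><title/><authors><author/></authors></paper>")
print(tree_edit_distance(doc_a, doc_b))
```

Because sub-trees are only ever inserted or deleted whole, this variant cannot recognise a repeated or relocated sub-tree, which is the kind of limitation the abstract says its framework is meant to address.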
Abstract:
The database reported here is derived using the Combinatorial Extension (CE) algorithm which compares pairs of protein polypeptide chains and provides a list of structurally similar proteins along with their structure alignments. Using CE, structure–structure alignments can provide insights into biological function. When a protein of known function is shown to be structurally similar to a protein of unknown function, a relationship might be inferred; a relationship not necessarily detectable from sequence comparison alone. Establishing structure–structure relationships in this way is of great importance as we enter an era of structural genomics where there is a likelihood of an increasing number of structures with unknown functions being determined. Thus the CE database is an example of a useful tool in the annotation of protein structures of unknown function. Comparisons can be performed on the complete PDB or on a structurally representative subset of proteins. The source protein(s) can be from the PDB (updated monthly) or uploaded by the user. CE provides sequence alignments resulting from structural alignments and Cartesian coordinates for the aligned structures, which may be analyzed using the supplied Compare3D Java applet, or downloaded for further local analysis. Searches can be run from the CE web site, http://cl.sdsc.edu/ce.html, or the database and software downloaded from the site for local use.
Abstract:
We present an approach for assessing the significance of sequence and structure comparisons by using nearly identical statistical formalisms for both sequence and structure. Doing so involves an all-vs.-all comparison of protein domains [taken here from the Structural Classification of Proteins (SCOP) database] and then fitting a simple distribution function to the observed scores. By using this distribution, we can attach a statistical significance to each comparison score in the form of a P value, the probability that a better score would occur by chance. As expected, we find that the scores for sequence matching follow an extreme-value distribution. Moreover, the agreement between the P values that we derive from this distribution and those reported by standard programs (e.g., BLAST and FASTA) validates our approach. Structure comparison scores also follow an extreme-value distribution when the statistics are expressed in terms of a structural alignment score (essentially the sum of reciprocated distances between aligned atoms minus gap penalties). We find that the traditional metric of structural similarity, the rms deviation in atom positions after fitting aligned atoms, follows a different distribution of scores and does not perform as well as the structural alignment score. Comparison of the sequence and structure statistics for pairs of proteins known to be distantly related shows that structural comparison is able to detect approximately twice as many distant relationships as sequence comparison at the same error rate. The comparison also indicates that there are very few pairs with significant similarity in terms of sequence but not structure, whereas many pairs have significant similarity in terms of structure but not sequence.
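The P value described in this abstract can be sketched as follows, assuming the standard Gumbel (type I extreme-value) form with location `mu` and scale `beta`. The abstract does not give the fitted parameters or the exact functional form, so the parameter names and the numbers in the example call are placeholders, not values from the paper.

```python
import math


def gumbel_p_value(score, mu, beta):
    """Probability that a score at least as high as `score` occurs by chance,
    assuming chance scores follow a Gumbel distribution with location `mu`
    and scale `beta` (both hypothetical, to be fitted to observed scores).

    P(S >= score) = 1 - exp(-exp(-(score - mu) / beta))
    """
    z = (score - mu) / beta
    if z < -50:
        # Far below the bulk of the distribution: the probability is 1
        # to machine precision, and exp(-z) would eventually overflow.
        return 1.0
    # -expm1(-x) evaluates 1 - exp(-x) accurately when x is tiny,
    # which is the regime of highly significant scores.
    return -math.expm1(-math.exp(-z))


# Placeholder parameters, for illustration only.
print(gumbel_p_value(score=25.0, mu=10.0, beta=2.0))
```

For scores well above `mu` the result is close to `exp(-(score - mu) / beta)`, so each additional `beta` units of score lowers the P value by roughly a factor of e.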
Abstract:
The theoretical improvements made since the last spacecraft and mechanical testing conference in the study of the pyrotechnic shock phenomena produced during the separation of the lower stage of the Ariane 5 Vehicle Equipment Bay (VEB) structure are described. The first theoretical approach used was based on the wave propagation method, including axial and shear waves. The method was changed in order to capture the bending effects, as well as the influence of the frequency-dependent damping values. In addition to the development of the theoretical method, efforts were made to improve the criteria used to model the structure. Comparisons of the theoretical predictions with the test results for a flat test sample 1 m wide, as well as for a preliminary test performed on a small sample, are presented.
Abstract:
Computing the similarity between two protein structures is a crucial task in molecular biology, and has been extensively investigated. Many protein structure comparison methods can be modeled as maximum weighted clique problems in specific k-partite graphs, referred to here as alignment graphs. In this paper we present both a new integer programming formulation for solving such clique problems and a dedicated branch and bound algorithm for solving the maximum cardinality clique problem. Both approaches have been integrated in VAST, a software package for aligning protein 3D structures that is widely used at the National Center for Biotechnology Information and whose original clique solver uses the well-known Bron and Kerbosch algorithm (BK). Our computational results on real protein alignment instances show that our branch and bound algorithm is up to 116 times faster than BK.
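For reference, the Bron and Kerbosch baseline named in this abstract can be sketched as below. This is the textbook algorithm with pivoting, not VAST's implementation and not the branch and bound algorithm the paper proposes. The graph representation, the function names, and the toy graph are assumptions; in a real alignment graph each vertex would stand for a candidate pairing between elements of the two structures.

```python
def bron_kerbosch(adj):
    """Enumerate all maximal cliques of an undirected graph.

    `adj` maps each vertex to the set of its neighbours.
    Uses the classic Bron-Kerbosch recursion with pivoting.
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend it,
        # x: vertices already explored (used to skip non-maximal cliques)
        if not p and not x:
            cliques.append(r)
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def maximum_clique(adj):
    """Largest clique by cardinality (one of them, if there are ties)."""
    return max(bron_kerbosch(adj), key=len, default=set())


# Toy graph with hypothetical vertex names: a triangle a-b-c plus a tail c-d-e.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
print(maximum_clique(graph))
```

Enumerating every maximal clique and then keeping the largest is simple but can be exponential in the worst case, which is one reason a dedicated maximum-clique solver can be much faster.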
Abstract:
Background: Protein structural alignment is one of the most fundamental and crucial areas of research in computational structural biology. Comparing a protein structure with known structures helps to classify it as new or as belonging to a known group of proteins. This, in turn, is useful for determining the function of the protein and its evolutionary relationship with other protein molecules, and for grasping the principles underlying protein architecture and folding. Results: A large number of protein structure alignment methods are available. Each protein structure alignment tool has its own strengths and weaknesses that need to be highlighted. We compared the six most popular publicly available servers for protein structure comparison and present the results. These web-based servers were compared with respect to functionality (the features provided by the servers) and accuracy (how well the structural comparison is performed). CATH was used as a reference. The results showed that CE was the top performer overall. DALI and PhyreStorm showed similar results, whereas PDBeFold showed the lowest performance. For proteins with few secondary structural elements, CE, DALI and PhyreStorm gave a 100% success rate. Conclusion: Overall, none of the structural alignment servers showed a 100% success rate. Studies of overall performance and of the effect of mainly-alpha and mainly-beta structures showed consistent performance. CE, DALI, FatCat and PhyreStorm showed success rates of more than 90%.
Abstract:
A model for binary mixture adsorption accounting for energetic heterogeneity and intermolecular interactions is proposed in this paper. The model is based on statistical thermodynamics, and it is able to describe molecular rearrangement of a mixture in a nonuniform adsorption field inside a cavity. The Helmholtz free energy obtained in the framework of this approach has upper and lower limits, which define a permissible range in which all possible solutions will be found. One limit corresponds to a completely chaotic distribution of molecules within a cavity, while the other corresponds to a maximally ordered molecular structure. Comparison of the nearly ideal O2-N2-zeolite NaX system at ambient temperature with the O2-N2-zeolite CaX system at 144 K has shown that a decrease of temperature leads to a molecular rearrangement in the cavity volume, which results from the difference in the fluid-solid interactions. The model is able to describe this behavior and therefore allows mixture adsorption to be predicted more accurately than models that assume energetic uniformity of the adsorption volume. Another feature of the model is its ability to correctly describe the negative deviations from Raoult's law exhibited by the O2-N2-CaX system at 144 K. Analysis of the highly nonideal CO2-C2H6-zeolite NaX system has shown that the spatial molecular rearrangement in separate cavities is induced not only by the ion-quadrupole interaction of the CO2 molecule but also by the significant difference in molecular size and the difference between the intermolecular interactions of molecules of the same species and those of molecules of different species. This leads to the highly ordered structure of this system.
Abstract:
This Master's thesis investigates the possibilities of using ultra-high-strength steels in the boom structures of a timber crane. The aim of the work was to design a new boom construction that is lighter than earlier solutions but still meets the requirements of the applicable standards with respect to fatigue, static strength and stability. The work was carried out by comparing the properties of various new profiles and structural details with one another and with the old structure. The comparison criteria were elastic bending capacity, critical local buckling load, fatigue strength, and welding and material costs. The outcome of the work is a proposal for a new telescopic boom that is about 8% lighter than the current solution.
Abstract:
A multibody simulation model of a roller test rig is presented in this work. The roller test rig consists of a paper machine's tube roll supported by a hard-bearing type balancing machine. The simulation model includes non-idealities that are measured from the physical structure. These non-idealities are the shell thickness variation of the roll and roundness errors of the shafts of the roll. These kinds of non-idealities are harmful since they can cause subharmonic resonances of the rotor system. In this case, the natural vibration mode of the rotor is excited when the rotation speed is a fraction of the natural frequency of the system. With the simulation model, the half critical resonance is studied in detail, and a sensitivity analysis is performed by running several simulations with slightly different input parameters. The model is verified by comparing the simulation results with those obtained by measuring the real structure. The comparison shows that good accuracy is achieved, since equivalent responses are obtained within the error limit of the input parameters.
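The subharmonic condition in this abstract (rotation speed equal to a fraction of the natural frequency) can be made concrete with a small calculation. The sketch below assumes that the resonance of order k occurs when the rotation frequency equals the natural frequency divided by k, so that k = 2 is the half critical speed. The function name and the 30 Hz natural frequency are hypothetical and are not taken from the paper.

```python
def subharmonic_speeds_rpm(natural_frequency_hz, orders=(2, 3, 4)):
    """Rotation speeds (rpm) at which a natural mode at `natural_frequency_hz`
    would be excited as a subharmonic resonance of the given orders.

    Assumes resonance of order k when rotation frequency = f_n / k;
    order 2 is the half critical speed.
    """
    return {k: 60.0 * natural_frequency_hz / k for k in orders}


# Hypothetical natural frequency of 30 Hz, for illustration only.
for order, rpm in subharmonic_speeds_rpm(30.0).items():
    print(f"1/{order} critical speed: {rpm:.0f} rpm")
```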
Abstract:
Recent discoveries show the important role that ribonucleic acid (RNA) plays within cells, whether in controlling gene expression or in regulating several homeostatic processes, in addition to the transcription and translation of deoxyribonucleic acid (DNA) into protein. To understand how the cell functions, we must first understand its components and how they interact, RNA in particular. The function of a molecule depends on its three-dimensional (3D) structure. However, determining the 3D structure of an RNA experimentally is very costly. Current computational methods for predicting RNA structure take into account only classical, or canonical, base pairs, similar to those of the famous double-helix structure of DNA. Here, we have improved RNA structure prediction by taking into account all possible types of base pairs, including the so-called non-canonical ones. This is made possible by a new paradigm for RNA folding based on nucleotide cyclic motifs, the basic building blocks of RNA. In addition, given the recent introduction of several such methods, we have developed new metrics to quantify the accuracy of RNA 3D structure prediction methods. Finally, we have evaluated the predictive power of new low-resolution probing techniques for RNA structures.
Abstract:
Convectively coupled equatorial waves are fundamental components of the interaction between the physics and dynamics of the tropical atmosphere. A new methodology, which isolates individual equatorial wave modes, has been developed and applied to observational data. The methodology assumes that the horizontal structures given by equatorial wave theory can be used to project upper- and lower-tropospheric data onto equatorial wave modes. The dynamical fields are first separated into eastward- and westward-moving components within a specified domain of frequency–zonal wavenumber. Each of the components for each field is then projected onto the different equatorial modes using the y structures of these modes given by the theory. The latitudinal scale y0 of the modes is predetermined from the data to fit the equatorial trapping in a suitable latitude belt y = ±Y. The extent to which the different dynamical fields are consistent with one another in their depiction of each equatorial wave structure determines the confidence in the reality of that structure. Comparison of the analyzed modes with the eastward- and westward-moving components in the convection field enables the identification of the dynamical structure and nature of convectively coupled equatorial waves. In a case study, the methodology is applied to two independent data sources, ECMWF Reanalysis and satellite-observed window brightness temperature (Tb) data for the summer of 1992. Various convectively coupled equatorial Kelvin, mixed Rossby–gravity, and Rossby waves have been detected. The results indicate a robust consistency between the two independent data sources. Different vertical structures for different wave modes and a significant Doppler shifting effect of the background zonal winds on wave structures are found and discussed. It is found that in addition to low-level convergence, anomalous fluxes induced by strong equatorial zonal winds associated with equatorial waves are important for inducing equatorial convection. There is evidence that equatorial convection associated with Rossby waves leads to a change in structure involving a horizontal structure similar to that of a Kelvin wave moving westward with it. The vertical structure may also be radically changed. The analysis method should make a very powerful diagnostic tool for investigating convectively coupled equatorial waves and the interaction of equatorial dynamics and physics in the real atmosphere. The results from application of the analysis method to a reanalysis dataset should provide a benchmark against which model studies can be compared.
Abstract:
The latest version of CATH (class, architecture, topology, homology) (version 3.2), released in July 2008 (http://www.cathdb.info), contains 114 215 domains, 2178 homologous superfamilies and 1110 fold groups. We have assigned 20 330 new domains, 87 new homologous superfamilies and 26 new folds since CATH release version 3.1. A total of 28 064 new domains have been assigned since our NAR 2007 database publication (CATH version 3.0). The CATH website has been completely redesigned and includes more comprehensive documentation. We have revisited the CATH architecture level as part of the development of a 'Protein Chart' and present information on the population of each architecture. The CATHEDRAL structure comparison algorithm has been improved and used to characterize structural diversity in CATH superfamilies and structural overlaps between superfamilies. Although the majority of superfamilies in CATH are not structurally diverse and do not overlap significantly with other superfamilies, approximately 4% of superfamilies are very diverse, and these are the superfamilies that are most highly populated in both the PDB and the genomes. Information on the degree of structural diversity in each superfamily and structural overlaps between superfamilies can now be downloaded from the CATH website.
Abstract:
Ichnofossils (paleo-burrows and crotovines) attributed to extinct mammals in southeastern and southern Brazil. This work presents information regarding tunnels that are attributed to large extinct mammals. These structures can be found in several places in southeastern and southern Brazil, in different types of substrate, occurring as hollow structures (paleo-burrows) or as structures filled with sediments (crotovines). The dimensions of the paleo-burrow found in alluvial fan deposits near the town of Cristal (Rio Grande do Sul State), together with the osteoderm and claw imprints found along its internal walls, suggest that a dasypodid xenarthran might have dug this structure. Comparison with similar structures found in Argentina can provide more detailed information regarding the paleoecology and biostratigraphy of the organisms that made these burrows.
Abstract:
The discovery of hyperthermophilic microorganisms and the analysis of hyperthermostable enzymes has established the fact that multisubunit enzymes can survive for prolonged periods at temperatures above 100°C. We have carried out homology-based modeling and direct structure comparison on the hexameric glutamate dehydrogenases from the hyperthermophiles Pyrococcus furiosus and Thermococcus litoralis whose optimal growth temperatures are 100°C and 88°C, respectively, to determine key stabilizing features. These enzymes, which are 87% homologous, differ 16-fold in thermal stability at 104°C. We observed that an intersubunit ion-pair network was substantially reduced in the less stable enzyme from T. litoralis, and two residues were then altered to restore these interactions. The single mutations both had adverse effects on the thermostability of the protein. However, with both mutations in place, we observed a fourfold improvement of stability at 104°C over the wild-type enzyme. The catalytic properties of the enzymes were unaffected by the mutations. These results suggest that extensive ion-pair networks may provide a general strategy for manipulating enzyme thermostability of multisubunit enzymes. However, this study emphasizes the importance of the exact local environment of a residue in determining its effects on stability.
Abstract:
In order to support the structural genomics initiatives, both by rapidly classifying newly determined structures and by suggesting suitable targets for structure determination, we have recently developed several new protocols for classifying structures in the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath). These aim to increase the speed of classification of new structures using fast algorithms for structure comparison (GRATH) and to improve the sensitivity in recognising distant structural relatives by incorporating sequence information from relatives in the genomes (DomainFinder). In order to ensure the integrity of the database given the expected increase in data, the CATH Protein Family Database (CATH-PFDB), which currently includes 25 320 structural domains and a further 160 000 sequence relatives, has now been installed in a relational ORACLE database. This was essential for developing more rigorous validation procedures and for allowing efficient querying of the database, particularly for genome analysis. The associated Dictionary of Homologous Superfamilies [Bray,J.E., Todd,A.E., Pearl,F.M.G., Thornton,J.M. and Orengo,C.A. (2000) Protein Eng., 13, 153–165], which provides multiple structural alignments and functional information to assist in assigning new relatives, has also been expanded recently and now includes information for 903 homologous superfamilies. In order to improve coverage of known structures, preliminary classification levels are now provided for new structures at interim stages in the classification protocol. Since a large proportion of new structures can be rapidly classified using profile-based sequence analysis [e.g. PSI-BLAST: Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389–3402], this provides preliminary classification for easily recognisable homologues, which in the latest release of CATH (version 1.7) represented nearly three-quarters of the non-identical structures.