5 results for Tri-dimensional structure

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance:

80.00%

Abstract:

Over the last few decades, bioinformatics has played a fundamental role in making sense of the huge amount of data produced. Once the complete sequence of a genome has been obtained, the crucial problem is to learn as much as possible about its coding regions. Protein sequence annotation is challenging and, owing to the size of the problem, only computational approaches can provide a feasible solution. As recently pointed out by the Critical Assessment of Function Annotations (CAFA), the most accurate methods are those based on the transfer-by-homology approach, and the most incisive contribution comes from cross-genome comparisons. This thesis describes a non-hierarchical sequence clustering method for automatic large-scale protein annotation, called “The Bologna Annotation Resource Plus” (BAR+). The method is based on an all-against-all alignment of more than 13 million protein sequences, characterized by a very stringent metric. BAR+ can safely transfer functional features (Gene Ontology and Pfam terms) within clusters by means of statistical validation, even in the case of multi-domain proteins. Within BAR+ clusters it is also possible to transfer the three-dimensional structure (when a template is available). This is made possible by cluster-specific HMM profiles, which can be used to compute reliable template-to-target alignments even for distantly related proteins (sequence identity < 30%). Other BAR+-based applications were developed during my doctorate, including the prediction of magnesium binding sites in human proteins, the classification of the ABC transporter superfamily, and the functional prediction (GO terms) of the CAFA targets. Remarkably, in the CAFA assessment BAR+ placed among the ten most accurate methods. At present BAR+ is freely available as a web server for the functional and structural annotation of protein sequences at http://bar.biocomp.unibo.it/bar2.0.

Relevance:

30.00%

Abstract:

The vast majority of known proteins have not yet been experimentally characterized and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of <1% of sequences have been experimentally solved. For this reason, it has become urgent to develop new methods that can computationally extract relevant information from protein sequence and structure. The starting point of my work was the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein fold recognition and de novo design. Predicting these contacts requires studying the inter-residue distances, related to the specific type of amino acid pair, that are encoded in the so-called contact map. An interesting new way of analyzing those structures emerged when network studies were introduced, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps, and for applications in protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied the extent to which the characteristic path length and clustering coefficient of the protein contact network reveal characteristic features of protein contact maps.
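As a rough illustration of the two network quantities named above, the following Python sketch builds a contact map from residue coordinates and computes the characteristic path length and the clustering coefficient of the resulting graph. The 8 Å cutoff, the use of one representative point per residue, and all function names are assumptions made for this example; they are not taken from the thesis.

```python
from collections import deque
from itertools import combinations
from math import dist


def contact_map(coords, cutoff=8.0):
    """Return adjacency sets: residues i and j are in contact if closer than cutoff.

    coords holds one (x, y, z) point per residue (assumed, e.g. C-alpha atoms).
    The cutoff value is an illustrative assumption.
    """
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) < cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected pairs (BFS from every node)."""
    total, pairs = 0, 0
    for source in adj:
        depth = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in depth:
                    depth[v] = depth[u] + 1
                    queue.append(v)
        total += sum(depth.values())
        pairs += len(depth) - 1
    return total / pairs if pairs else 0.0


def clustering_coefficient(adj):
    """Average, over nodes, of the fraction of neighbour pairs that are linked."""
    values = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            values.append(0.0)  # convention: nodes with < 2 neighbours count as 0
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        values.append(2 * links / (k * (k - 1)))
    return sum(values) / len(values)
```

A small-world network, in this picture, is one whose path length stays short while its clustering coefficient remains high compared with a random graph of the same size and degree.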
Provided that residue contacts are known for a protein sequence, the major features of its 3D structure can be deduced by combining this knowledge with correctly predicted secondary-structure motifs. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as leucine zippers, which drive the dimerization of many transcription factors, and more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given the biological importance of coiled-coils, I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms all existing programs and can be adopted for coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, being the starting point for understanding the complex processes involved in biological networks. The rapid growth in the number of available protein sequences and structures poses new fundamental problems that still await interpretation. Nevertheless, these data underpin the design of new strategies for tackling problems such as the prediction of protein structure and function. Experimental determination of the functions of all these proteins would be hugely time-consuming and costly and, in most instances, has not been carried out. For example, only about 20% of the annotated proteins in the Homo sapiens genome have currently been experimentally characterized.
A commonly adopted procedure for annotating protein sequences relies on "inheritance through homology", based on the notion that similar sequences share similar functions and structures. This procedure consists of assigning sequences to a specific group of functionally related sequences that have been grouped through clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity arise from multi-domain proteins, from proteins that share common domains but do not necessarily share the same function, and from the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validated a system that contributes to sequence annotation by taking advantage of a validated transfer-through-inheritance procedure for molecular functions and structural templates. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and alignment coverage. The adopted measure explicitly addresses the problem of annotating multi-domain proteins and allows a fine-grained division of the whole set of proteomes used, which ensures cluster homogeneity in terms of sequence length. A high level of coverage of structural templates over the length of the protein sequences within clusters ensures that multi-domain proteins, when present, can serve as templates for sequences of similar length. This annotation procedure makes it possible to reliably transfer statistically validated functions and structures to sequences, using the information available in current databases of molecular functions and structures.
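The clustering step described above can be sketched in Python as a connected-components search over pairwise alignment hits that pass both constraints. This is a minimal sketch under stated assumptions: the threshold values, the definition of coverage as aligned length divided by sequence length (required on both sequences), and the function name are all hypothetical and are not given in the abstract.

```python
def cluster_by_identity_and_coverage(lengths, hits, min_identity=0.4, min_coverage=0.9):
    """Group sequences linked by alignments that pass both thresholds.

    lengths: dict mapping sequence id -> sequence length
    hits:    iterable of (id_a, id_b, identity_fraction, aligned_length)
    The default thresholds are illustrative placeholders, not values from the text.
    """
    parent = {seq: seq for seq in lengths}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, aligned in hits:
        # Requiring coverage on BOTH sequences keeps a short single-domain
        # protein from being linked to a long multi-domain one.
        coverage = min(aligned / lengths[a], aligned / lengths[b])
        if identity >= min_identity and coverage >= min_coverage:
            parent[find(a)] = find(b)

    clusters = {}
    for seq in lengths:
        clusters.setdefault(find(seq), []).append(seq)
    return sorted(sorted(members) for members in clusters.values())
```

Because the coverage test uses the longer sequence as well as the shorter one, clusters built this way tend to be homogeneous in sequence length, which is the property the abstract highlights.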

Relevance:

30.00%

Abstract:

In this study a new, fully non-linear approach to Local Earthquake Tomography is presented. Local Earthquake Tomography (LET) is a non-linear inversion problem that allows the joint determination of earthquake parameters and velocity structure from the arrival times of waves generated by local sources. Since the early developments of seismic tomography, several inversion methods have been developed to solve this problem in a linearized way. In the framework of Monte Carlo sampling, we developed a new code based on the Reversible Jump Markov Chain Monte Carlo sampling method (Rj-McMc). It is a trans-dimensional approach in which the number of unknowns, and thus the model parameterization, is treated as one of the unknowns. I show that our new code overcomes major limitations of linearized tomography, opening a new perspective in seismic imaging. Synthetic tests demonstrate that our algorithm is able to produce a robust and reliable tomography without the need for subjective a-priori assumptions about starting models and parameterization. Moreover, it provides a more accurate estimate of the uncertainties in the model parameters. It is therefore well suited to investigating the velocity structure in regions that lack accurate a-priori information. Synthetic tests also reveal that the absence of any regularization constraints allows more information to be extracted from the observed data, and that the velocity structure can be detected even in regions where the density of rays is low and standard linearized codes fail. I also present high-resolution Vp and Vp/Vs models for two widely investigated regions: the Parkfield segment of the San Andreas Fault (California, USA) and the area around the Alto Tiberina fault (Umbria-Marche, Italy). In both cases, the models obtained with our code show a substantial improvement in data fit compared with the models obtained from the same data set with linearized inversion codes.
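To make the trans-dimensional idea concrete, here is a toy Python sketch of reversible-jump sampling on a 1-D piecewise-constant regression problem, not on tomography. It assumes uniform priors on the number of nuclei, their positions and their values, Gaussian data noise with known sigma, and birth proposals drawn from the prior; under those assumptions the birth and death acceptance ratios reduce to the likelihood ratio. All names and parameter values are hypothetical and unrelated to the thesis code.

```python
import math
import random


def predict(nuclei, x):
    """Piecewise-constant model: value of the nucleus nearest to x (1-D Voronoi cells)."""
    return min(nuclei, key=lambda n: abs(n[0] - x))[1]


def log_likelihood(nuclei, data, sigma):
    """Gaussian log-likelihood (up to a constant) of (x, y) data under the model."""
    return -0.5 * sum(((y - predict(nuclei, x)) / sigma) ** 2 for x, y in data)


def rj_mcmc(data, n_steps, x_range=(0.0, 1.0), v_range=(0.0, 10.0),
            k_max=10, sigma=0.5, step=0.5, seed=0):
    """Sample models whose dimension (number of nuclei) is itself unknown."""
    rng = random.Random(seed)
    nuclei = [(rng.uniform(*x_range), rng.uniform(*v_range))]
    logl = log_likelihood(nuclei, data, sigma)
    samples = []
    for _ in range(n_steps):
        move = rng.choice(("birth", "death", "perturb"))
        proposal = None  # None means the move is rejected outright
        if move == "birth" and len(nuclei) < k_max:
            # New nucleus drawn from the prior.
            proposal = nuclei + [(rng.uniform(*x_range), rng.uniform(*v_range))]
        elif move == "death" and len(nuclei) > 1:
            i = rng.randrange(len(nuclei))
            proposal = nuclei[:i] + nuclei[i + 1:]
        elif move == "perturb":
            # Symmetric Gaussian step on one value; steps outside the prior are rejected.
            i = rng.randrange(len(nuclei))
            x, v = nuclei[i]
            v_new = v + rng.gauss(0.0, step)
            if v_range[0] <= v_new <= v_range[1]:
                proposal = nuclei[:i] + [(x, v_new)] + nuclei[i + 1:]
        if proposal is not None:
            new_logl = log_likelihood(proposal, data, sigma)
            # With the assumptions above, every move is accepted with
            # probability min(1, likelihood ratio).
            if rng.random() < math.exp(min(0.0, new_logl - logl)):
                nuclei, logl = proposal, new_logl
        samples.append(list(nuclei))
    return samples
```

The spread of `len(sample)` across the returned samples is the sampled distribution of the model dimension, and the spread of `predict(sample, x)` at a fixed `x` gives an uncertainty estimate without any regularization term.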

Relevance:

30.00%

Abstract:

In this thesis, a strategy to model the behavior of fluids and their interaction with deformable bodies is proposed. The fluid domain is modeled using the lattice Boltzmann method, thus analyzing the fluid dynamics from a mesoscopic point of view. It has been proved that the solution provided by this method is equivalent to solving the Navier-Stokes equations for an incompressible flow with second-order accuracy. Slender elastic structures, idealized through beam finite elements, are used. Large displacements are accounted for by using the corotational formulation. Structural dynamics is computed using the Time Discontinuous Galerkin method. Two different solution procedures are therefore used, one for the fluid domain and the other for the structural part. These two solvers need to communicate and exchange several quantities, i.e. stresses, velocities, and displacements. In order to guarantee a continuous, effective, and mutual exchange of information, a coupling strategy consisting of three different algorithms has been developed and numerically tested. In particular, the effectiveness of the three algorithms is shown in terms of the interface energy artificially produced by the approximate fulfillment of compatibility and equilibrium conditions at the fluid-structure interface. The proposed coupled approach is used to solve different fluid-structure interaction problems, i.e. cantilever beams immersed in a viscous fluid, the impact of a ship hull on the marine free surface, blood flow in a deformable vessel, and even flapping wings simulating the take-off of a butterfly. The good results achieved in each application highlight the effectiveness of the proposed methodology and of the software developed in C++ in successfully addressing several two-dimensional fluid-structure interaction problems.
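As a hedged illustration of the mesoscopic viewpoint, the Python sketch below performs one collision step of the lattice Boltzmann method at a single node. It assumes the common D2Q9 lattice and the single-relaxation-time (BGK) collision operator; the abstract does not state which lattice or collision model the thesis uses, and the function names are hypothetical.

```python
# D2Q9 lattice (assumed): nine discrete velocities and their weights.
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]
WEIGHTS = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def moments(f):
    """Macroscopic density and velocity recovered from the nine populations."""
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, VELOCITIES)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, VELOCITIES)) / rho
    return rho, ux, uy


def equilibrium(rho, ux, uy):
    """Second-order equilibrium populations in lattice units (sound speed^2 = 1/3)."""
    usq = ux * ux + uy * uy
    feq = []
    for w, (cx, cy) in zip(WEIGHTS, VELOCITIES):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def bgk_collide(f, tau):
    """Relax each population towards equilibrium with relaxation time tau."""
    feq = equilibrium(*moments(f))
    return [fi - (fi - fe) / tau for fi, fe in zip(f, feq)]
```

The collision changes the individual populations but leaves density and momentum unchanged, which is how the macroscopic conservation laws emerge from the mesoscopic update. In lattice units the relaxation time sets the kinematic viscosity through nu = (tau - 1/2) / 3.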

Relevance:

30.00%

Abstract:

Finite element techniques for solving the problem of fluid-structure interaction of an elastic solid material in a laminar incompressible viscous flow are described. The mathematical problem consists of the Navier-Stokes equations in the Arbitrary Lagrangian-Eulerian formulation coupled with a non-linear structure model, treating the problem as one continuum. The coupling between the structure and the fluid is enforced inside a monolithic framework that solves simultaneously for the fluid and structure unknowns within a single solver. We used the well-known Crouzeix-Raviart finite element pair for discretization in space and the method of lines for discretization in time. A stability result has been proved for the Backward-Euler time-stepping scheme applied to both the fluid and the solid part, with the finite element method used for the space discretization. The resulting linear system has been solved by multilevel domain decomposition techniques. Our strategy is to solve several local subproblems over subdomain patches using the Schur-complement or GMRES smoother within a multigrid iterative solver. To validate the proposed methodology and evaluate its accuracy, we present results for a set of two FSI benchmark configurations that describe the self-induced elastic deformation of a beam attached to a cylinder in a laminar channel flow, allowing stationary as well as periodically oscillating deformations, and for a benchmark proposed by COMSOL Multiphysics in which a narrow vertical structure attached to the bottom wall of a channel bends under the force due to both viscous drag and pressure. Then, as an example of fluid-structure interaction in biomedical problems, we considered the academic numerical test that consists of simulating the propagation of a pressure wave through a straight compliant vessel. All the tests show the applicability and the numerical efficiency of our approach for both two-dimensional and three-dimensional problems.
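The appeal of the Backward-Euler scheme can be illustrated on a much simpler problem than the coupled system above. The Python sketch below applies it to the scalar test equation y' = -lam * y, chosen here purely for illustration, and contrasts it with the explicit forward Euler scheme. It does not reproduce the stability result proved in the thesis, and the function names are hypothetical.

```python
def backward_euler(lam, y0, dt, n_steps):
    """Implicit update: y_new = y_old + dt * (-lam * y_new), solved for y_new."""
    y = y0
    for _ in range(n_steps):
        y = y / (1.0 + lam * dt)
    return y


def forward_euler(lam, y0, dt, n_steps):
    """Explicit update: y_new = y_old + dt * (-lam * y_old)."""
    y = y0
    for _ in range(n_steps):
        y = y * (1.0 - lam * dt)
    return y
```

For lam > 0 the implicit amplification factor 1 / (1 + lam * dt) is below one for every positive time step, so the numerical solution decays as the exact one does. The explicit factor 1 - lam * dt exceeds one in magnitude as soon as dt > 2 / lam, and the solution then grows without bound.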