13 resultados para computational algebra
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
In this thesis we provide a characterization of probabilistic computation in itself, from a recursion-theoretical perspective, without reducing it to deterministic computation. More specifically, we show that probabilistic computable functions, i.e., those functions which are computed by Probabilistic Turing Machines (PTM), can be characterized by a natural generalization of Kleene's partial recursive functions which includes, among initial functions, one that returns identity or successor with probability 1/2. We then prove the equi-expressivity of the obtained algebra and the class of functions computed by PTMs. In the the second part of the thesis we investigate the relations existing between our recursion-theoretical framework and sub-recursive classes, in the spirit of Implicit Computational Complexity. More precisely, endowing predicative recurrence with a random base function is proved to lead to a characterization of polynomial-time computable probabilistic functions.
Resumo:
The need for a convergence between semi-structured data management and Information Retrieval techniques is manifest to the scientific community. In order to fulfil this growing request, W3C has recently proposed XQuery Full Text, an IR-oriented extension of XQuery. However, the issue of query optimization requires the study of important properties like query equivalence and containment; to this aim, a formal representation of document and queries is needed. The goal of this thesis is to establish such formal background. We define a data model for XML documents and propose an algebra able to represent most of XQuery Full-Text expressions. We show how an XQuery Full-Text expression can be translated into an algebraic expression and how an algebraic expression can be optimized.
Resumo:
Service Oriented Computing is a new programming paradigm for addressing distributed system design issues. Services are autonomous computational entities which can be dynamically discovered and composed in order to form more complex systems able to achieve different kinds of task. E-government, e-business and e-science are some examples of the IT areas where Service Oriented Computing will be exploited in the next years. At present, the most credited Service Oriented Computing technology is that of Web Services, whose specifications are enriched day by day by industrial consortia without following a precise and rigorous approach. This PhD thesis aims, on the one hand, at modelling Service Oriented Computing in a formal way in order to precisely define the main concepts it is based upon and, on the other hand, at defining a new approach, called bipolar approach, for addressing system design issues by synergically exploiting choreography and orchestration languages related by means of a mathematical relation called conformance. Choreography allows us to describe systems of services from a global view point whereas orchestration supplies a means for addressing such an issue from a local perspective. In this work we present SOCK, a process algebra based language inspired by the Web Service orchestration language WS-BPEL which catches the essentials of Service Oriented Computing. From the definition of SOCK we will able to define a general model for dealing with Service Oriented Computing where services and systems of services are related to the design of finite state automata and process algebra concurrent systems, respectively. Furthermore, we introduce a formal language for dealing with choreography. Such a language is equipped with a formal semantics and it forms, together with a subset of the SOCK calculus, the bipolar framework. Finally, we present JOLIE which is a Java implentation of a subset of the SOCK calculus and it is part of the bipolar framework we intend to promote.
Resumo:
Interaction protocols establish how different computational entities can interact with each other. The interaction can be finalized to the exchange of data, as in 'communication protocols', or can be oriented to achieve some result, as in 'application protocols'. Moreover, with the increasing complexity of modern distributed systems, protocols are used also to control such a complexity, and to ensure that the system as a whole evolves with certain features. However, the extensive use of protocols has raised some issues, from the language for specifying them to the several verification aspects. Computational Logic provides models, languages and tools that can be effectively adopted to address such issues: its declarative nature can be exploited for a protocol specification language, while its operational counterpart can be used to reason upon such specifications. In this thesis we propose a proof-theoretic framework, called SCIFF, together with its extensions. SCIFF is based on Abductive Logic Programming, and provides a formal specification language with a clear declarative semantics (based on abduction). The operational counterpart is given by a proof procedure, that allows to reason upon the specifications and to test the conformance of given interactions w.r.t. a defined protocol. Moreover, by suitably adapting the SCIFF Framework, we propose solutions for addressing (1) the protocol properties verification (g-SCIFF Framework), and (2) the a-priori conformance verification of peers w.r.t. the given protocol (AlLoWS Framework). We introduce also an agent based architecture, the SCIFF Agent Platform, where the same protocol specification can be used to program and to ease the implementation task of the interacting peers.
Resumo:
Ion channels are protein molecules, embedded in the lipid bilayer of the cell membranes. They act as powerful sensing elements switching chemicalphysical stimuli into ion-fluxes. At a glance, ion channels are water-filled pores, which can open and close in response to different stimuli (gating), and one once open select the permeating ion species (selectivity). They play a crucial role in several physiological functions, like nerve transmission, muscular contraction, and secretion. Besides, ion channels can be used in technological applications for different purpose (sensing of organic molecules, DNA sequencing). As a result, there is remarkable interest in understanding the molecular determinants of the channel functioning. Nowadays, both the functional and the structural characteristics of ion channels can be experimentally solved. The purpose of this thesis was to investigate the structure-function relation in ion channels, by computational techniques. Most of the analyses focused on the mechanisms of ion conduction, and the numerical methodologies to compute the channel conductance. The standard techniques for atomistic simulation of complex molecular systems (Molecular Dynamics) cannot be routinely used to calculate ion fluxes in membrane channels, because of the high computational resources needed. The main step forward of the PhD research activity was the development of a computational algorithm for the calculation of ion fluxes in protein channels. The algorithm - based on the electrodiffusion theory - is computational inexpensive, and was used for an extensive analysis on the molecular determinants of the channel conductance. The first record of ion-fluxes through a single protein channel dates back to 1976, and since then measuring the single channel conductance has become a standard experimental procedure. Chapter 1 introduces ion channels, and the experimental techniques used to measure the channel currents. The abundance of functional data (channel currents) does not match with an equal abundance of structural data. The bacterial potassium channel KcsA was the first selective ion channels to be experimentally solved (1998), and after KcsA the structures of four different potassium channels were revealed. These experimental data inspired a new era in ion channel modeling. Once the atomic structures of channels are known, it is possible to define mathematical models based on physical descriptions of the molecular systems. These physically based models can provide an atomic description of ion channel functioning, and predict the effect of structural changes. Chapter 2 introduces the computation methods used throughout the thesis to model ion channels functioning at the atomic level. In Chapter 3 and Chapter 4 the ion conduction through potassium channels is analyzed, by an approach based on the Poisson-Nernst-Planck electrodiffusion theory. In the electrodiffusion theory ion conduction is modeled by the drift-diffusion equations, thus describing the ion distributions by continuum functions. The numerical solver of the Poisson- Nernst-Planck equations was tested in the KcsA potassium channel (Chapter 3), and then used to analyze how the atomic structure of the intracellular vestibule of potassium channels affects the conductance (Chapter 4). As a major result, a correlation between the channel conductance and the potassium concentration in the intracellular vestibule emerged. The atomic structure of the channel modulates the potassium concentration in the vestibule, thus its conductance. This mechanism explains the phenotype of the BK potassium channels, a sub-family of potassium channels with high single channel conductance. The functional role of the intracellular vestibule is also the subject of Chapter 5, where the affinity of the potassium channels hEag1 (involved in tumour-cell proliferation) and hErg (important in the cardiac cycle) for several pharmaceutical drugs was compared. Both experimental measurements and molecular modeling were used in order to identify differences in the blocking mechanism of the two channels, which could be exploited in the synthesis of selective blockers. The experimental data pointed out the different role of residue mutations in the blockage of hEag1 and hErg, and the molecular modeling provided a possible explanation based on different binding sites in the intracellular vestibule. Modeling ion channels at the molecular levels relates the functioning of a channel to its atomic structure (Chapters 3-5), and can also be useful to predict the structure of ion channels (Chapter 6-7). In Chapter 6 the structure of the KcsA potassium channel depleted from potassium ions is analyzed by molecular dynamics simulations. Recently, a surprisingly high osmotic permeability of the KcsA channel was experimentally measured. All the available crystallographic structure of KcsA refers to a channel occupied by potassium ions. To conduct water molecules potassium ions must be expelled from KcsA. The structure of the potassium-depleted KcsA channel and the mechanism of water permeation are still unknown, and have been investigated by numerical simulations. Molecular dynamics of KcsA identified a possible atomic structure of the potassium-depleted KcsA channel, and a mechanism for water permeation. The depletion from potassium ions is an extreme situation for potassium channels, unlikely in physiological conditions. However, the simulation of such an extreme condition could help to identify the structural conformations, so the functional states, accessible to potassium ion channels. The last chapter of the thesis deals with the atomic structure of the !- Hemolysin channel. !-Hemolysin is the major determinant of the Staphylococcus Aureus toxicity, and is also the prototype channel for a possible usage in technological applications. The atomic structure of !- Hemolysin was revealed by X-Ray crystallography, but several experimental evidences suggest the presence of an alternative atomic structure. This alternative structure was predicted, combining experimental measurements of single channel currents and numerical simulations. This thesis is organized in two parts, in the first part an overview on ion channels and on the numerical methods adopted throughout the thesis is provided, while the second part describes the research projects tackled in the course of the PhD programme. The aim of the research activity was to relate the functional characteristics of ion channels to their atomic structure. In presenting the different research projects, the role of numerical simulations to analyze the structure-function relation in ion channels is highlighted.
Resumo:
Motivation An actual issue of great interest, both under a theoretical and an applicative perspective, is the analysis of biological sequences for disclosing the information that they encode. The development of new technologies for genome sequencing in the last years, opened new fundamental problems since huge amounts of biological data still deserve an interpretation. Indeed, the sequencing is only the first step of the genome annotation process that consists in the assignment of biological information to each sequence. Hence given the large amount of available data, in silico methods became useful and necessary in order to extract relevant information from sequences. The availability of data from Genome Projects gave rise to new strategies for tackling the basic problems of computational biology such as the determination of the tridimensional structures of proteins, their biological function and their reciprocal interactions. Results The aim of this work has been the implementation of predictive methods that allow the extraction of information on the properties of genomes and proteins starting from the nucleotide and aminoacidic sequences, by taking advantage of the information provided by the comparison of the genome sequences from different species. In the first part of the work a comprehensive large scale genome comparison of 599 organisms is described. 2,6 million of sequences coming from 551 prokaryotic and 48 eukaryotic genomes were aligned and clustered on the basis of their sequence identity. This procedure led to the identification of classes of proteins that are peculiar to the different groups of organisms. Moreover the adopted similarity threshold produced clusters that are homogeneous on the structural point of view and that can be used for structural annotation of uncharacterized sequences. The second part of the work focuses on the characterization of thermostable proteins and on the development of tools able to predict the thermostability of a protein starting from its sequence. By means of Principal Component Analysis the codon composition of a non redundant database comprising 116 prokaryotic genomes has been analyzed and it has been showed that a cross genomic approach can allow the extraction of common determinants of thermostability at the genome level, leading to an overall accuracy in discriminating thermophilic coding sequences equal to 95%. This result outperform those obtained in previous studies. Moreover, we investigated the effect of multiple mutations on protein thermostability. This issue is of great importance in the field of protein engineering, since thermostable proteins are generally more suitable than their mesostable counterparts in technological applications. A Support Vector Machine based method has been trained to predict if a set of mutations can enhance the thermostability of a given protein sequence. The developed predictor achieves 88% accuracy.
Resumo:
The vast majority of known proteins have not yet been experimentally characterized and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of <1% of sequences have been experimentally solved. For this reason, it became urgent to develop new methods that are able to computationally extract relevant information from protein sequence and structure. The starting point of my work has been the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein folding recognition and de novo design. The prediction of these contacts requires the study of the protein inter-residue distances related to the specific type of amino acid pair that are encoded in the so-called contact map. An interesting new way of analyzing those structures came out when network studies were introduced, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps and for applications in the field of protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied to which extent the characteristic path length and clustering coefficient of the protein contacts network are values that reveal characteristic features of protein contact maps. Provided that residue contacts are known for a protein sequence, the major features of its 3D structure could be deduced by combining this knowledge with correctly predicted motifs of secondary structure. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as leucine zippers that drive the dimerization of many transcription factors or more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given their biological importance, in my work I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments, to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms all the existing programs and can be adopted for the coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, being the starting point towards the understanding of the complex processes involved in biological networks. The rapid growth in the number of protein sequences and structures available poses new fundamental problems that still deserve an interpretation. Nevertheless, these data are at the basis of the design of new strategies for tackling problems such as the prediction of protein structure and function. Experimental determination of the functions of all these proteins would be a hugely time-consuming and costly task and, in most instances, has not been carried out. As an example, currently, approximately only 20% of annotated proteins in the Homo sapiens genome have been experimentally characterized. A commonly adopted procedure for annotating protein sequences relies on the "inheritance through homology" based on the notion that similar sequences share similar functions and structures. This procedure consists in the assignment of sequences to a specific group of functionally related sequences which had been grouped through clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity are due to multi-domain proteins, to proteins that share common domains but that do not necessarily share the same function, to the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validate a system that contributes to sequence annotation by taking advantage of a validated transfer through inheritance procedure of the molecular functions and of the structural templates. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and coverage of the alignment. The adopted measure explicity answers to the problem of multi-domain proteins annotation and allows a fine grain division of the whole set of proteomes used, that ensures cluster homogeneity in terms of sequence length. A high level of coverage of structure templates on the length of protein sequences within clusters ensures that multi-domain proteins when present can be templates for sequences of similar length. This annotation procedure includes the possibility of reliably transferring statistically validated functions and structures to sequences considering information available in the present data bases of molecular functions and structures.