1000 results for DNA Computing
Abstract:
Molecular computing is a discipline concerned with the design and implementation of information-processing devices built on a biological substrate, such as deoxyribonucleic acid (DNA), ribonucleic acid (RNA) or proteins. Since Watson and Crick discovered the double-helix molecular structure of DNA in the 1950s, a series of further discoveries followed, such as restriction enzymes and the polymerase chain reaction (PCR), contributing decisively to the emergence of recombinant DNA technology. Thanks to this technology and to the dramatic drop in the cost of DNA sequencing and synthesis, biomolecular computing was able to move beyond a purely theoretical conception. The work presented by Adleman (1994) solved an NP-complete problem (the directed Hamiltonian Path Problem) using only DNA molecules. The massive parallel-processing capacity offered by recombinant DNA techniques allowed Adleman to solve the problem in polynomial time, although at the cost of an exponential consumption of DNA molecules. Brute-force algorithms similar to Adleman's were used to solve other NP-complete problems, such as the Satisfiability of logical formulas / SAT (Lipton, 1995). It soon became clear that biomolecular computing could not compete with silicon computers in either speed or precision, so its focus and goals shifted towards problems with biomedical applications (Simmel, 2007), leaving aside the solution of classical computing problems. Since then, several models of biomolecular devices have been proposed that, autonomously (without a bio-engineer performing laboratory operations), can process a biological substrate as input and deliver an output that is also in biological form: processors that exploit polymerase extension (Hagiya et al., 1997), automata driven by restriction enzymes (Benenson et al., 2001) or deoxyribozymes (Stojanovic et al., 2002), and competitive hybridization circuits (Yurke et al., 2000). This thesis presents a set of nucleic acid device models able to implement several logical computing operations using biomolecular computing techniques (competitive DNA hybridization and enzymatic reactions), with applications in genetic diagnosis. The first set of models, presented in Chapter 5 and published in Sainz de Murieta and Rodríguez-Patón (2012b), Rodríguez-Patón et al. (2010a) and Sainz de Murieta and Rodríguez-Patón (2010), defines a type of biosensor that uses single DNA strands to encode simple rules such as "IF DNA-strand-1 AND DNA-strand-2 are present, THEN disease-B". These rules interact with input signals (DNA or RNA of any kind) to produce an output signal (also in nucleic acid form). This output signal represents a diagnosis, which can be measured with fluorescent particles (FRET techniques) or can even be a treatment administered in response to a set of symptoms. The model presented in Chapter 5, published in Rodríguez-Patón et al. (2011), can execute resolution chains over logical formulas in conjunctive normal form. Each clause of a formula is encoded in a DNA molecule.
Each proposition p is encoded by assigning it a single DNA strand, with the complementary strand encoding the proposition ¬p. Clauses are encoded by placing different propositions on the same DNA strand. The model can run Horn clause logic programs by applying multiple resolution iterations in cascade, thus implementing the function of an autonomous programmable nanodevice. This technique can also be used to solve SAT without external assistance. The model presented in Chapter 6 has been published in Sainz de Murieta and Rodríguez-Patón (2012c), and the model presented in Chapter 7 has been published in Sainz de Murieta and Rodríguez-Patón (2013c). Although they exploit different biomolecular computing methods (competitive DNA hybridization in Chapter 6 versus enzymatic reactions in Chapter 7), both models can perform Bayesian inference. They take single DNA strands as input, representing the presence or absence of a specific molecular marker (an evidence). The prior probability of a disease, together with the conditional probability of a signal (or symptom) given the disease, constitutes the knowledge base, and is encoded by combining different DNA molecules and their relative concentrations. When the input molecules interact with those of the knowledge base, two classes of DNA strands are released, whose relative proportion represents the application of Bayes' theorem: the conditional probability of the disease given the signal (or symptom). All these devices can be seen as basic building blocks that, combined in a modular way, enable the implementation of in vitro systems built from DNA sensors, able to sense and process biological signals. Automata of this kind currently hold great potential and have had great scientific impact. A perfect example was the publication of Xie et al. (2011) in Science, presenting a biomolecular diagnostic automaton able to selectively trigger apoptosis in cancer cells without harming healthy cells.
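As an in-silico illustration of the cascaded resolution just described, the following minimal Python sketch runs forward chaining over propositional Horn clauses; the rule and strand names are illustrative, and the actual devices implement these steps chemically rather than in software.

```python
# Minimal in-silico sketch of cascaded Horn clause resolution:
# forward chaining until no more rules fire. Names are illustrative;
# the thesis devices encode clauses and propositions as DNA strands.

def forward_chain(facts, rules):
    """facts: set of propositions; rules: list of (body, head) pairs."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and all(p in derived for p in body):
                derived.add(head)  # one resolution step fires
                changed = True
    return derived

# Example rule: "IF strand-1 AND strand-2 present THEN disease-B"
rules = [({"strand-1", "strand-2"}, "disease-B")]
print(forward_chain({"strand-1", "strand-2"}, rules))
# -> {'strand-1', 'strand-2', 'disease-B'}
```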
Abstract:
We present a computing model based on the DNA strand displacement technique which performs Bayesian inference. The model takes single-stranded DNA as input data, representing the presence or absence of a specific molecular signal (evidence). The program logic encodes the prior probability of a disease and the conditional probability of a signal given the disease using a set of different DNA complexes and their ratios. When the input and program molecules interact, they release a different pair of single-stranded DNA species whose relative proportion represents the application of Bayes' law: the conditional probability of the disease given the signal. The models presented in this paper can empower the application of probabilistic reasoning in genetic diagnosis in vitro.
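As a numerical illustration of the ratio the released strand pair encodes, here is a minimal Python sketch of the underlying Bayes computation; the probability values are made up, and the chemistry itself is of course not simulated.

```python
# Hedged numerical sketch (not the paper's chemistry): the released strand
# pair encodes P(disease | signal) as a relative proportion, per Bayes' law.

def posterior(prior_d, p_signal_given_d, p_signal_given_not_d):
    """P(D | S) from the prior and the two conditional likelihoods."""
    joint_d = prior_d * p_signal_given_d                 # "species X" strands
    joint_not_d = (1 - prior_d) * p_signal_given_not_d   # "species Y" strands
    return joint_d / (joint_d + joint_not_d)             # proportion X / (X + Y)

# Example: rare disease, fairly specific signal
print(posterior(prior_d=0.01, p_signal_given_d=0.9, p_signal_given_not_d=0.05))
# -> ~0.154
```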
Abstract:
This paper explains how to solve a fully connected N-city travelling salesman problem (TSP) using a genetic algorithm. A crossover operator for use in the simulation of a genetic algorithm (GA) with DNA is presented. The aim of the paper is to follow the path towards a new computational model based on DNA molecules and genetic operations. The paper addresses the problem of exponentially sized algorithms in DNA computing by using biological methods and techniques. After individual encoding and fitness evaluation, a protocol for the next step in a GA, crossover, is needed. The paper also shows how to make the GA faster via different populations of possible solutions.
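The paper's molecular crossover protocol is not reproduced here, but a standard order crossover (OX) for TSP tours, sketched below in Python, shows the kind of operation such a protocol implements; the tours and random seed are illustrative.

```python
import random

# Illustrative only: a standard order crossover (OX) for TSP permutations,
# the kind of operator a DNA-based GA implements via lab protocols.

def order_crossover(parent1, parent2, rng=random):
    n = len(parent1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = parent1[i:j + 1]            # copy a slice from parent 1
    kept = set(child[i:j + 1])
    fill = [city for city in parent2 if city not in kept]
    for k in range(n):                            # fill the rest in parent-2 order
        if child[k] is None:
            child[k] = fill.pop(0)
    return child

tour_a = [0, 1, 2, 3, 4, 5]
tour_b = [5, 4, 3, 2, 1, 0]
print(order_crossover(tour_a, tour_b, random.Random(42)))
```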
Abstract:
In this paper we propose a model for encoding data into DNA strands so that this data can be used in the simulation of a genetic algorithm based on molecular operations. DNA computing is an impressive computational model that needs algorithms to work properly and efficiently. The first problem when trying to apply an algorithm in DNA computing is how to encode the data that the algorithm will use. In a genetic algorithm the first objective must be to encode the genes, which are the main data. A concrete encoding of the genes in a single DNA strand is presented, and we discuss what this encoding is suitable for. Previous work on DNA coding defined bond-free languages, which have several properties assuring the stability of any DNA word of such a language. We prove that a bond-free language is necessary but not sufficient for a correct gene encoding.
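For illustration, the following Python sketch checks a simplified hybridization-stability constraint of the kind bond-free languages formalize; the Hamming-distance threshold and the codeword set are assumptions, not taken from the paper.

```python
# Hedged sketch of the kind of constraint bond-free languages formalize:
# no codeword should hybridize with the reverse complement of another.
# "Hybridize" is simplified here to a Hamming-distance threshold; the
# threshold and word set are illustrative, not from the paper.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(word):
    return "".join(COMPLEMENT[b] for b in reversed(word))

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def is_bond_free(words, min_dist):
    """Every pair (w, reverse_complement(x)) must differ in >= min_dist places."""
    return all(hamming(w, reverse_complement(x)) >= min_dist
               for w in words for x in words)

codewords = ["ACCTGA", "TGGACT", "CAGTCA"]  # fixed-length words over {A,C,G,T}
print(is_bond_free(codewords, min_dist=2))
```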
Abstract:
Variations in different types of genomes have been found to be responsible for a large degree of physical diversity such as appearance and susceptibility to disease. Identification of genomic variations is difficult and can be facilitated through computational analysis of DNA sequences. Newly available technologies are able to sequence billions of DNA base pairs relatively quickly. These sequences can be used to identify variations within their specific genome but must be mapped to a reference sequence first. In order to align these sequences to a reference sequence, we require mapping algorithms that make use of approximate string matching and string indexing methods. To date, few mapping algorithms have been tailored to handle the massive amounts of output generated by newly available sequencing technologies. In order to handle this large amount of data, we modified the popular mapping software BWA to run in parallel using OpenMPI. Parallel BWA matches the efficiency of multithreaded BWA functions while providing efficient parallelism for BWA functions that do not currently support multithreading. Parallel BWA shows significant wall time speedup in comparison to multithreaded BWA on high-performance computing clusters, and will thus facilitate the analysis of genome sequencing data.
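Parallel BWA parallelizes BWA internally via OpenMPI; by way of contrast, the hedged mpi4py sketch below shows a much coarser file-level decomposition of the same task. The chunk file names and the `bwa mem` invocation are assumptions about a local setup, not the paper's implementation.

```python
# Hedged sketch: file-level MPI decomposition of read mapping, shown only to
# illustrate the idea. Parallel BWA itself parallelizes inside BWA's functions.
# Chunk/reference file names are hypothetical placeholders.
import subprocess
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Assumes reads were pre-split into chunk_0.fq ... chunk_{size-1}.fq.
chunk = f"chunk_{rank}.fq"
out = f"aligned_{rank}.sam"

with open(out, "w") as sam:
    # Each rank maps its own chunk against the same indexed reference.
    subprocess.run(["bwa", "mem", "reference.fa", chunk], stdout=sam, check=True)

comm.Barrier()  # wait for all ranks before any downstream merge of SAM files
```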
Abstract:
The physics of plasmas encompasses basic problems from the universe and holds promise for diverse applications across a wide range of scientific and engineering domains, linked to many fundamental problems, both settled and still evolving. A substantial part of this domain can be described by reaction-diffusion (R-D) mechanisms involving two or more species. These can further account for the simultaneous non-linear effects of heating, diffusion and other related losses. We note that in laboratory-scale experiments, a suitable combination of these processes is of vital importance and largely decides how the net behaviour of the plasmas under consideration is investigated and computed. Plasmas are being used in the information-processing revolution, so in this technical note we consider a simple framework to discuss and pave the way for better formalisms and informatics across diverse domains of science and technology. A challenging and fascinating aspect of plasma physics is that it requires a great deal of insight in formulating the relevant design problems, which in turn requires ingenuity and flexibility in choosing a particular set of mathematical (and/or experimental) tools to implement them.
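As a concrete instance of the two-species R-D mechanisms mentioned, here is a minimal explicit finite-difference integration in Python; the kinetics and all parameter values are illustrative, not taken from the note.

```python
import numpy as np

# Generic two-species reaction-diffusion step (explicit finite differences),
# illustrating the R-D mechanisms referred to above. The reaction terms and
# parameter values are illustrative only.

def rd_step(u, v, du, dv, f, g, dt, dx):
    lap = lambda a: (np.roll(a, 1) - 2 * a + np.roll(a, -1)) / dx**2  # periodic 1-D Laplacian
    u_new = u + dt * (du * lap(u) + f(u, v))
    v_new = v + dt * (dv * lap(v) + g(u, v))
    return u_new, v_new

n = 200
u = np.ones(n) + 0.01 * np.random.rand(n)
v = 0.01 * np.random.rand(n)
f = lambda u, v: -u * v**2 + 0.04 * (1 - u)    # example: Gray-Scott-like kinetics
g = lambda u, v: u * v**2 - (0.04 + 0.06) * v
for _ in range(1000):
    u, v = rd_step(u, v, du=0.2, dv=0.1, f=f, g=g, dt=0.1, dx=1.0)
print(u.mean(), v.mean())
```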
Abstract:
The M-Coffee server is a web server that makes it possible to compute multiple sequence alignments (MSAs) by running several MSA methods and combining their output into one single model. This allows the user to simultaneously run all their methods of choice without having to arbitrarily choose one of them. The MSA is delivered along with a local estimation of its consistency with the individual MSAs it was derived from. The computation of the consensus multiple alignment is carried out using a special mode of the T-Coffee package [Notredame, Higgins and Heringa (T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 2000; 302: 205-217); Wallace, O'Sullivan, Higgins and Notredame (M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res. 2006; 34: 1692-1699)]. Given a set of sequences (DNA or proteins) in FASTA format, M-Coffee delivers a multiple alignment in the most common formats. M-Coffee is a freeware open-source package distributed under a GPL license, and it is available either as a standalone package or as a web service from www.tcoffee.org.
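For users who prefer the standalone package over the web server, something like the following Python sketch should invoke M-Coffee mode locally; the exact `-method` list is an assumption to be checked against the T-Coffee documentation, and `seqs.fasta` is a placeholder input file.

```python
import subprocess

# Hedged sketch: running M-Coffee locally through the T-Coffee package.
# The `-method` list below is an assumption to verify against the T-Coffee
# docs; `seqs.fasta` is a placeholder FASTA input.
subprocess.run(
    ["t_coffee", "seqs.fasta",
     "-method", "mafft_msa,muscle_msa,kalign_msa",  # combine several MSA methods
     "-output", "clustalw"],                         # deliver a common format
    check=True,
)
```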
Abstract:
The design of a large and reliable DNA codeword library is a key problem in DNA-based computing. DNA codes, namely sets of fixed-length edit-metric codewords over the alphabet {A, C, G, T}, satisfy certain combinatorial constraints reflecting biological and chemical restrictions on DNA strands. The primary constraints we consider are the reverse-complement constraint and the fixed GC-content constraint, as well as the basic edit-distance constraint between codewords. We focus on exploring the theory underlying DNA codes and discuss several approaches to searching for optimal DNA codes. We use Conway's lexicode algorithm and an exhaustive search algorithm to produce provably optimal DNA codes for small parameter values, and we propose a genetic algorithm to search for sub-optimal DNA codes with relatively large parameter values, whose sizes can be taken as reasonable lower bounds on the sizes of optimal DNA codes. Furthermore, we provide tables of bounds on the sizes of DNA codes with length from 1 to 9 and minimum distance from 1 to 9.
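A minimal Python sketch of two of the constraints named above (fixed GC content and a minimum edit distance between codewords) is given below; the parameter values and codewords are illustrative, not taken from the paper's tables.

```python
# Hedged sketch of two DNA code constraints: fixed GC content and a minimum
# edit distance between codewords. Values below are illustrative only.

def gc_content(word):
    return sum(b in "GC" for b in word)

def edit_distance(u, v):
    """Standard Levenshtein DP over the {A,C,G,T} alphabet (single row)."""
    dp = list(range(len(v) + 1))
    for i, a in enumerate(u, 1):
        prev, dp[0] = dp[0], i              # prev holds dp[i-1][j-1]
        for j, b in enumerate(v, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (a != b))  # substitution
    return dp[-1]

def is_valid_code(words, gc, d_min):
    return (all(gc_content(w) == gc for w in words) and
            all(edit_distance(u, v) >= d_min
                for i, u in enumerate(words) for v in words[i + 1:]))

print(is_valid_code(["ACCTGA", "TGGACT", "CAGTCA"], gc=3, d_min=4))
```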
Abstract:
DNA G-quadruplexes are among the targets being actively explored for anti-cancer therapy through inhibition by small molecules. This computational study was conducted to predict the binding strengths and orientations of a set of novel dimethyl-amino-ethyl-acridine (DACA) analogues that were designed and synthesized in our laboratory but did not diffract in synchrotron light. The crystal structure of the DNA G-quadruplex (TGGGGT)4 (PDB: 1O0K) was used as the target for their binding properties in our studies. We used both force field (FF) and QM/MM derived atomic charge schemes simultaneously to compare the predicted drug binding modes and their energetics. This study evaluates the comparative performance of fixed-point-charge-based Glide XP docking and quantum polarized ligand docking schemes. These results will provide insights into the effects of including or ignoring drug-receptor interfacial polarization events in molecular docking simulations, which in turn will aid the rational selection of computational methods at different levels of theory in future drug design programs. Plenty of molecular modelling tools and methods currently exist for modelling drug-receptor, protein-protein, or DNA-protein interactions at different levels of complexity. Yet the capacity of such tools to describe various physico-chemical properties more accurately is the next step ahead in current research. In particular, the usage of the most accurate quantum mechanics (QM) methods is severely restricted by their tedious nature. Although the usage of massively parallel supercomputing environments has resulted in a tremendous improvement in molecular mechanics (MM) calculations like molecular dynamics, QM methods are still capable of dealing with only a couple of tens to hundreds of atoms. One efficient strategy that utilizes the powers of both MM and QM is the family of QM/MM hybrid methods. Lately, attempts have been directed towards deploying several different QM methods for the betterment of force-field-based simulations, but with practical restrictions in place. One such method utilizes the inclusion of charge polarization events at the drug-receptor interface, which are not explicitly present in the MM FF.
Abstract:
Bio-molecular computing, 'computations performed by bio-molecules', is already challenging traditional approaches to computation both theoretically and technologically. Often placed within the wider context of 'bio-inspired', 'natural' or even 'unconventional' computing, the study of natural and artificial molecular computations is adding to our understanding of biology, the physical sciences and computer science well beyond the framework of existing design and implementation paradigms. In this introduction, we wish to outline the current scope of the field and assemble some basic arguments that bio-molecular computation is of central importance to computer science, the physical sciences and biology, using HOL (Higher Order Logic). HOL is used as the computational tool in our R&D work. DNA was analyzed as a chemical computing engine in our effort to develop novel formalisms for understanding molecular-scale bio-chemical computing behavior using HOL. In our view, our focus is one of the pioneering efforts in this promising domain of nano-bio scale chemical information processing dynamics.
Abstract:
We usually observe that bio-physical or bio-chemical systems, many of them based on nanoscale phenomena in different host environments and involving many particles, can often not be solved explicitly. Instead, a physicist, biologist or chemist has to rely on approximate or numerical methods. For a certain type of system, called integrable, there exist particular mathematical structures and symmetries which facilitate an exact and explicit description. Most integrable systems we come across are low-dimensional, for instance a one-dimensional chain of coupled atoms in a DNA molecular system with a particular direction, or existing as a vector in the environment. This theoretical research paper aims at bringing one of the pioneering 'reaction-diffusion' aspects of the DNA-plasma material system into an integrable lattice model approach utilizing quantized functional algebras, in order to disseminate the new developments and initiate novel computational and design paradigms.
Abstract:
This article gives an overview of the methods used in the low-level analysis of gene expression data generated using DNA microarrays. This type of experiment makes it possible to determine relative levels of nucleic acid abundance in a set of tissues or cell populations for thousands of transcripts or loci simultaneously. Careful statistical design and analysis are essential to improve the efficiency and reliability of microarray experiments throughout the data acquisition and analysis process. This includes the design of probes, the experimental design, the image analysis of scanned microarray images, the normalization of fluorescence intensities, the assessment of the quality of microarray data and the incorporation of quality information in subsequent analyses, the combination of information across arrays and across sets of experiments, the discovery and recognition of patterns in expression at the single-gene and multiple-gene levels, and the assessment of the significance of these findings, given that the data contain substantial noise and hence random features. For all of these components, access to a flexible and efficient statistical computing environment is essential.
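As one concrete example of the normalization step mentioned above, here is a minimal Python sketch of quantile normalization of intensities across arrays; the data are synthetic, and real pipelines typically work on log-scale probe intensities.

```python
import numpy as np

# Hedged sketch of one normalization approach the overview mentions:
# quantile normalization of fluorescence intensities across arrays.
# The data below are synthetic.

def quantile_normalize(x):
    """x: (probes, arrays). Force every array to share the same distribution."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)  # rank of each probe per array
    mean_of_sorted = np.sort(x, axis=0).mean(axis=1)   # reference distribution
    return mean_of_sorted[ranks]

intensities = np.array([[5.0, 4.0, 3.0],
                        [2.0, 1.0, 4.0],
                        [3.0, 4.5, 6.0],
                        [4.0, 2.0, 5.0]])
print(quantile_normalize(intensities))
```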
Abstract:
Small clusters of gallium oxide, a technologically important high-temperature ceramic, together with the interaction of nucleic acid bases with graphene and a small-diameter carbon nanotube, are the focus of the first-principles calculations in this work. A high-performance parallel computing platform was also developed to perform these calculations at Michigan Tech. The first-principles calculations are based on density functional theory, employing either the local density or gradient-corrected approximation together with plane wave and Gaussian basis sets. Bulk Ga2O3 is known to be a very good candidate for fabricating electronic devices that operate at high temperatures. To explore the properties of Ga2O3 at the nanoscale, we performed a systematic theoretical study of small polyatomic gallium oxide clusters. The calculated results find that all lowest-energy isomers of GamOn clusters are dominated by Ga-O bonds over metal-metal or oxygen-oxygen bonds. Analysis of atomic charges suggests the clusters to be highly ionic, similar to the case of bulk Ga2O3. In the study of the sequential oxidation of these clusters starting from Ga2O, it is found that the most stable isomers display up to four different backbones of constituent atoms. Furthermore, the predicted configuration of the ground state of Ga2O was recently confirmed by the experimental results of Neumark's group. Guided by the results of the cluster calculations, the performance-related challenge of computational simulations, that of producing high-performance computing platforms, was addressed. Several engineering aspects were thoroughly studied during the design, development and implementation of the high-performance parallel computing platform, rama, at Michigan Tech. In an attempt to stay true to the principles of the Beowulf revolution, the rama cluster was extensively customized to make it easy to understand and use, for administrators as well as end-users. Following the results of benchmark calculations, and to keep up with the complexity of the systems under study, rama was expanded to a total of sixty-four processors. Interest in the non-covalent interaction of DNA with carbon nanotubes has steadily increased over the past several years. This hybrid system, at the junction of the biological regime and the nanomaterials world, possesses features which make it very attractive for a wide range of applications. Using the in-house computational power available, we studied the details of the interaction of nucleic acid bases with a graphene sheet as well as a high-curvature, small-diameter carbon nanotube. The calculated trend in the binding energies strongly suggests that the polarizability of the base molecules determines the interaction strength of the nucleic acid bases with graphene. When comparing the results obtained for physisorption on the small-diameter nanotube with those from the study on graphene, the interaction strength of the nucleic acid bases is smaller for the tube. Thus, these results show that the effect of introducing curvature is to reduce the binding energy. The binding energies for the two extreme cases of negligible curvature (i.e. a flat graphene sheet) and very high curvature (i.e. a small-diameter nanotube) may be considered as upper and lower bounds. This finding represents an important step towards a better understanding of the experimentally observed sequence-dependent interaction of DNA with carbon nanotubes.
Abstract:
(1) A mathematical theory for computing the probabilities of various nucleotide configurations is developed, and the probability of obtaining the correct phylogenetic tree (model tree) from sequence data is evaluated for six phylogenetic tree-making methods (UPGMA, distance Wagner method, transformed distance method, Fitch-Margoliash's method, maximum parsimony method, and compatibility method). The number of nucleotides (m*) necessary to obtain the correct tree with a probability of 95% is estimated with special reference to the human, chimpanzee, and gorilla divergence. m* is at least 4,200, but the availability of outgroup species greatly reduces m* for all methods except UPGMA. m* increases if transitions occur more frequently than transversions, as in the case of mitochondrial DNA. (2) A new tree-making method called the neighbor-joining method is proposed. This method is applicable either to distance data or to character state data. Computer simulation has shown that the neighbor-joining method is generally better than UPGMA, Farris' method, Li's method, and the modified Farris method at recovering the true topology when distance data are used. A related method, the simultaneous partitioning method, is also discussed. (3) The maximum likelihood (ML) method for phylogeny reconstruction under the assumption of both constant and varying evolutionary rates is studied, and a new algorithm for obtaining the ML tree is presented. This method gives a tree similar to that obtained by UPGMA when a constant evolutionary rate is assumed, whereas it gives a tree similar to those obtained by the maximum parsimony method and the neighbor-joining method when a varying evolutionary rate is assumed.
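A minimal Python sketch of the neighbor-joining idea described in (2) is given below: repeatedly join the pair minimizing the Q-criterion and shrink the distance matrix. Branch-length bookkeeping is omitted for brevity, and the distance matrix is a made-up example, not data from this work.

```python
import numpy as np

# Hedged sketch of neighbor joining: join the pair (i, j) minimizing
# Q[i, j] = (n - 2) * d[i, j] - r[i] - r[j], then shrink the matrix.
# Branch lengths are omitted; the example distances are made up.

def neighbor_joining(d, names):
    d = np.asarray(d, dtype=float)
    nodes = list(names)
    while len(nodes) > 2:
        n = len(nodes)
        r = d.sum(axis=1)
        q = (n - 2) * d - r[:, None] - r[None, :]
        np.fill_diagonal(q, np.inf)
        i, j = divmod(q.argmin(), n)            # pair with minimal Q-criterion
        dk = 0.5 * (d[i] + d[j] - d[i, j])      # distances to the new node
        keep = [k for k in range(n) if k not in (i, j)]
        new_d = np.zeros((n - 1, n - 1))
        new_d[:-1, :-1] = d[np.ix_(keep, keep)]
        new_d[-1, :-1] = new_d[:-1, -1] = dk[keep]
        d = new_d
        nodes = [nodes[k] for k in keep] + [(nodes[i], nodes[j])]
    return (nodes[0], nodes[1])

dist = [[0, 5, 9, 9], [5, 0, 10, 10], [9, 10, 0, 8], [9, 10, 8, 0]]
print(neighbor_joining(dist, ["human", "chimp", "gorilla", "outgroup"]))
```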