995 resultados para Genetic representations
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
Ordered gene problems are a very common classification of optimization problems. Because of their popularity countless algorithms have been developed in an attempt to find high quality solutions to the problems. It is also common to see many different types of problems reduced to ordered gene style problems as there are many popular heuristics and metaheuristics for them due to their popularity. Multiple ordered gene problems are studied, namely, the travelling salesman problem, bin packing problem, and graph colouring problem. In addition, two bioinformatics problems not traditionally seen as ordered gene problems are studied: DNA error correction and DNA fragment assembly. These problems are studied with multiple variations and combinations of heuristics and metaheuristics with two distinct types or representations. The majority of the algorithms are built around the Recentering- Restarting Genetic Algorithm. The algorithm variations were successful on all problems studied, and particularly for the two bioinformatics problems. For DNA Error Correction multiple cases were found with 100% of the codes being corrected. The algorithm variations were also able to beat all other state-of-the-art DNA Fragment Assemblers on 13 out of 16 benchmark problem instances.
Resumo:
Analysis of the genetic changes in human tumors is often problematical because of the presence of normal stroma and the limited availability of pure tumor DNA. However, large amounts of highly reproducible “representations” of tumor and normal genomes can be made by PCR from nanogram amounts of restriction endonuclease cleaved DNA that has been ligated to oligonucleotide adaptors. We show here that representations are useful for many types of genetic analyses, including measuring relative gene copy number, loss of heterozygosity, and comparative genomic hybridization. Representations may be prepared even from sorted nuclei from fixed and archived tumor biopsies.
Resumo:
Background: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graphs representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data. Results: The structure learning task is cast as a sparse linear regression problem which is then posed as a LASSO (l1-constrained fitting) problem and solved finally by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs is assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisae cell cycle transcript profiling data sets capture known regulatory associations. In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law suggesting that its degree distributions is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification. Conclusion: A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence integrated computational – experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.
Resumo:
The noted 19th century biologist, Ernst Haeckel, put forward the idea that the growth (ontogenesis) of an organism recapitulated the history of its evolutionary development. While this idea is defunct within biology, the idea has been promoted in areas such as education (the idea of an education being the repetition of the civilizations before). In the research presented in this paper, recapitulation is used as a metaphor within computer-aided design as a way of grouping together different generations of spatial layouts. In most CAD programs, a spatial layout is represented as a series of objects (lines, or boundary representations) that stand in as walls. The relationships between spaces are not usually explicitly stated. A representation using Lindenmayer Systems (originally designed for the purpose of modelling plant morphology) is put forward as a way of representing the morphology of a spatial layout. The aim of this research is not just to describe an individual layout, but to find representations that link together lineages of development. This representation can be used in generative design as a way of creating more meaningful layouts which have particular characteristics. The use of genetic operators (mutation and crossover) is also considered, making this representation suitable for use with genetic algorithms.
Resumo:
Reducing the energy consumption of water distribution networks has never had more significance. The greatest energy savings can be obtained by carefully scheduling the operations of pumps. Schedules can be defined either implicitly, in terms of other elements of the network such as tank levels, or explicitly by specifying the time during which each pump is on/off. The traditional representation of explicit schedules is a string of binary values with each bit representing pump on/off status during a particular time interval. In this paper, we formally define and analyze two new explicit representations based on time-controlled triggers, where the maximum number of pump switches is established beforehand and the schedule may contain less switches than the maximum. In these representations, a pump schedule is divided into a series of integers with each integer representing the number of hours for which a pump is active/inactive. This reduces the number of potential schedules compared to the binary representation, and allows the algorithm to operate on the feasible region of the search space. We propose evolutionary operators for these two new representations. The new representations and their corresponding operations are compared with the two most-used representations in pump scheduling, namely, binary representation and level-controlled triggers. A detailed statistical analysis of the results indicates which parameters have the greatest effect on the performance of evolutionary algorithms. The empirical results show that an evolutionary algorithm using the proposed representations improves over the results obtained by a recent state-of-the-art Hybrid Genetic Algorithm for pump scheduling using level-controlled triggers.
Resumo:
A well-cited paper suggesting fuzzy coding as an alternative to the conventional binary, grey and floating-point representations used in genetic algorithms.
Resumo:
An experimental study aimed at assessing the influence of redundancy and neutrality on the performance of an (1+1)-ES evolution strategy modeled using Markov chains and applied to NK fitness landscapes is presented. For the study, two families of redundant binary representations, one non-neutral family which is based on linear transformations and that allows the phenotypic neighborhoods to be designed in a simple and effective way, and the neutral family based on the mathematical formulation of error control codes are used. The results indicate whether redundancy or neutrality affects more strongly the behavior of the algorithm used.
Resumo:
Hub location problem is an NP-hard problem that frequently arises in the design of transportation and distribution systems, postal delivery networks, and airline passenger flow. This work focuses on the Single Allocation Hub Location Problem (SAHLP). Genetic Algorithms (GAs) for the capacitated and uncapacitated variants of the SAHLP based on new chromosome representations and crossover operators are explored. The GAs is tested on two well-known sets of real-world problems with up to 200 nodes. The obtained results are very promising. For most of the test problems the GA obtains improved or best-known solutions and the computational time remains low. The proposed GAs can easily be extended to other variants of location problems arising in network design planning in transportation systems.
Resumo:
As the complexity of evolutionary design problems grow, so too must the quality of solutions scale to that complexity. In this research, we develop a genetic programming system with individuals encoded as tree-based generative representations to address scalability. This system is capable of multi-objective evaluation using a ranked sum scoring strategy. We examine Hornby's features and measures of modularity, reuse and hierarchy in evolutionary design problems. Experiments are carried out, using the system to generate three-dimensional forms, and analyses of feature characteristics such as modularity, reuse and hierarchy were performed. This work expands on that of Hornby's, by examining a new and more difficult problem domain. The results from these experiments show that individuals encoded with those three features performed best overall. It is also seen, that the measures of complexity conform to the results of Hornby. Moving forward with only this best performing encoding, the system was applied to the generation of three-dimensional external building architecture. One objective considered was passive solar performance, in which the system was challenged with generating forms that optimize exposure to the Sun. The results from these and other experiments satisfied the requirements. The system was shown to scale well to the architectural problems studied.
Resumo:
Distributed systems are one of the most vital components of the economy. The most prominent example is probably the internet, a constituent element of our knowledge society. During the recent years, the number of novel network types has steadily increased. Amongst others, sensor networks, distributed systems composed of tiny computational devices with scarce resources, have emerged. The further development and heterogeneous connection of such systems imposes new requirements on the software development process. Mobile and wireless networks, for instance, have to organize themselves autonomously and must be able to react to changes in the environment and to failing nodes alike. Researching new approaches for the design of distributed algorithms may lead to methods with which these requirements can be met efficiently. In this thesis, one such method is developed, tested, and discussed in respect of its practical utility. Our new design approach for distributed algorithms is based on Genetic Programming, a member of the family of evolutionary algorithms. Evolutionary algorithms are metaheuristic optimization methods which copy principles from natural evolution. They use a population of solution candidates which they try to refine step by step in order to attain optimal values for predefined objective functions. The synthesis of an algorithm with our approach starts with an analysis step in which the wanted global behavior of the distributed system is specified. From this specification, objective functions are derived which steer a Genetic Programming process where the solution candidates are distributed programs. The objective functions rate how close these programs approximate the goal behavior in multiple randomized network simulations. The evolutionary process step by step selects the most promising solution candidates and modifies and combines them with mutation and crossover operators. This way, a description of the global behavior of a distributed system is translated automatically to programs which, if executed locally on the nodes of the system, exhibit this behavior. In our work, we test six different ways for representing distributed programs, comprising adaptations and extensions of well-known Genetic Programming methods (SGP, eSGP, and LGP), one bio-inspired approach (Fraglets), and two new program representations called Rule-based Genetic Programming (RBGP, eRBGP) designed by us. We breed programs in these representations for three well-known example problems in distributed systems: election algorithms, the distributed mutual exclusion at a critical section, and the distributed computation of the greatest common divisor of a set of numbers. Synthesizing distributed programs the evolutionary way does not necessarily lead to the envisaged results. In a detailed analysis, we discuss the problematic features which make this form of Genetic Programming particularly hard. The two Rule-based Genetic Programming approaches have been developed especially in order to mitigate these difficulties. In our experiments, at least one of them (eRBGP) turned out to be a very efficient approach and in most cases, was superior to the other representations.
Resumo:
We investigate the possibility of interpreting the degeneracy of the genetic code, i.e., the feature that different codons (base triplets) of DNA are transcribed into the same amino acid, as the result of a symmetry breaking process, in the context of finite groups. In the first part of this paper, we give the complete list of all codon representations (64-dimensional irreducible representations) of simple finite groups and their satellites (central extensions and extensions by outer automorphisms). In the second part, we analyze the branching rules for the codon representations found in the first part by computational methods, using a software package for computational group theory. The final result is a complete classification of the possible schemes, based on finite simple groups, that reproduce the multiplet structure of the genetic code. (C) 2010 Elsevier Ltd. All rights reserved.
The Zebrafish Information Network (ZFIN): a resource for genetic, genomic and developmental research
Resumo:
The Zebrafish Information Network, ZFIN, is a WWW community resource of zebrafish genetic, genomic and developmental research information (http://zfin.org). ZFIN provides an anatomical atlas and dictionary, developmental staging criteria, research methods, pathology information and a link to the ZFIN relational database (http://zfin.org/ZFIN/). The database, built on a relational, object-oriented model, provides integrated information about mutants, genes, genetic markers, mapping panels, publications and contact information for the zebrafish research community. The database is populated with curated published data, user submitted data and large dataset uploads. A broad range of data types including text, images, graphical representations and genetic maps supports the data. ZFIN incorporates links to other genomic resources that provide sequence and ortholog data. Zebrafish nomenclature guidelines and an automated registration mechanism for new names are provided. Extensive usability testing has resulted in an easy to learn and use forms interface with complex searching capabilities.
Resumo:
The ability to carry out high-resolution genetic mapping at high throughput in the mouse is a critical rate-limiting step in the generation of genetically anchored contigs in physical mapping projects and the mapping of genetic loci for complex traits. To address this need, we have developed an efficient, high-resolution, large-scale genome mapping system. This system is based on the identification of polymorphic DNA sites between mouse strains by using interspersed repetitive sequence (IRS) PCR. Individual cloned IRS PCR products are hybridized to a DNA array of IRS PCR products derived from the DNA of individual mice segregating DNA sequences from the two parent strains. Since gel electrophoresis is not required, large numbers of samples can be genotyped in parallel. By using this approach, we have mapped > 450 polymorphic probes with filters containing the DNA of up to 517 backcross mice, potentially allowing resolution of 0.14 centimorgan. This approach also carries the potential for a high degree of efficiency in the integration of physical and genetic maps, since pooled DNAs representing libraries of yeast artificial chromosomes or other physical representations of the mouse genome can be addressed by hybridization of filter representations of the IRS PCR products of such libraries.
Resumo:
Predictive testing is one of the new genetic technologies which, in conjunction with developing fields such as pharmacogenomics, promises many benefits for preventive and population health. Understanding how individuals appraise and make genetic test decisions is increasingly relevant as the technology expands. Lay understandings of genetic risk and test decision-making, located within holistic life frameworks including family or kin relationships, may vary considerably from clinical representations of these phenomena. The predictive test for Huntington's disease (HD), whilst specific to a single-gene, serious, mature-onset but currently untreatable disorder, is regarded as a model in this context. This paper reports upon a qualitative Australian study which investigated predictive test decision-making by individuals at risk for HD, the contexts of their decisions and the appraisals which underpinned them. In-depth interviews were conducted in Australia with 16 individuals at 50% risk for HD, with variation across testing decisions, gender, age and selected characteristics. Findings suggested predictive testing was regarded as a significant life decision with important implications for self and others, while the right not to know genetic status was staunchly and unanimously defended. Multiple contexts of reference were identified within which test decisions were located, including intra- and inter-personal frameworks, family history and experience of HID, and temporality. Participants used two main criteria in appraising test options: perceived value of, or need for the test information, for self and/or significant others, and degree to which such information could be tolerated and managed, short and long-term, by self and/or others. Selected moral and ethical considerations involved in decision-making are examined, as well as the clinical and socio-political contexts in which predictive testing is located. The paper argues that psychosocial vulnerabilities generated by the availability of testing technologies and exacerbated by policy imperatives towards individual responsibility and self-governance should be addressed at broader societal levels. (C) 2003 Elsevier Science Ltd. All rights reserved.