16 resultados para Non-dominated sorting genetic algorithms

em Brock University, Canada


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Hub location problem is an NP-hard problem that frequently arises in the design of transportation and distribution systems, postal delivery networks, and airline passenger flow. This work focuses on the Single Allocation Hub Location Problem (SAHLP). Genetic Algorithms (GAs) for the capacitated and uncapacitated variants of the SAHLP based on new chromosome representations and crossover operators are explored. The GAs is tested on two well-known sets of real-world problems with up to 200 nodes. The obtained results are very promising. For most of the test problems the GA obtains improved or best-known solutions and the computational time remains low. The proposed GAs can easily be extended to other variants of location problems arising in network design planning in transportation systems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Hub Location Problems play vital economic roles in transportation and telecommunication networks where goods or people must be efficiently transferred from an origin to a destination point whilst direct origin-destination links are impractical. This work investigates the single allocation hub location problem, and proposes a genetic algorithm (GA) approach for it. The effectiveness of using a single-objective criterion measure for the problem is first explored. Next, a multi-objective GA employing various fitness evaluation strategies such as Pareto ranking, sum of ranks, and weighted sum strategies is presented. The effectiveness of the multi-objective GA is shown by comparison with an Integer Programming strategy, the only other multi-objective approach found in the literature for this problem. Lastly, two new crossover operators are proposed and an empirical study is done using small to large problem instances of the Civil Aeronautics Board (CAB) and Australian Post (AP) data sets.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Understanding the relationship between genetic diseases and the genes associated with them is an important problem regarding human health. The vast amount of data created from a large number of high-throughput experiments performed in the last few years has resulted in an unprecedented growth in computational methods to tackle the disease gene association problem. Nowadays, it is clear that a genetic disease is not a consequence of a defect in a single gene. Instead, the disease phenotype is a reflection of various genetic components interacting in a complex network. In fact, genetic diseases, like any other phenotype, occur as a result of various genes working in sync with each other in a single or several biological module(s). Using a genetic algorithm, our method tries to evolve communities containing the set of potential disease genes likely to be involved in a given genetic disease. Having a set of known disease genes, we first obtain a protein-protein interaction (PPI) network containing all the known disease genes. All the other genes inside the procured PPI network are then considered as candidate disease genes as they lie in the vicinity of the known disease genes in the network. Our method attempts to find communities of potential disease genes strongly working with one another and with the set of known disease genes. As a proof of concept, we tested our approach on 16 breast cancer genes and 15 Parkinson's Disease genes. We obtained comparable or better results than CIPHER, ENDEAVOUR and GPEC, three of the most reliable and frequently used disease-gene ranking frameworks.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Complex networks are systems of entities that are interconnected through meaningful relationships. The result of the relations between entities forms a structure that has a statistical complexity that is not formed by random chance. In the study of complex networks, many graph models have been proposed to model the behaviours observed. However, constructing graph models manually is tedious and problematic. Many of the models proposed in the literature have been cited as having inaccuracies with respect to the complex networks they represent. However, recently, an approach that automates the inference of graph models was proposed by Bailey [10] The proposed methodology employs genetic programming (GP) to produce graph models that approximate various properties of an exemplary graph of a targeted complex network. However, there is a great deal already known about complex networks, in general, and often specific knowledge is held about the network being modelled. The knowledge, albeit incomplete, is important in constructing a graph model. However it is difficult to incorporate such knowledge using existing GP techniques. Thus, this thesis proposes a novel GP system which can incorporate incomplete expert knowledge that assists in the evolution of a graph model. Inspired by existing graph models, an abstract graph model was developed to serve as an embryo for inferring graph models of some complex networks. The GP system and abstract model were used to reproduce well-known graph models. The results indicated that the system was able to evolve models that produced networks that had structural similarities to the networks generated by the respective target models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Feature selection plays an important role in knowledge discovery and data mining nowadays. In traditional rough set theory, feature selection using reduct - the minimal discerning set of attributes - is an important area. Nevertheless, the original definition of a reduct is restrictive, so in one of the previous research it was proposed to take into account not only the horizontal reduction of information by feature selection, but also a vertical reduction considering suitable subsets of the original set of objects. Following the work mentioned above, a new approach to generate bireducts using a multi--objective genetic algorithm was proposed. Although the genetic algorithms were used to calculate reduct in some previous works, we did not find any work where genetic algorithms were adopted to calculate bireducts. Compared to the works done before in this area, the proposed method has less randomness in generating bireducts. The genetic algorithm system estimated a quality of each bireduct by values of two objective functions as evolution progresses, so consequently a set of bireducts with optimized values of these objectives was obtained. Different fitness evaluation methods and genetic operators, such as crossover and mutation, were applied and the prediction accuracies were compared. Five datasets were used to test the proposed method and two datasets were used to perform a comparison study. Statistical analysis using the one-way ANOVA test was performed to determine the significant difference between the results. The experiment showed that the proposed method was able to reduce the number of bireducts necessary in order to receive a good prediction accuracy. Also, the influence of different genetic operators and fitness evaluation strategies on the prediction accuracy was analyzed. It was shown that the prediction accuracies of the proposed method are comparable with the best results in machine learning literature, and some of them outperformed it.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This thesis focuses on developing an evolutionary art system using genetic programming. The main goal is to produce new forms of evolutionary art that filter existing images into new non-photorealistic (NPR) styles, by obtaining images that look like traditional media such as watercolor or pencil, as well as brand new effects. The approach permits GP to generate creative forms of NPR results. The GP language is extended with different techniques and methods inspired from NPR research such as colour mixing expressions, image processing filters and painting algorithm. Colour mixing is a major new contribution, as it enables many familiar and innovative NPR effects to arise. Another major innovation is that many GP functions process the canvas (rendered image), while is dynamically changing. Automatic fitness scoring uses aesthetic evaluation models and statistical analysis, and multi-objective fitness evaluation is used. Results showed a variety of NPR effects, as well as new, creative possibilities.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Complex networks can arise naturally and spontaneously from all things that act as a part of a larger system. From the patterns of socialization between people to the way biological systems organize themselves, complex networks are ubiquitous, but are currently poorly understood. A number of algorithms, designed by humans, have been proposed to describe the organizational behaviour of real-world networks. Consequently, breakthroughs in genetics, medicine, epidemiology, neuroscience, telecommunications and the social sciences have recently resulted. The algorithms, called graph models, represent significant human effort. Deriving accurate graph models is non-trivial, time-intensive, challenging and may only yield useful results for very specific phenomena. An automated approach can greatly reduce the human effort required and if effective, provide a valuable tool for understanding the large decentralized systems of interrelated things around us. To the best of the author's knowledge this thesis proposes the first method for the automatic inference of graph models for complex networks with varied properties, with and without community structure. Furthermore, to the best of the author's knowledge it is the first application of genetic programming for the automatic inference of graph models. The system and methodology was tested against benchmark data, and was shown to be capable of reproducing close approximations to well-known algorithms designed by humans. Furthermore, when used to infer a model for real biological data the resulting model was more representative than models currently used in the literature.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The curse of dimensionality is a major problem in the fields of machine learning, data mining and knowledge discovery. Exhaustive search for the most optimal subset of relevant features from a high dimensional dataset is NP hard. Sub–optimal population based stochastic algorithms such as GP and GA are good choices for searching through large search spaces, and are usually more feasible than exhaustive and deterministic search algorithms. On the other hand, population based stochastic algorithms often suffer from premature convergence on mediocre sub–optimal solutions. The Age Layered Population Structure (ALPS) is a novel metaheuristic for overcoming the problem of premature convergence in evolutionary algorithms, and for improving search in the fitness landscape. The ALPS paradigm uses an age–measure to control breeding and competition between individuals in the population. This thesis uses a modification of the ALPS GP strategy called Feature Selection ALPS (FSALPS) for feature subset selection and classification of varied supervised learning tasks. FSALPS uses a novel frequency count system to rank features in the GP population based on evolved feature frequencies. The ranked features are translated into probabilities, which are used to control evolutionary processes such as terminal–symbol selection for the construction of GP trees/sub-trees. The FSALPS metaheuristic continuously refines the feature subset selection process whiles simultaneously evolving efficient classifiers through a non–converging evolutionary process that favors selection of features with high discrimination of class labels. We investigated and compared the performance of canonical GP, ALPS and FSALPS on high–dimensional benchmark classification datasets, including a hyperspectral image. Using Tukey’s HSD ANOVA test at a 95% confidence interval, ALPS and FSALPS dominated canonical GP in evolving smaller but efficient trees with less bloat expressions. FSALPS significantly outperformed canonical GP and ALPS and some reported feature selection strategies in related literature on dimensionality reduction.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The curse of dimensionality is a major problem in the fields of machine learning, data mining and knowledge discovery. Exhaustive search for the most optimal subset of relevant features from a high dimensional dataset is NP hard. Sub–optimal population based stochastic algorithms such as GP and GA are good choices for searching through large search spaces, and are usually more feasible than exhaustive and determinis- tic search algorithms. On the other hand, population based stochastic algorithms often suffer from premature convergence on mediocre sub–optimal solutions. The Age Layered Population Structure (ALPS) is a novel meta–heuristic for overcoming the problem of premature convergence in evolutionary algorithms, and for improving search in the fitness landscape. The ALPS paradigm uses an age–measure to control breeding and competition between individuals in the population. This thesis uses a modification of the ALPS GP strategy called Feature Selection ALPS (FSALPS) for feature subset selection and classification of varied supervised learning tasks. FSALPS uses a novel frequency count system to rank features in the GP population based on evolved feature frequencies. The ranked features are translated into probabilities, which are used to control evolutionary processes such as terminal–symbol selection for the construction of GP trees/sub-trees. The FSALPS meta–heuristic continuously refines the feature subset selection process whiles simultaneously evolving efficient classifiers through a non–converging evolutionary process that favors selection of features with high discrimination of class labels. We investigated and compared the performance of canonical GP, ALPS and FSALPS on high–dimensional benchmark classification datasets, including a hyperspectral image. Using Tukey’s HSD ANOVA test at a 95% confidence interval, ALPS and FSALPS dominated canonical GP in evolving smaller but efficient trees with less bloat expressions. FSALPS significantly outperformed canonical GP and ALPS and some reported feature selection strategies in related literature on dimensionality reduction.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A feature-based fitness function is applied in a genetic programming system to synthesize stochastic gene regulatory network models whose behaviour is defined by a time course of protein expression levels. Typically, when targeting time series data, the fitness function is based on a sum-of-errors involving the values of the fluctuating signal. While this approach is successful in many instances, its performance can deteriorate in the presence of noise. This thesis explores a fitness measure determined from a set of statistical features characterizing the time series' sequence of values, rather than the actual values themselves. Through a series of experiments involving symbolic regression with added noise and gene regulatory network models based on the stochastic 'if-calculus, it is shown to successfully target oscillating and non-oscillating signals. This practical and versatile fitness function offers an alternate approach, worthy of consideration for use in algorithms that evaluate noisy or stochastic behaviour.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The (n, k)-star interconnection network was proposed in 1995 as an attractive alternative to the n-star topology in parallel computation. The (n, k )-star has significant advantages over the n-star which itself was proposed as an attractive alternative to the popular hypercube. The major advantage of the (n, k )-star network is its scalability, which makes it more flexible than the n-star as an interconnection network. In this thesis, we will focus on finding graph theoretical properties of the (n, k )-star as well as developing parallel algorithms that run on this network. The basic topological properties of the (n, k )-star are first studied. These are useful since they can be used to develop efficient algorithms on this network. We then study the (n, k )-star network from algorithmic point of view. Specifically, we will investigate both fundamental and application algorithms for basic communication, prefix computation, and sorting, etc. A literature review of the state-of-the-art in relation to the (n, k )-star network as well as some open problems in this area are also provided.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The (n, k)-arrangement interconnection topology was first introduced in 1992. The (n, k )-arrangement graph is a class of generalized star graphs. Compared with the well known n-star, the (n, k )-arrangement graph is more flexible in degree and diameter. However, there are few algorithms designed for the (n, k)-arrangement graph up to present. In this thesis, we will focus on finding graph theoretical properties of the (n, k)- arrangement graph and developing parallel algorithms that run on this network. The topological properties of the arrangement graph are first studied. They include the cyclic properties. We then study the problems of communication: broadcasting and routing. Embedding problems are also studied later on. These are very useful to develop efficient algorithms on this network. We then study the (n, k )-arrangement network from the algorithmic point of view. Specifically, we will investigate both fundamental and application algorithms such as prefix sums computation, sorting, merging and basic geometry computation: finding convex hull on the (n, k )-arrangement graph. A literature review of the state-of-the-art in relation to the (n, k)-arrangement network is also provided, as well as some open problems in this area.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Understanding the machinery of gene regulation to control gene expression has been one of the main focuses of bioinformaticians for years. We use a multi-objective genetic algorithm to evolve a specialized version of side effect machines for degenerate motif discovery. We compare some suggested objectives for the motifs they find, test different multi-objective scoring schemes and probabilistic models for the background sequence models and report our results on a synthetic dataset and some biological benchmarking suites. We conclude with a comparison of our algorithm with some widely used motif discovery algorithms in the literature and suggest future directions for research in this area.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

There are many known taste receptors specific to each taste attribute. This thesis examines the relationship between single nucleotide polymorphisms (SNPs) and copy number variations (CNVs) in known taste and taste pathway receptors TAS2R38, Gustin, and TRPM5 and for PROP (6-n-propylthiouracil) taster status (PTS), thermal taster status (TTS), and orosensory sensation intensity ratings. PTS is a proxy for general taste responsiveness, and the ability to taste PROP classifies individuals into three phenotypes: super (PST), medium (PMT), and non-tasters (PNT). Another taste phenotype, also serving as a proxy for general taste responsiveness, is TTS, classifying individuals as thermal tasters (TTs) or thermal non-tasters (TnTs). DNA extractions from buccal cells obtained from 60 individuals were performed and analysis of TAS2R38, Gustin, and TRPM5 variations were conducted through Polymerase Chain Reaction (PCR), sequencing for SNPs, and upQMPSF for CNV analysis of TRPM5. Among the SNPs and CNVs studied, only TAS2R38 was found to be significantly associated with PTS and intensity ratings for sweet, bitter, and sour taste as well as astringency. However, not all PROP phenotypic differences can be explained by the variations at these three SNP sites in TAS2R38, suggesting the involvement of additional genes. No association was found between TTS and TAS2R38 or Gustin, confirming that PTS and TTS are not genetically associated. The examined TRPM5 SNPs and CNVs did not correlate with TTS. Therefore, further research is necessary into other factors contributing to PTS and TTS.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Ordered gene problems are a very common classification of optimization problems. Because of their popularity countless algorithms have been developed in an attempt to find high quality solutions to the problems. It is also common to see many different types of problems reduced to ordered gene style problems as there are many popular heuristics and metaheuristics for them due to their popularity. Multiple ordered gene problems are studied, namely, the travelling salesman problem, bin packing problem, and graph colouring problem. In addition, two bioinformatics problems not traditionally seen as ordered gene problems are studied: DNA error correction and DNA fragment assembly. These problems are studied with multiple variations and combinations of heuristics and metaheuristics with two distinct types or representations. The majority of the algorithms are built around the Recentering- Restarting Genetic Algorithm. The algorithm variations were successful on all problems studied, and particularly for the two bioinformatics problems. For DNA Error Correction multiple cases were found with 100% of the codes being corrected. The algorithm variations were also able to beat all other state-of-the-art DNA Fragment Assemblers on 13 out of 16 benchmark problem instances.