867 results for Genetic Algorithm for Rule-Set Prediction (GARP)


Relevance: 100.00%

Abstract:

An extensive set of machine learning and pattern classification techniques trained and tested on the KDD dataset has failed to detect most of the user-to-root attacks. This paper provides an approach for mitigating the negative aspects of this dataset that lead to low detection rates. A genetic algorithm is employed to build rules for detecting various types of attacks. Rules are formed from the dataset features identified as the most important ones for each attack type. In this way we introduce a high level of generality, achieving high detection rates while also greatly reducing the system training time. The decisions of the user-to-root rules are then re-checked against the rules that detect other types of attacks, which decreases the false-positive rate. The model was verified on KDD 99, demonstrating higher detection rates than those reported by the state of the art while maintaining a low false-positive rate.
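
A minimal sketch of the kind of GA-evolved detection rules the abstract describes, assuming interval rules over a handful of KDD features (the feature subset, normalized ranges, and fitness weighting below are illustrative assumptions, not the paper's actual choices):

```python
import random

# Hypothetical: each rule is a set of (low, high) bounds over a few
# normalized KDD features; a connection matching every bound is flagged.
FEATURES = ["duration", "num_failed_logins", "root_shell"]

def random_rule():
    return {f: tuple(sorted(random.random() for _ in range(2))) for f in FEATURES}

def matches(rule, record):
    return all(lo <= record[f] <= hi for f, (lo, hi) in rule.items())

def fitness(rule, attacks, normals):
    tp = sum(matches(rule, r) for r in attacks)   # detections
    fp = sum(matches(rule, r) for r in normals)   # false alarms
    return tp / max(len(attacks), 1) - fp / max(len(normals), 1)

def evolve(attacks, normals, pop_size=30, generations=50):
    pop = [random_rule() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda r: fitness(r, attacks, normals), reverse=True)
        parents, children = pop[: pop_size // 2], []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            child = {f: random.choice((a[f], b[f])) for f in FEATURES}  # crossover
            if random.random() < 0.2:                                   # mutation
                f = random.choice(FEATURES)
                child[f] = tuple(sorted(random.random() for _ in range(2)))
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda r: fitness(r, attacks, normals))
```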

Relevance: 100.00%

Abstract:

A common problem in many data-based modelling algorithms, such as associative memory networks, is the curse of dimensionality. In this paper, a new two-stage neurofuzzy system design and construction algorithm (NeuDeC) for nonlinear dynamical processes is introduced to tackle this problem effectively. A new, simple preprocessing method is first derived and applied to reduce the rule base, followed by a fine model detection process on the reduced rule set using forward orthogonal least squares model structure detection. In both stages, new A-optimality experimental-design-based criteria were used. In the preprocessing stage, a lower bound of the A-optimality design criterion is derived and applied as a subset selection metric; in the later stage, the A-optimality design criterion is incorporated into a new composite cost function that minimises model prediction error as well as penalising the model parameter variance. NeuDeC yields unbiased model parameters with low parameter variance and the additional benefit of a parsimonious model structure. Numerical examples are included to demonstrate the effectiveness of this new modelling approach for high-dimensional inputs.
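
The A-optimality criterion referenced above is conventionally the trace of the inverse information matrix; a small sketch of using it as a subset-selection metric, with a made-up candidate regressor matrix:

```python
from itertools import combinations
import numpy as np

def a_optimality(X):
    """A-optimality cost: trace of (X^T X)^{-1}, proportional to the total
    variance of the least-squares parameter estimates. Smaller is better."""
    return np.trace(np.linalg.inv(X.T @ X))

# Illustrative use as a subset-selection metric: choose the 2-regressor
# subset with the lowest cost (the candidate matrix here is random).
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 4))
best = min(combinations(range(4), 2),
           key=lambda cols: a_optimality(X[:, list(cols)]))
print("selected regressor subset:", best)
```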

Relevance: 100.00%

Abstract:

The Capacitated Centered Clustering Problem (CCCP) consists of defining a set of p groups with minimum dissimilarity on a network with n points. Demand values are associated with each point and each group has a demand capacity. The problem is well known to be NP-hard and has many practical applications. In this paper, the hybrid method Clustering Search (CS) is implemented to solve the CCCP. This method identifies promising regions of the search space by generating solutions with a metaheuristic, such as a Genetic Algorithm, and grouping them into clusters that are then explored further with local search heuristics. Computational results on instances available in the literature are presented to demonstrate the efficacy of CS. (C) 2010 Elsevier Ltd. All rights reserved.
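
A skeleton of the Clustering Search loop as described, assuming a caller-supplied metaheuristic generator, distance, objective, and local search (the assimilation and activation rules below are simplified assumptions):

```python
def clustering_search(generate, distance, local_search, objective,
                      iterations=100, radius=1.0, activation_threshold=3):
    """Skeleton of CS: solutions produced by a metaheuristic are grouped
    into clusters; a cluster that accumulates enough solutions marks a
    promising region and is intensified with local search."""
    clusters = []   # each cluster: {"center": solution, "count": int}
    best = None
    for _ in range(iterations):
        sol = generate()                      # e.g. one GA offspring
        near = min(clusters, key=lambda c: distance(c["center"], sol),
                   default=None)
        if near is not None and distance(near["center"], sol) <= radius:
            near["center"], near["count"] = sol, near["count"] + 1  # assimilate
            if near["count"] >= activation_threshold:               # activate
                near["center"], near["count"] = local_search(sol), 0
            sol = near["center"]
        else:
            clusters.append({"center": sol, "count": 1})
        if best is None or objective(sol) < objective(best):
            best = sol
    return best
```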

Relevance: 100.00%

Abstract:

In this work, genetic algorithm concepts, together with a rotamer library for protein side chains and an implicit solvation potential, are used to optimize the tertiary structure of peptides. We start from the known PDB structure of the backbone, which is kept fixed while the side chains are allowed to adopt the conformations present in the rotamer library; a backbone-independent rotamer library and an implicit solvation potential were used. The structure of Mastoparan-X was predicted using several force fields of growing complexity: we started with a field where the only interaction present was Lennard-Jones, then added the Coulomb term, and finally considered solvation effects through a term proportional to the solvent-accessible area. This paper presents good results obtained using the potential with the solvation term and the rotamer library. Hence, the algorithm presented here (called YODA) can be a good tool for the prediction problem. (c) 2007 Elsevier B.V. All rights reserved.
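
A sketch of the kind of energy evaluation described, combining Lennard-Jones, Coulomb, and a term proportional to the solvent-accessible area (all parameters and the dielectric treatment are generic placeholders, not YODA's actual force field):

```python
def lennard_jones(r, epsilon=0.2, sigma=3.4):
    """12-6 LJ pair energy (generic parameters; kcal/mol, angstroms)."""
    s = (sigma / r) ** 6
    return 4.0 * epsilon * (s * s - s)

def coulomb(q1, q2, r, eps_r=4.0):
    """Coulomb pair energy with a simple constant dielectric; 332.0636
    converts e^2/angstrom to kcal/mol."""
    return 332.0636 * q1 * q2 / (eps_r * r)

def solvation(sasa, sigma_solv=0.005):
    """Implicit solvation: energy proportional to solvent-accessible area."""
    return sigma_solv * sasa

def conformation_energy(pairs, sasa):
    """pairs: iterable of (q1, q2, r) over non-bonded atom pairs. The GA
    searches rotamer combinations that minimize this total energy."""
    e = sum(lennard_jones(r) + coulomb(q1, q2, r) for q1, q2, r in pairs)
    return e + solvation(sasa)
```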

Relevance: 100.00%

Abstract:

This paper studies the use of different population structures in a Genetic Algorithm (GA) applied to lot sizing and scheduling problems. The population approaches are of two types: single-population and multi-population. The first type has a non-structured single population. The multi-population type comprises non-structured and structured populations, the latter organized in binary and ternary trees. Each population approach is tested on lot sizing and scheduling problems found in soft drink companies. These problems have two interdependent levels, with decisions concerning raw material storage and soft drink bottling. The challenge is to determine simultaneously the lot sizing and scheduling of raw materials in tanks and of products on lines. Computational results are reported, allowing the best population structure to be determined for the set of problem instances evaluated. Copyright 2008 ACM.
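
A sketch of one way to realize the ternary-tree structured population, with individuals stored in an array-encoded tree (the recombination and replacement rules here are illustrative assumptions):

```python
def tree_step(pop, fitness, crossover, mutate):
    """One generation over an array-encoded ternary tree: the children of
    node i sit at 3*i + 1, 3*i + 2, 3*i + 3. Each internal node recombines
    with its fittest child; the offspring replaces the worse of the pair
    if it improves on it (the replacement rule is an assumption)."""
    for i in range(len(pop)):
        children = [c for c in (3 * i + 1, 3 * i + 2, 3 * i + 3) if c < len(pop)]
        if not children:
            continue                          # leaf node
        j = max(children, key=lambda c: fitness(pop[c]))
        child = mutate(crossover(pop[i], pop[j]))
        worse = i if fitness(pop[i]) < fitness(pop[j]) else j
        if fitness(child) > fitness(pop[worse]):
            pop[worse] = child
```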

Relevance: 100.00%

Abstract:

This paper proposes a fuzzy classification system for the risk of weed infestation in agricultural zones, taking the variability of weeds into account. The inputs of the system are features of the infestation, extracted from maps estimated by kriging for weed seed production and weed coverage, and from the competitiveness inferred for narrow- and broad-leaved weeds. Furthermore, a Bayesian network classifier is used to extract rules from data, which are compared to the fuzzy rule set obtained from specialist knowledge. Results for risk inference in a maize crop field are presented and evaluated against the estimated yield loss. © 2009 IEEE.
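
A toy Mamdani-style version of such a fuzzy risk classifier (the membership functions and the two-rule base below are invented for illustration; the paper's rules come from specialist knowledge and a Bayesian network classifier):

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def risk_of_infestation(seed_production, coverage):
    """Two illustrative rules over normalized inputs in [0, 1]:
      R1: IF seed production is high AND coverage is high THEN risk is high
      R2: IF seed production is low  OR  coverage is low  THEN risk is low"""
    high_seed = tri(seed_production, 0.4, 1.0, 1.6)
    low_seed = tri(seed_production, -0.6, 0.0, 0.6)
    high_cov = tri(coverage, 0.4, 1.0, 1.6)
    low_cov = tri(coverage, -0.6, 0.0, 0.6)
    w_high = min(high_seed, high_cov)   # fuzzy AND -> min
    w_low = max(low_seed, low_cov)      # fuzzy OR  -> max
    if w_high + w_low == 0:
        return 0.5                      # no rule fires; neutral (assumption)
    # Weighted-average defuzzification with risk prototypes 1.0 and 0.0.
    return w_high / (w_high + w_low)
```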

Relevance: 100.00%

Abstract:

This work develops two approaches based on fuzzy set theory to solve a class of fuzzy mathematical optimization problems with uncertainties in the objective function and in the set of constraints. The first approach is an adaptation of an iterative method that obtains cut levels and then maximizes the membership function of the fuzzy decision using the bound search method. The second is a metaheuristic approach that adapts a standard genetic algorithm to operate on fuzzy numbers. Both approaches use a decision criterion called the satisfaction level, which reaches the best solution in the uncertain environment. Selected examples from the literature are presented to compare and validate the efficiency of the methods addressed, with emphasis on a fuzzy optimization problem arising in import-export companies in the south of Spain. © 2012 Brazilian Operations Research Society.
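
One way to read the α-cut and satisfaction-level machinery, sketched for triangular fuzzy numbers (the representation and the maximin decision rule are assumptions consistent with classic fuzzy decision making):

```python
def alpha_cut(tri_number, alpha):
    """Alpha-cut interval [lo, hi] of a triangular fuzzy number (a, m, b)."""
    a, m, b = tri_number
    return (a + alpha * (m - a), b - alpha * (b - m))

def best_by_satisfaction(candidates, objective_mu, constraint_mu):
    """Pick the candidate maximizing min(membership in the fuzzy objective,
    membership in the fuzzy constraints) -- the classic maximin rule for
    fuzzy decision making."""
    return max(candidates,
               key=lambda x: min(objective_mu(x), constraint_mu(x)))
```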

Relevance: 100.00%

Abstract:

Connectivity is the basic requirement for the proper operation of any wireless network. In a mobile wireless sensor network, connectivity problems are a challenge for applications and protocols, as links may go up and down frequently. In these scenarios, knowledge of a node's remaining connectivity time could both improve the performance of protocols (e.g. handoff mechanisms) and save potentially scarce node resources (CPU, bandwidth, and energy) by preventing unfruitful transmissions. This paper provides a solution called the Genetic Machine Learning Algorithm (GMLA) to forecast the remaining connectivity time in mobile environments. It combines Classifier Systems with a Markov chain model of RF link quality. The main advantage of the evolutionary approach is that the Markov model parameters can be discovered on the fly, making it possible to cope with unknown environments and mobility patterns. Simulation results show that the proposal is a very suitable solution, as it outperforms similar approaches.
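
Given a fitted Markov model of link quality, the remaining connectivity time follows from standard absorbing-chain algebra; a sketch (the 3-state model and its transition values are assumed; in GMLA these are exactly the parameters the evolutionary layer discovers on the fly):

```python
import numpy as np

def expected_time_to_disconnect(P, state, down_state):
    """Expected number of steps before first hitting `down_state` in a
    Markov chain with transition matrix P, starting from `state`.
    Solves (I - Q) t = 1 over the transient (non-down) states."""
    n = P.shape[0]
    keep = [i for i in range(n) if i != down_state]
    Q = P[np.ix_(keep, keep)]
    t = np.linalg.solve(np.eye(len(keep)) - Q, np.ones(len(keep)))
    return t[keep.index(state)]

# Illustrative 3-state link model: good, weak, down (values assumed).
P = np.array([[0.90, 0.09, 0.01],
              [0.30, 0.60, 0.10],
              [0.00, 0.00, 1.00]])
print(expected_time_to_disconnect(P, state=0, down_state=2))
```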

Relevance: 100.00%

Abstract:

This paper addresses the m-machine no-wait flow shop problem in which the set-up time of a job is separated from its processing time. The performance measure considered is the total flowtime. A new hybrid metaheuristic, Genetic Algorithm-Cluster Search, is proposed to solve the scheduling problem. The performance of the proposed method is evaluated and the results are compared with the best method reported in the literature. Experimental tests show the superiority of the new method on the test problem set with regard to solution quality. (c) 2012 Elsevier Ltd. All rights reserved.
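
To make the objective concrete: a sketch of evaluating total flowtime for a job sequence in a no-wait flow shop with separated setup times (the anticipatory-setup convention assumed here is one common variant):

```python
def total_flowtime(sequence, p, s):
    """Total flowtime of `sequence` in an m-machine no-wait flow shop.
    p[j][k]: processing time of job j on machine k.
    s[j][k]: setup time before job j on machine k, assumed anticipatory:
    it may run while the job is on earlier machines, but only after the
    previous job has left machine k."""
    m = len(p[sequence[0]])
    ready = [0.0] * m          # time each machine becomes free
    flowtime = 0.0
    for j in sequence:
        # Earliest start on machine 1 such that the job never waits
        # downstream: start + arrival_offset_k >= ready[k] + s[j][k].
        start, offset = 0.0, 0.0
        for k in range(m):
            start = max(start, ready[k] + s[j][k] - offset)
            offset += p[j][k]
        completion = start + offset
        t = start
        for k in range(m):     # update machine release times
            t += p[j][k]
            ready[k] = t
        flowtime += completion
    return flowtime

# Tiny 2-job, 2-machine example with assumed data:
p = {0: [3, 2], 1: [2, 4]}
s = {0: [1, 1], 1: [2, 1]}
print(total_flowtime([0, 1], p, s))  # -> 18.0
```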

Relevance: 100.00%

Abstract:

The problem of optimal design of multi-gravity-assist space trajectories with a free number of deep space maneuvers (MGADSM) poses multi-modal cost functions. In the general form of the problem, the number of design variables is solution dependent. To handle global optimization problems where the number of design variables varies from one solution to another, two novel genetic-based techniques are introduced: the hidden genes genetic algorithm (HGGA) and the dynamic-size multiple population genetic algorithm (DSMPGA). In HGGA, a fixed length for the design variables is assigned to all solutions. The independent variables of each solution are divided into effective and ineffective (hidden) genes. Hidden genes are excluded from cost function evaluations, while full-length solutions undergo standard genetic operations. In DSMPGA, sub-populations of fixed-size design spaces are randomly initialized. Standard genetic operations are carried out for a stage of generations. A new population is then created by reproduction from all members based on their relative fitness. The resulting sub-populations have sizes different from their initial sizes. The process repeats, increasing the size of the sub-populations of more fit solutions. Both techniques are applied to several MGADSM problems. They have the capability to determine the number of swing-bys, the planets to swing by, launch and arrival dates, and the number of deep space maneuvers, as well as their locations, magnitudes, and directions, in an optimal sense. The results show that solutions obtained using the developed tools match known solutions for complex case studies. HGGA is also used to obtain the asteroid sequence and the mission structure in the Global Trajectory Optimization Competition (GTOC) problem. As an application of GA optimization to Earth orbits, the problem of visiting a set of ground sites within a constrained time frame is solved. The J2 perturbation and zonal coverage are considered to design repeated Sun-synchronous orbits. Finally, a new set of orbits, the repeated shadow track orbits (RSTO), is introduced. The orbit parameters are optimized such that the shadow of a spacecraft on the Earth visits the same locations periodically, every desired number of days.
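
The hidden-genes idea is easy to state in code: a fixed-length chromosome carries activity flags, and inactive genes are ignored by the cost function but still take part in recombination. A minimal sketch with placeholder encoding and cost:

```python
import random

CHROM_LEN = 10   # fixed length covering the largest admissible solution

def random_chromosome():
    genes = [random.uniform(-1.0, 1.0) for _ in range(CHROM_LEN)]
    active = [random.random() < 0.5 for _ in range(CHROM_LEN)]
    return genes, active

def cost(chromosome):
    genes, active = chromosome
    # Only effective genes enter the evaluation; hidden genes are carried
    # along untouched, as in HGGA. The cost itself is a placeholder.
    effective = [g for g, a in zip(genes, active) if a]
    return sum(g * g for g in effective) + 0.1 * len(effective)

def crossover(a, b):
    # Standard one-point crossover over the full-length chromosome,
    # hidden genes included, so they can become effective in offspring.
    cut = random.randrange(1, CHROM_LEN)
    return a[0][:cut] + b[0][cut:], a[1][:cut] + b[1][cut:]
```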

Relevance: 100.00%

Abstract:

A viscosity model calibrated on process data is proposed and applied to predict the viscosity of a polyamide 12 (PA12) polymer melt as a function of time, temperature, and shear rate. In a first step, the viscosity model was derived from experimental data. It is mainly based on the three-parameter Carreau approach, with two additional shift factors. The temperature dependence of the viscosity is accounted for by the Arrhenius shift factor aT. A further shift factor aSC (Structural Change) is introduced, which describes the structural change of PA12 resulting from the processing conditions during laser sintering. This structural change was observed as a significant increase in viscosity. It was concluded that the viscosity increase is attributable to molecular weight build-up and can be understood as post-condensation. Depending on the time and temperature conditions, the viscosity was found to approach an irreversible limit exponentially as a consequence of the molecular weight build-up. The rate of this post-condensation is time- and temperature-dependent. It is assumed that the powder bed temperature causes molecular weight build-up and thus chain extension; this progressive increase in chain length reduces molecular mobility and suppresses further post-condensation. The shift factor aSC expresses this physico-chemical model concept and contains two additional parameters: aSC,UL corresponds to the upper viscosity limit, whereas k0 specifies the rate of structural change. It was further found useful to distinguish between a flow activation energy and a structural-change activation energy when calculating aT and aSC. The model parameters were optimized using a genetic algorithm. Good agreement was found between calculated and measured viscosities, so the viscosity model is able to predict the viscosity of a PA12 polymer melt under the combined time and temperature influences of laser sintering. In a second step, the model was applied to calculate the viscosity during the laser sintering process as a function of the energy density. For this purpose, process data such as melt temperature and exposure time, measured on-line with a high-speed thermographic camera, were used. Finally, the influence of the structural change on the viscosity level in the process was demonstrated.
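
A hedged reading of the model structure in code, with the three-parameter Carreau law shifted by aT and aSC (the exponential-saturation form of aSC and every numeric value below are assumptions for illustration; the thesis fits its own parameters):

```python
import math

R = 8.314  # gas constant, J/(mol K)

def a_T(T, T0=473.15, E_flow=60e3):
    """Arrhenius temperature shift factor (activation energy assumed)."""
    return math.exp(E_flow / R * (1.0 / T - 1.0 / T0))

def a_SC(t, T, a_UL=5.0, k0=1e-3, E_sc=80e3, T_ref=473.15):
    """Structural-change shift factor: grows from 1 toward the upper
    viscosity limit a_UL as post-condensation proceeds; the rate k is
    time- and temperature-dependent."""
    k = k0 * math.exp(-E_sc / R * (1.0 / T - 1.0 / T_ref))
    return a_UL - (a_UL - 1.0) * math.exp(-k * t)

def viscosity(gamma_dot, T, t, A=1500.0, B=0.05, C=0.6):
    """Three-parameter Carreau law eta = A / (1 + B*gamma_dot)^C,
    shifted multiplicatively by aT and aSC (combination rule assumed)."""
    a = a_T(T) * a_SC(t, T)
    return a * A / (1.0 + a * B * gamma_dot) ** C
```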

Relevance: 100.00%

Abstract:

A new Relativistic Screened Hydrogenic Model has been developed to calculate the atomic data needed to compute the optical and thermodynamic properties of high energy density plasmas. The model is based on a new set of universal screening constants, including nlj-splitting, obtained by fitting to a large database of ionization potentials and excitation energies. This database was built with energies compiled from the National Institute of Standards and Technology (NIST) database of experimental atomic energy levels, and with energies calculated with the Flexible Atomic Code (FAC). The screening constants have been computed up to the 5p3/2 subshell using a Genetic Algorithm technique with an objective function designed to minimize both the relative error and the maximum error. To select the best set of screening constants, additional physical criteria have been applied, based on reproducing the filling order of the shells and on obtaining the best ground state configuration. A statistical error analysis performed to test the model indicated that approximately 88% of the data lie within a ±10% error interval. We validate the model by comparing its results with ionization energies, transition energies, and wave functions computed with sophisticated self-consistent codes and with experimental data.
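
The stated objective, minimizing both the relative error and the maximum error, suggests a composite cost of roughly this shape (the weighting is an assumption):

```python
def screening_objective(predicted, reference, w_max=0.5):
    """Composite GA objective: mean relative error plus a penalty on the
    worst-case relative error (the 50/50 weighting is an assumption)."""
    rel = [abs(p - r) / abs(r) for p, r in zip(predicted, reference)]
    return (1.0 - w_max) * sum(rel) / len(rel) + w_max * max(rel)

# e.g. comparing model energies against NIST/FAC reference values (in eV):
print(screening_objective([13.4, 54.2], [13.6, 54.4]))
```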

Relevance: 100.00%

Abstract:

In this paper, a novel approach to character recognition is presented using genetic operators, which have evolved from biological genetics and help achieve highly accurate results. A genetic algorithm approach is described in which biological haploid chromosomes are implemented as a single-row bit pattern of 315 values operated upon by various genetic operators. A set of characters is taken as an initial population, from which new generations of characters are generated via selection, crossover, and mutation. Variations of the population of characters are evolved, from which the fittest solution is found by subjecting the populations to a newly developed fitness function. The methodology reduces the dissimilarity coefficient, computed by the fitness function, between the character to be recognized and the members of the population; when the dissimilarity error reaches a threshold limit, the character is recognized. As each new population is generated from the older population, traits are passed on from one generation to the next. We present a methodology with which highly efficient character recognition can be achieved.
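
The 315-value bit pattern and the dissimilarity-driven loop can be sketched directly (the Hamming-style dissimilarity coefficient, threshold, and operator rates are assumptions):

```python
import random

BITS = 315  # single-row bit pattern per the abstract (e.g. a 15x21 glyph)

def dissimilarity(a, b):
    """Fraction of positions where two bit patterns disagree."""
    return sum(x != y for x, y in zip(a, b)) / BITS

def recognize(target, population, threshold=0.05, generations=200):
    """Evolve the population toward `target`; report a match once the
    best member's dissimilarity falls below the threshold."""
    for _ in range(generations):
        population.sort(key=lambda c: dissimilarity(c, target))
        if dissimilarity(population[0], target) <= threshold:
            return population[0]                      # recognized
        parents = population[: len(population) // 2]  # selection
        children = []
        while len(parents) + len(children) < len(population):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, BITS)
            child = a[:cut] + b[cut:]                 # one-point crossover
            i = random.randrange(BITS)
            child = child[:i] + [1 - child[i]] + child[i + 1:]  # mutation
            children.append(child)
        population = parents + children
    return None
```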

Relevance: 100.00%

Abstract:

To carry out their specific roles in the cell, genes and gene products often work together in groups, forming many relationships among themselves and with other molecules. Such relationships include physical protein-protein interactions, regulatory relationships, metabolic relationships, genetic relationships, and much more. With advances in science and technology, high-throughput technologies have been developed to simultaneously detect tens of thousands of pairwise protein-protein interactions and protein-DNA interactions. However, the data generated by high-throughput methods are prone to noise. Furthermore, the technology itself has its limitations and cannot detect all kinds of relationships between genes and their products. There is thus a pressing need to investigate all kinds of relationships and their roles in a living system using bioinformatic approaches; this is a central challenge in Computational Biology and Systems Biology. This dissertation focuses on exploring relationships between genes and gene products using bioinformatic approaches. Specifically, we consider problems related to regulatory relationships, protein-protein interactions, and semantic relationships between genes. A regulatory element is an important pattern, or "signal", often located in the promoter of a gene, that is used in the process of turning a gene "on" or "off". Predicting regulatory elements is a key step in exploring the regulatory relationships between genes and gene products. In this dissertation, we consider the problem of improving the prediction of regulatory elements by using comparative genomics data. With regard to protein-protein interactions, we have developed bioinformatic techniques to estimate support for the data on these interactions. While protein-protein interactions and regulatory relationships can be detected by high-throughput biological techniques, there is another type of relationship, the semantic relationship, that cannot be detected by a single technique but can be inferred using multiple sources of biological data. The contributions of this thesis involve the development and application of a set of bioinformatic approaches that address the challenges mentioned above. These include (i) an EM-based algorithm that improves the prediction of regulatory elements using comparative genomics data, (ii) an approach for estimating the support of protein-protein interaction data, with application to the functional annotation of genes, (iii) a novel method for inferring functional networks of genes, and (iv) techniques for clustering genes using multi-source data.

Relevance: 100.00%

Abstract:

Scheduling problems are generally NP-hard combinatorial problems, and a lot of research has been done to solve them heuristically. However, most previous approaches are problem-specific, and research into the development of a general scheduling algorithm is still in its infancy. Mimicking the natural evolutionary process of the survival of the fittest, Genetic Algorithms (GAs) have attracted much attention for solving difficult scheduling problems in recent years. Some obstacles exist when using GAs: there is no canonical mechanism for dealing with constraints, which are commonly met in most real-world scheduling problems, and small changes to a solution are difficult to make. To overcome both difficulties, indirect approaches have been presented (in [1] and [2]) for nurse scheduling and driver scheduling, where GAs are used to search a mapped solution space and separate decoding routines then build solutions to the original problem.

In our previous indirect GAs, learning is implicit and restricted to the efficient adjustment of weights for a set of rules that are used to construct schedules. The major limitation of those approaches is that they learn in a non-human way: like most existing construction algorithms, once the best weight combination is found, the rules used in the construction process are fixed at each iteration. However, a long sequence of moves is normally needed to construct a schedule, and using fixed rules at each move is thus unreasonable and inconsistent with human learning processes. When a human scheduler works, he normally builds a schedule step by step following a set of rules. After much practice, the scheduler gradually masters the knowledge of which solution parts go well with others. He can identify good parts and is aware of the solution quality even before the scheduling process is completed, and thus has the ability to finish a schedule using flexible, rather than fixed, rules.

In this research we intend to design more human-like scheduling algorithms by using ideas derived from Bayesian Optimization Algorithms (BOA) and Learning Classifier Systems (LCS) to implement explicit learning from past solutions. BOA can be applied to learn to identify good partial solutions and to complete them by building a Bayesian network of the joint distribution of solutions [3]. A Bayesian network is a directed acyclic graph with each node corresponding to one variable, and each variable corresponding to an individual rule by which a schedule is constructed step by step. The conditional probabilities are computed from an initial set of promising solutions. Subsequently, a new instance for each node is generated using the corresponding conditional probabilities, until values for all nodes have been generated. Another set of rule strings is generated in this way, some of which replace previous strings based on fitness selection. If the stopping conditions are not met, the Bayesian network is updated again using the current set of good rule strings. The algorithm thereby tries to explicitly identify and mix promising building blocks. It should be noted that for most scheduling problems the structure of the network model is known and all the variables are fully observed. In this case, the goal of learning is to find the rule values that maximize the likelihood of the training data, and learning amounts to 'counting' in the case of multinomial distributions.
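
Both learning mechanisms, the BOA-style 'counting' just described and the LCS strength update detailed in the next paragraph, are compact enough to sketch (the per-step rule encoding and update constants are simplifying assumptions):

```python
import random

def learn_rule_model(good_rule_strings, n_steps, n_rules):
    """Fully observed case: learn the per-step rule distribution by
    counting occurrences in the current set of promising solutions
    (Laplace-smoothed)."""
    counts = [[1] * n_rules for _ in range(n_steps)]
    for rules in good_rule_strings:
        for step, r in enumerate(rules):
            counts[step][r] += 1
    return [[c / sum(row) for c in row] for row in counts]

def sample_rule_string(probs):
    """Build a new rule string step by step from the learned model."""
    return [random.choices(range(len(row)), weights=row)[0] for row in probs]

def reinforce(strengths, used_rules, reward=1.0):
    """LCS-style update: strengthen only the rules used in the previous
    solution; unused rules keep their strength."""
    for step, r in enumerate(used_rules):
        strengths[step][r] += reward

def roulette_select(strengths, step):
    """Roulette Wheel selection of a rule at a given construction step."""
    return random.choices(range(len(strengths[step])),
                          weights=strengths[step])[0]
```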
In the LCS approach, each rule has a strength indicating its current usefulness in the system, and this strength is constantly assessed [4]. To implement sophisticated learning based on previous solutions, an improved LCS-based algorithm is designed, consisting of three steps. The initialization step assigns each rule at each stage a constant initial strength. Rules are then selected using the Roulette Wheel strategy. The next step reinforces the strengths of the rules used in the previous solution, keeping the strengths of unused rules unchanged. The selection step selects fitter rules for the next generation. It is envisaged that the LCS part of the algorithm will be used as a hill climber for the BOA.

This is exciting and ambitious research, which might provide the stepping-stone for a new class of scheduling algorithms. Data sets from nurse scheduling and mall problems will be used as test-beds. It is envisaged that once the concept has been proven successful, it will be implemented in general scheduling algorithms. It is also hoped that this research will give some preliminary answers about how to include human-like learning in scheduling algorithms, and may therefore be of interest to researchers and practitioners in the areas of scheduling and evolutionary computation.

References
1. Aickelin, U. and Dowsland, K. (2003) 'Indirect Genetic Algorithm for a Nurse Scheduling Problem', Computers & Operations Research (in print).
2. Li, J. and Kwan, R.S.K. (2003) 'Fuzzy Genetic Algorithm for Driver Scheduling', European Journal of Operational Research 147(2): 334-344.
3. Pelikan, M., Goldberg, D. and Cantu-Paz, E. (1999) 'BOA: The Bayesian Optimization Algorithm', IlliGAL Report No. 99003, University of Illinois.
4. Wilson, S. (1994) 'ZCS: A Zeroth Level Classifier System', Evolutionary Computation 2(1): 1-18.