23 resultados para rules application algorithms
em CentAUR: Central Archive University of Reading - UK
Resumo:
Asynchronous Optical Sampling (ASOPS) [1,2] and frequency comb spectrometry [3] based on dual Ti:saphire resonators operated in a master/slave mode have the potential to improve signal to noise ratio in THz transient and IR sperctrometry. The multimode Brownian oscillator time-domain response function described by state-space models is a mathematically robust framework that can be used to describe the dispersive phenomena governed by Lorentzian, Debye and Drude responses. In addition, the optical properties of an arbitrary medium can be expressed as a linear combination of simple multimode Brownian oscillator functions. The suitability of a range of signal processing schemes adopted from the Systems Identification and Control Theory community for further processing the recorded THz transients in the time or frequency domain will be outlined [4,5]. Since a femtosecond duration pulse is capable of persistent excitation of the medium within which it propagates, such approach is perfectly justifiable. Several de-noising routines based on system identification will be shown. Furthermore, specifically developed apodization structures will be discussed. These are necessary because due to dispersion issues, the time-domain background and sample interferograms are non-symmetrical [6-8]. These procedures can lead to a more precise estimation of the complex insertion loss function. The algorithms are applicable to femtosecond spectroscopies across the EM spectrum. Finally, a methodology for femtosecond pulse shaping using genetic algorithms aiming to map and control molecular relaxation processes will be mentioned.
Resumo:
Genetic algorithms (GAs) have been introduced into site layout planning as reported in a number of studies. In these studies, the objective functions were defined so as to employ the GAs in searching for the optimal site layout. However, few studies have been carried out to investigate the actual closeness of relationships between site facilities; it is these relationships that ultimately govern the site layout. This study has determined that the underlying factors of site layout planning for medium-size projects include work flow, personnel flow, safety and environment, and personal preferences. By finding the weightings on these factors and the corresponding closeness indices between each facility, a closeness relationship has been deduced. Two contemporary mathematical approaches - fuzzy logic theory and an entropy measure - were adopted in finding these results in order to minimize the uncertainty and vagueness of the collected data and improve the quality of the information. GAs were then applied to searching for the optimal site layout in a medium-size government project using the GeneHunter software. The objective function involved minimizing the total travel distance. An optimal layout was obtained within a short time. This reveals that the application of GA to site layout planning is highly promising and efficient.
Resumo:
In this paper we explore classification techniques for ill-posed problems. Two classes are linearly separable in some Hilbert space X if they can be separated by a hyperplane. We investigate stable separability, i.e. the case where we have a positive distance between two separating hyperplanes. When the data in the space Y is generated by a compact operator A applied to the system states ∈ X, we will show that in general we do not obtain stable separability in Y even if the problem in X is stably separable. In particular, we show this for the case where a nonlinear classification is generated from a non-convergent family of linear classes in X. We apply our results to the problem of quality control of fuel cells where we classify fuel cells according to their efficiency. We can potentially classify a fuel cell using either some external measured magnetic field or some internal current. However we cannot measure the current directly since we cannot access the fuel cell in operation. The first possibility is to apply discrimination techniques directly to the measured magnetic fields. The second approach first reconstructs currents and then carries out the classification on the current distributions. We show that both approaches need regularization and that the regularized classifications are not equivalent in general. Finally, we investigate a widely used linear classification algorithm Fisher's linear discriminant with respect to its ill-posedness when applied to data generated via a compact integral operator. We show that the method cannot stay stable when the number of measurement points becomes large.
Resumo:
Twitter has become a dependable microblogging tool for real time information dissemination and newsworthy events broadcast. Its users sometimes break news on the network faster than traditional newsagents due to their presence at ongoing real life events at most times. Different topic detection methods are currently used to match Twitter posts to real life news of mainstream media. In this paper, we analyse tweets relating to the English FA Cup finals 2012 by applying our novel method named TRCM to extract association rules present in hash tag keywords of tweets in different time-slots. Our system identify evolving hash tag keywords with strong association rules in each time-slot. We then map the identified hash tag keywords to event highlights of the game as reported in the ground truth of the main stream media. The performance effectiveness measure of our experiments show that our method perform well as a Topic Detection and Tracking approach.
Resumo:
The authors present a systolic design for a simple GA mechanism which provides high throughput and unidirectional pipelining by exploiting the inherent parallelism in the genetic operators. The design computes in O(N+G) time steps using O(N2) cells where N is the population size and G is the chromosome length. The area of the device is independent of the chromosome length and so can be easily scaled by replicating the arrays or by employing fine-grain migration. The array is generic in the sense that it does not rely on the fitness function and can be used as an accelerator for any GA application using uniform crossover between pairs of chromosomes. The design can also be used in hybrid systems as an add-on to complement existing designs and methods for fitness function acceleration and island-style population management
Resumo:
This paper presents the results of the application of a parallel Genetic Algorithm (GA) in order to design a Fuzzy Proportional Integral (FPI) controller for active queue management on Internet routers. The Active Queue Management (AQM) policies are those policies of router queue management that allow the detection of network congestion, the notification of such occurrences to the hosts on the network borders, and the adoption of a suitable control policy. Two different parallel implementations of the genetic algorithm are adopted to determine an optimal configuration of the FPI controller parameters. Finally, the results of several experiments carried out on a forty nodes cluster of workstations are presented.
Resumo:
This paper describes a proposed new approach to the Computer Network Security Intrusion Detection Systems (NIDS) application domain knowledge processing focused on a topic map technology-enabled representation of features of the threat pattern space as well as the knowledge of situated efficacy of alternative candidate algorithms for pattern recognition within the NIDS domain. Thus an integrative knowledge representation framework for virtualisation, data intelligence and learning loop architecting in the NIDS domain is described together with specific aspects of its deployment.
Resumo:
An extensive set of machine learning and pattern classification techniques trained and tested on KDD dataset failed in detecting most of the user-to-root attacks. This paper aims to provide an approach for mitigating negative aspects of the mentioned dataset, which led to low detection rates. Genetic algorithm is employed to implement rules for detecting various types of attacks. Rules are formed of the features of the dataset identified as the most important ones for each attack type. In this way we introduce high level of generality and thus achieve high detection rates, but also gain high reduction of the system training time. Thenceforth we re-check the decision of the user-to- root rules with the rules that detect other types of attacks. In this way we decrease the false-positive rate. The model was verified on KDD 99, demonstrating higher detection rates than those reported by the state- of-the-art while maintaining low false-positive rate.
Resumo:
This paper investigates random number generators in stochastic iteration algorithms that require infinite uniform sequences. We take a simple model of the general transport equation and solve it with the application of a linear congruential generator, the Mersenne twister, the mother-of-all generators, and a true random number generator based on quantum effects. With this simple model we show that for reasonably contractive operators the theoretically not infinite-uniform sequences perform also well. Finally, we demonstrate the power of stochastic iteration for the solution of the light transport problem.
Resumo:
A novel optimising controller is designed that leads a slow process from a sub-optimal operational condition to the steady-state optimum in a continuous way based on dynamic information. Using standard results from optimisation theory and discrete optimal control, the solution of a steady-state optimisation problem is achieved by solving a receding-horizon optimal control problem which uses derivative and state information from the plant via a shadow model and a state-space identifier. The paper analyzes the steady-state optimality of the procedure, develops algorithms with and without control rate constraints and applies the procedure to a high fidelity simulation study of a distillation column optimisation.
Resumo:
Many recent papers have documented periodicities in returns, return volatility, bid–ask spreads and trading volume, in both equity and foreign exchange markets. We propose and employ a new test for detecting subtle periodicities in time series data based on a signal coherence function. The technique is applied to a set of seven half-hourly exchange rate series. Overall, we find the signal coherence to be maximal at the 8-h and 12-h frequencies. Retaining only the most coherent frequencies for each series, we implement a trading rule that is based on these observed periodicities. Our results demonstrate in all cases except one that, in gross terms, the rules can generate returns that are considerably greater than those of a buy-and-hold strategy, although they cannot retain their profitability net of transactions costs. We conjecture that this methodology could constitute an important tool for financial market researchers which will enable them to detect, quantify and rank the various periodic components in financial data better.
Resumo:
The Distributed Rule Induction (DRI) project at the University of Portsmouth is concerned with distributed data mining algorithms for automatically generating rules of all kinds. In this paper we present a system architecture and its implementation for inducing modular classification rules in parallel in a local area network using a distributed blackboard system. We present initial results of a prototype implementation based on the Prism algorithm.
Resumo:
Inducing rules from very large datasets is one of the most challenging areas in data mining. Several approaches exist to scaling up classification rule induction to large datasets, namely data reduction and the parallelisation of classification rule induction algorithms. In the area of parallelisation of classification rule induction algorithms most of the work has been concentrated on the Top Down Induction of Decision Trees (TDIDT), also known as the ‘divide and conquer’ approach. However powerful alternative algorithms exist that induce modular rules. Most of these alternative algorithms follow the ‘separate and conquer’ approach of inducing rules, but very little work has been done to make the ‘separate and conquer’ approach scale better on large training data. This paper examines the potential of the recently developed blackboard based J-PMCRI methodology for parallelising modular classification rule induction algorithms that follow the ‘separate and conquer’ approach. A concrete implementation of the methodology is evaluated empirically on very large datasets.
Resumo:
The Prism family of algorithms induces modular classification rules which, in contrast to decision tree induction algorithms, do not necessarily fit together into a decision tree structure. Classifiers induced by Prism algorithms achieve a comparable accuracy compared with decision trees and in some cases even outperform decision trees. Both kinds of algorithms tend to overfit on large and noisy datasets and this has led to the development of pruning methods. Pruning methods use various metrics to truncate decision trees or to eliminate whole rules or single rule terms from a Prism rule set. For decision trees many pre-pruning and postpruning methods exist, however for Prism algorithms only one pre-pruning method has been developed, J-pruning. Recent work with Prism algorithms examined J-pruning in the context of very large datasets and found that the current method does not use its full potential. This paper revisits the J-pruning method for the Prism family of algorithms and develops a new pruning method Jmax-pruning, discusses it in theoretical terms and evaluates it empirically.