16 resultados para random search algorithms
em Aston University Research Archive
Resumo:
Are the learning procedures of genetic algorithms (GAs) able to generate optimal architectures for artificial neural networks (ANNs) in high frequency data? In this experimental study,GAs are used to identify the best architecture for ANNs. Additional learning is undertaken by the ANNs to forecast daily excess stock returns. No ANN architectures were able to outperform a random walk,despite the finding of non-linearity in the excess returns. This failure is attributed to the absence of suitable ANN structures and further implies that researchers need to be cautious when making inferences from ANN results that use high frequency data.
Resumo:
We consider a variation of the prototype combinatorial optimization problem known as graph colouring. Our optimization goal is to colour the vertices of a graph with a fixed number of colours, in a way to maximize the number of different colours present in the set of nearest neighbours of each given vertex. This problem, which we pictorially call palette-colouring, has been recently addressed as a basic example of a problem arising in the context of distributed data storage. Even though it has not been proved to be NP-complete, random search algorithms find the problem hard to solve. Heuristics based on a naive belief propagation algorithm are observed to work quite well in certain conditions. In this paper, we build upon the mentioned result, working out the correct belief propagation algorithm, which needs to take into account the many-body nature of the constraints present in this problem. This method improves the naive belief propagation approach at the cost of increased computational effort. We also investigate the emergence of a satisfiable-to-unsatisfiable 'phase transition' as a function of the vertex mean degree, for different ensembles of sparse random graphs in the large size ('thermodynamic') limit.
Resumo:
Real world search problems, characterised by nonlinearity, noise and multidimensionality, are often best solved by hybrid algorithms. Techniques embodying different necessary features are triggered at specific iterations, in response to the current state of the problem space. In the existing literature, this alternation is managed either statically (through pre-programmed policies) or dynamically, at the cost of high coupling with algorithm inner representation. We extract two design patterns for hybrid metaheuristic search algorithms, the All-Seeing Eye and the Commentator patterns, which we argue should be replaced by the more flexible and loosely coupled Simple Black Box (Two-B) and Utility-based Black Box (Three-B) patterns that we propose here. We recommend the Two-B pattern for purely fitness based hybridisations and the Three-B pattern for more generic search quality evaluation based hybridisations.
Resumo:
DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT
Resumo:
Automatic Term Recognition (ATR) is a fundamental processing step preceding more complex tasks such as semantic search and ontology learning. From a large number of methodologies available in the literature only a few are able to handle both single and multi-word terms. In this paper we present a comparison of five such algorithms and propose a combined approach using a voting mechanism. We evaluated the six approaches using two different corpora and show how the voting algorithm performs best on one corpus (a collection of texts from Wikipedia) and less well using the Genia corpus (a standard life science corpus). This indicates that choice and design of corpus has a major impact on the evaluation of term recognition algorithms. Our experiments also showed that single-word terms can be equally important and occupy a fairly large proportion in certain domains. As a result, algorithms that ignore single-word terms may cause problems to tasks built on top of ATR. Effective ATR systems also need to take into account both the unstructured text and the structured aspects and this means information extraction techniques need to be integrated into the term recognition process.
Resumo:
A new approach to optimisation is introduced based on a precise probabilistic statement of what is ideally required of an optimisation method. It is convenient to express the formalism in terms of the control of a stationary environment. This leads to an objective function for the controller which unifies the objectives of exploration and exploitation, thereby providing a quantitative principle for managing this trade-off. This is demonstrated using a variant of the multi-armed bandit problem. This approach opens new possibilities for optimisation algorithms, particularly by using neural network or other adaptive methods for the adaptive controller. It also opens possibilities for deepening understanding of existing methods. The realisation of these possibilities requires research into practical approximations of the exact formalism.
Resumo:
A formalism for modelling the dynamics of Genetic Algorithms (GAs) using methods from statistical mechanics, originally due to Prugel-Bennett and Shapiro, is reviewed, generalized and improved upon. This formalism can be used to predict the averaged trajectory of macroscopic statistics describing the GA's population. These macroscopics are chosen to average well between runs, so that fluctuations from mean behaviour can often be neglected. Where necessary, non-trivial terms are determined by assuming maximum entropy with constraints on known macroscopics. Problems of realistic size are described in compact form and finite population effects are included, often proving to be of fundamental importance. The macroscopics used here are cumulants of an appropriate quantity within the population and the mean correlation (Hamming distance) within the population. Including the correlation as an explicit macroscopic provides a significant improvement over the original formulation. The formalism is applied to a number of simple optimization problems in order to determine its predictive power and to gain insight into GA dynamics. Problems which are most amenable to analysis come from the class where alleles within the genotype contribute additively to the phenotype. This class can be treated with some generality, including problems with inhomogeneous contributions from each site, non-linear or noisy fitness measures, simple diploid representations and temporally varying fitness. The results can also be applied to a simple learning problem, generalization in a binary perceptron, and a limit is identified for which the optimal training batch size can be determined for this problem. The theory is compared to averaged results from a real GA in each case, showing excellent agreement if the maximum entropy principle holds. Some situations where this approximation brakes down are identified. In order to fully test the formalism, an attempt is made on the strong sc np-hard problem of storing random patterns in a binary perceptron. Here, the relationship between the genotype and phenotype (training error) is strongly non-linear. Mutation is modelled under the assumption that perceptron configurations are typical of perceptrons with a given training error. Unfortunately, this assumption does not provide a good approximation in general. It is conjectured that perceptron configurations would have to be constrained by other statistics in order to accurately model mutation for this problem. Issues arising from this study are discussed in conclusion and some possible areas of further research are outlined.
Resumo:
We study a variation of the graph coloring problem on random graphs of finite average connectivity. Given the number of colors, we aim to maximize the number of different colors at neighboring vertices (i.e. one edge distance) of any vertex. Two efficient algorithms, belief propagation and Walksat are adapted to carry out this task. We present experimental results based on two types of random graphs for different system sizes and identify the critical value of the connectivity for the algorithms to find a perfect solution. The problem and the suggested algorithms have practical relevance since various applications, such as distributed storage, can be mapped onto this problem.
Resumo:
Molecular transport in phase space is crucial for chemical reactions because it defines how pre-reactive molecular configurations are found during the time evolution of the system. Using Molecular Dynamics (MD) simulated atomistic trajectories we test the assumption of the normal diffusion in the phase space for bulk water at ambient conditions by checking the equivalence of the transport to the random walk model. Contrary to common expectations we have found that some statistical features of the transport in the phase space differ from those of the normal diffusion models. This implies a non-random character of the path search process by the reacting complexes in water solutions. Our further numerical experiments show that a significant long period of non-stationarity in the transition probabilities of the segments of molecular trajectories can account for the observed non-uniform filling of the phase space. Surprisingly, the characteristic periods in the model non-stationarity constitute hundreds of nanoseconds, that is much longer time scales compared to typical lifetime of known liquid water molecular structures (several picoseconds).
Resumo:
The Fibre Distributed Data Interface (FDDI) represents the new generation of local area networks (LANs). These high speed LANs are capable of supporting up to 500 users over a 100 km distance. User traffic is expected to be as diverse as file transfers, packet voice and video. As the proliferation of FDDI LANs continues, the need to interconnect these LANs arises. FDDI LAN interconnection can be achieved in a variety of different ways. Some of the most commonly used today are public data networks, dial up lines and private circuits. For applications that can potentially generate large quantities of traffic, such as an FDDI LAN, it is cost effective to use a private circuit leased from the public carrier. In order to send traffic from one LAN to another across the leased line, a routing algorithm is required. Much research has been done on the Bellman-Ford algorithm and many implementations of it exist in computer networks. However, due to its instability and problems with routing table loops it is an unsatisfactory algorithm for interconnected FDDI LANs. A new algorithm, termed ISIS which is being standardized by the ISO provides a far better solution. ISIS will be implemented in many manufacturers routing devices. In order to make the work as practical as possible, this algorithm will be used as the basis for all the new algorithms presented. The ISIS algorithm can be improved by exploiting information that is dropped by that algorithm during the calculation process. A new algorithm, called Down Stream Path Splits (DSPS), uses this information and requires only minor modification to some of the ISIS routing procedures. DSPS provides a higher network performance, with very little additional processing and storage requirements. A second algorithm, also based on the ISIS algorithm, generates a massive increase in network performance. This is achieved by selecting alternative paths through the network in times of heavy congestion. This algorithm may select the alternative path at either the originating node, or any node along the path. It requires more processing and memory storage than DSPS, but generates a higher network power. The final algorithm combines the DSPS algorithm with the alternative path algorithm. This is the most flexible and powerful of the algorithms developed. However, it is somewhat complex and requires a fairly large storage area at each node. The performance of the new routing algorithms is tested in a comprehensive model of interconnected LANs. This model incorporates the transport through physical layers and generates random topologies for routing algorithm performance comparisons. Using this model it is possible to determine which algorithm provides the best performance without introducing significant complexity and storage requirements.
Resumo:
Orthogonal frequency division multiplexing (OFDM) is becoming a fundamental technology in future generation wireless communications. Call admission control is an effective mechanism to guarantee resilient, efficient, and quality-of-service (QoS) services in wireless mobile networks. In this paper, we present several call admission control algorithms for OFDM-based wireless multiservice networks. Call connection requests are differentiated into narrow-band calls and wide-band calls. For either class of calls, the traffic process is characterized as batch arrival since each call may request multiple subcarriers to satisfy its QoS requirement. The batch size is a random variable following a probability mass function (PMF) with realistically maximum value. In addition, the service times for wide-band and narrow-band calls are different. Following this, we perform a tele-traffic queueing analysis for OFDM-based wireless multiservice networks. The formulae for the significant performance metrics call blocking probability and bandwidth utilization are developed. Numerical investigations are presented to demonstrate the interaction between key parameters and performance metrics. The performance tradeoff among different call admission control algorithms is discussed. Moreover, the analytical model has been validated by simulation. The methodology as well as the result provides an efficient tool for planning next-generation OFDM-based broadband wireless access systems.
Resumo:
Objective: Recently, much research has been proposed using nature inspired algorithms to perform complex machine learning tasks. Ant colony optimization (ACO) is one such algorithm based on swarm intelligence and is derived from a model inspired by the collective foraging behavior of ants. Taking advantage of the ACO in traits such as self-organization and robustness, this paper investigates ant-based algorithms for gene expression data clustering and associative classification. Methods and material: An ant-based clustering (Ant-C) and an ant-based association rule mining (Ant-ARM) algorithms are proposed for gene expression data analysis. The proposed algorithms make use of the natural behavior of ants such as cooperation and adaptation to allow for a flexible robust search for a good candidate solution. Results: Ant-C has been tested on the three datasets selected from the Stanford Genomic Resource Database and achieved relatively high accuracy compared to other classical clustering methods. Ant-ARM has been tested on the acute lymphoblastic leukemia (ALL)/acute myeloid leukemia (AML) dataset and generated about 30 classification rules with high accuracy. Conclusions: Ant-C can generate optimal number of clusters without incorporating any other algorithms such as K-means or agglomerative hierarchical clustering. For associative classification, while a few of the well-known algorithms such as Apriori, FP-growth and Magnum Opus are unable to mine any association rules from the ALL/AML dataset within a reasonable period of time, Ant-ARM is able to extract associative classification rules.
Resumo:
There has been considerable recent research into the connection between Parkinson's disease (PD) and speech impairment. Recently, a wide range of speech signal processing algorithms (dysphonia measures) aiming to predict PD symptom severity using speech signals have been introduced. In this paper, we test how accurately these novel algorithms can be used to discriminate PD subjects from healthy controls. In total, we compute 132 dysphonia measures from sustained vowels. Then, we select four parsimonious subsets of these dysphonia measures using four feature selection algorithms, and map these feature subsets to a binary classification response using two statistical classifiers: random forests and support vector machines. We use an existing database consisting of 263 samples from 43 subjects, and demonstrate that these new dysphonia measures can outperform state-of-the-art results, reaching almost 99% overall classification accuracy using only ten dysphonia features. We find that some of the recently proposed dysphonia measures complement existing algorithms in maximizing the ability of the classifiers to discriminate healthy controls from PD subjects. We see these results as an important step toward noninvasive diagnostic decision support in PD.
Resumo:
Market mechanisms are a means by which resources in contention can be allocated between contending parties, both in human economies and those populated by software agents. Designing such mechanisms has traditionally been carried out by hand, and more recently by automation. Assessing these mechanisms typically involves them being evaluated with respect to multiple conflicting objectives, which can often be nonlinear, noisy, and expensive to compute. For typical performance objectives, it is known that designed mechanisms often fall short on being optimal across all objectives simultaneously. However, in all previous automated approaches, either only a single objective is considered, or else the multiple performance objectives are combined into a single objective. In this paper we do not aggregate objectives, instead considering a direct, novel application of multi-objective evolutionary algorithms (MOEAs) to the problem of automated mechanism design. This allows the automatic discovery of trade-offs that such objectives impose on mechanisms. We pose the problem of mechanism design, specifically for the class of linear redistribution mechanisms, as a naturally existing multi-objective optimisation problem. We apply a modified version of NSGA-II in order to design mechanisms within this class, given economically relevant objectives such as welfare and fairness. This application of NSGA-II exposes tradeoffs between objectives, revealing relationships between them that were otherwise unknown for this mechanism class. The understanding of the trade-off gained from the application of MOEAs can thus help practitioners with an insightful application of discovered mechanisms in their respective real/artificial markets.
Resumo:
This research focuses on automatically adapting a search engine size in response to fluctuations in query workload. Deploying a search engine in an Infrastructure as a Service (IaaS) cloud facilitates allocating or deallocating computer resources to or from the engine. Our solution is to contribute an adaptive search engine that will repeatedly re-evaluate its load and, when appropriate, switch over to a dierent number of active processors. We focus on three aspects and break them out into three sub-problems as follows: Continually determining the Number of Processors (CNP), New Grouping Problem (NGP) and Regrouping Order Problem (ROP). CNP means that (in the light of the changes in the query workload in the search engine) there is a problem of determining the ideal number of processors p active at any given time to use in the search engine and we call this problem CNP. NGP happens when changes in the number of processors are determined and it must also be determined which groups of search data will be distributed across the processors. ROP is how to redistribute this data onto processors while keeping the engine responsive and while also minimising the switchover time and the incurred network load. We propose solutions for these sub-problems. For NGP we propose an algorithm for incrementally adjusting the index to t the varying number of virtual machines. For ROP we present an ecient method for redistributing data among processors while keeping the search engine responsive. Regarding the solution for CNP, we propose an algorithm determining the new size of the search engine by re-evaluating its load. We tested the solution performance using a custom-build prototype search engine deployed in the Amazon EC2 cloud. Our experiments show that when we compare our NGP solution with computing the index from scratch, the incremental algorithm speeds up the index computation 2{10 times while maintaining a similar search performance. The chosen redistribution method is 25% to 50% faster than other methods and reduces the network load around by 30%. For CNP we present a deterministic algorithm that shows a good ability to determine a new size of search engine. When combined, these algorithms give an adapting algorithm that is able to adjust the search engine size with a variable workload.