22 results for Learning in multi-agent systems
in Aston University Research Archive
Abstract:
Multi-agent systems are complex systems composed of multiple intelligent agents that act either independently or in cooperation with one another. Agent-based modelling is a method for studying complex systems such as economies, societies and ecologies. Because of their complexity, mathematical analysis is often limited in its ability to analyse such systems. In such cases, agent-based modelling offers a practical, constructive method of analysis. The objective of this book is to shed light on some emergent properties of multi-agent systems. The authors focus their investigation on the effect of knowledge exchange on the convergence of complex, multi-agent systems.
Abstract:
This work attempts to shed light on the fundamental concepts behind the stability of multi-agent systems. We view the system as a discrete-time Markov chain with a potentially unknown transition probability distribution. The system is considered stable when its state has converged to an equilibrium distribution. Faced with the non-trivial task of establishing convergence to such a distribution, we propose a hypothesis-testing approach in which we test whether a particular system metric has converged. We describe some artificial multi-agent ecosystems that were developed, and we present results based on these systems which confirm that this approach qualitatively agrees with our intuition.
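The hypothesis-testing idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not name the test, so the sketch assumes a two-sided z-test on the means of a metric over its two most recent windows. The function name `metric_has_converged`, the window length and the critical value are hypothetical choices. Samples from a Markov chain are autocorrelated, so the standard error used here is only approximate.

```python
import math
from statistics import mean, variance


def metric_has_converged(series, window=200, z_crit=1.96):
    """Compare the last two windows of a metric series with a two-sided z-test.

    Returns True when the null hypothesis of equal window means is NOT
    rejected, which this sketch treats as evidence of convergence.
    Assumption: independent samples, which a Markov chain only approximates.
    """
    if len(series) < 2 * window:
        return False  # not enough data to form two windows

    older = series[-2 * window:-window]
    newer = series[-window:]

    # Standard error of the difference between the two window means.
    std_err = math.sqrt(variance(older) / window + variance(newer) / window)
    if std_err == 0.0:
        # Both windows are constant: converged only if the constants match.
        return mean(older) == mean(newer)

    z = (mean(older) - mean(newer)) / std_err
    return abs(z) < z_crit
```

In use, the function would be called periodically on the recorded history of a system metric, for example a population count in a simulated ecosystem. A steadily drifting series is rejected, while a series whose two latest windows have statistically indistinguishable means is accepted.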
Abstract:
To solve multi-objective problems, multiple reward signals are often scalarized into a single value and further processed using established single-objective problem-solving techniques. While the field of multi-objective optimization has made many advances in applying scalarization techniques to obtain good solution trade-offs, the utility of applying these techniques in the multi-objective multi-agent learning domain has not yet been thoroughly investigated. Agents learn the value of their decisions by linearly scalarizing their reward signals at the local level, while acceptable system-wide behaviour results. However, the non-linear relationship between the weighting parameters of the scalarization function and the learned policy makes the discovery of system-wide trade-offs time-consuming. Our first contribution is a thorough analysis of well-known scalarization schemes within the multi-objective multi-agent reinforcement learning setup. The analysed approaches intelligently explore the weight space in order to find a wider range of system trade-offs. In our second contribution, we propose a novel adaptive weight algorithm which interacts with the underlying local multi-objective solvers and allows for a better coverage of the Pareto front. Our third contribution is the experimental validation of our approach by learning bi-objective policies in self-organising smart camera networks. We note that our algorithm (i) explores the objective space faster on many problem instances, (ii) obtains solutions that exhibit a larger hypervolume, and (iii) achieves a greater spread in the objective space.
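Two notions from this abstract, linear scalarization and the hypervolume of a bi-objective solution set, can be made concrete with a short sketch. It assumes both objectives are maximised and that the hypervolume is measured against a user-chosen reference point. The function names `linear_scalarize` and `hypervolume_2d` are hypothetical and are not taken from the paper.

```python
def linear_scalarize(rewards, weights):
    """Collapse a reward vector into one value as a weighted sum."""
    if len(rewards) != len(weights):
        raise ValueError("rewards and weights must have the same length")
    return sum(w * r for w, r in zip(weights, rewards))


def hypervolume_2d(points, ref):
    """Area dominated by a set of 2-D points relative to a reference point.

    Assumes both objectives are maximised; points that do not strictly
    dominate the reference point contribute nothing.
    """
    candidates = [(x, y) for x, y in points if x > ref[0] and y > ref[1]]
    # Sweep from the largest first objective to the smallest.
    candidates.sort(key=lambda p: p[0], reverse=True)

    area = 0.0
    best_y = ref[1]
    for x, y in candidates:
        if y > best_y:
            # Add the horizontal strip this point contributes above best_y.
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area
```

For the front `[(3, 1), (2, 2), (1, 3)]` with reference point `(0, 0)` the dominated area is 6, and adding a dominated point such as `(1, 1)` leaves it unchanged. A larger hypervolume indicates better coverage of the trade-off surface, which is the sense in which the abstract uses the term.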
Abstract:
The global market has become increasingly dynamic, unpredictable and customer-driven. This has led to rising rates of new product introduction and turbulent demand patterns across product mixes. As a result, manufacturing enterprises face mounting challenges to be agile and responsive in coping with market changes, so as to remain competitive in producing and delivering products to the market in a timely and cost-effective manner. This paper introduces a currency-based iterative agent bidding mechanism to integrate, effectively and cost-efficiently, the activities associated with production planning and control, so as to achieve an optimised process plan and schedule. The aim is to enhance the agility of manufacturing systems to accommodate dynamic changes in the market and production. The iterative bidding mechanism is executed based on currency-like metrics: each operation to be performed is assigned a virtual currency value, and agents bid for the operation if they make a virtual profit based on this value. These currency values are optimised iteratively, and the bidding process is repeated with each new set of values. The aim is to obtain progressively better production plans, approaching near-optimality. A genetic algorithm is proposed to optimise the currency values at each iteration. The implementation of the mechanism and the test case simulation results are also discussed. © 2012 Elsevier Ltd. All rights reserved.
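A single round of the currency-based bidding described in this abstract might look like the sketch below. Only the rule that an agent bids when it makes a virtual profit comes from the abstract. Awarding each operation to the bidder with the highest virtual profit is an assumption made here for illustration, and the function name `run_bidding_round` and the data layout are hypothetical. The genetic algorithm that adjusts the currency values between rounds is not shown.

```python
def run_bidding_round(currency, costs):
    """Allocate operations to agents for one bidding round.

    currency: {operation: virtual currency value}
    costs:    {agent: {operation: cost for that agent}}
    Returns   {operation: winning agent, or None if nobody bids}.
    """
    allocation = {}
    for operation, value in currency.items():
        # An agent bids only if the operation yields a positive virtual profit.
        bids = {
            agent: value - agent_costs[operation]
            for agent, agent_costs in costs.items()
            if operation in agent_costs and value - agent_costs[operation] > 0
        }
        # Assumed award rule: the highest virtual profit wins.
        allocation[operation] = max(bids, key=bids.get) if bids else None
    return allocation
```

With currency values `{"drill": 10, "mill": 5}` and costs `{"m1": {"drill": 7, "mill": 6}, "m2": {"drill": 8, "mill": 4}}`, agent `m1` wins the drilling operation and `m2` the milling operation. Lowering the milling value to 3 leaves that operation without any bid, which is the kind of signal an outer optimiser could use to adjust the values.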
Abstract:
Digital back-propagation (DBP) has recently been proposed for the comprehensive compensation of channel nonlinearities in optical communication systems. While DBP is attractive for its flexibility and performance, it poses significant challenges in terms of computational complexity. Alternatively, phase conjugation or spectral inversion has previously been employed to mitigate nonlinear fibre impairments. Though spectral inversion is relatively straightforward to implement in the optical or electrical domain, it requires precise positioning and a symmetrised link power profile in order to realise the full benefit. In this paper, we directly compare ideal and low-precision single-channel DBP with single-channel spectral inversion, both with and without symmetry correction via dispersive chirping. We demonstrate that for all the dispersion maps studied, spectral inversion approaches the performance of ideal DBP with 40 steps per span and exceeds the performance of electronic dispersion compensation by ~3.5 dB in Q-factor, enabling up to 96% reduction in complexity in terms of required DBP stages, relative to low-precision DBP with one step per span. For maps where quasi-phase matching is a significant issue, spectral inversion significantly outperforms ideal DBP by ~3 dB.
Abstract:
Spatial objects may not only be perceived visually but also by touch. We report recent experiments investigating to what extent prior object knowledge acquired in either the haptic or visual sensory modality transfers to a subsequent visual learning task. Results indicate that even mental object representations learnt in one sensory modality may attain a multi-modal quality. These findings seem incompatible with picture-based reasoning schemas but leave open the possibility of modality-specific reasoning mechanisms.
Abstract:
Right across Europe, technology is playing a vital part in enhancing learning for an increasingly diverse population of learners. Learning is increasingly flexible, social and mobile, and supported by high-quality multimedia resources. Institutional VLEs are seeing a shift towards open source products, and these core systems are supplemented by a range of social and collaborative learning tools based on Web 2.0 technologies. Learners undertaking field studies and those in the workplace are coming to expect that these off-campus experiences will also be technology-rich, whether supported by institutional or user-owned devices. As well as keeping European businesses competitive, learning is seen as a means of increasing social mobility and supporting an agenda of social justice. For a number of years the EUNIS E-Learning Task Force (ELTF) has conducted snapshot surveys of e-learning across member institutions, collected case studies of good practice in e-learning (see Hayes et al., 2009), supported a group looking at the future of e-learning, and showcased the best of innovation in its e-learning Award. Now, for the first time, the ELTF membership has come together to undertake an analysis of developments in the member states and to assess what this might mean for the future. The group applied the techniques of World Café conversation and Scenario Thinking to develop its thoughts. The analysis is unashamedly qualitative and draws on expertise from leading universities across eight of the EUNIS member states. What emerges is interesting in terms of the common trends in developments in all of the nations and similarities in hopes and concerns about the future development of learning.
Abstract:
An efficient new Bayesian inference technique is employed for studying critical properties of the Ising linear perceptron and for signal detection in code division multiple access (CDMA). The approach is based on a recently introduced message passing technique for densely connected systems. Here we study both critical and non-critical regimes. Results obtained in the non-critical regime give rise to a highly efficient signal detection algorithm in the context of CDMA, while in the critical regime one observes a first-order transition line that ends in a continuous phase transition point. Finite size effects are also studied. © 2006 Elsevier B.V. All rights reserved.
Abstract:
We propose and analyze two different Bayesian online algorithms for learning in discrete Hidden Markov Models and compare their performance with the already known Baldi-Chauvin Algorithm. Using the Kullback-Leibler divergence as a measure of generalization we draw learning curves in simplified situations for these algorithms and compare their performances.
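The Kullback-Leibler divergence used as the generalization measure in this abstract has a simple form for discrete distributions. The sketch below computes it in nats for two probability vectors. The function name `kl_divergence` is hypothetical, and how the paper applies the measure to Hidden Markov Models is not specified in the abstract, so this shows only the basic definition.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats for discrete distributions.

    p and q are sequences of probabilities over the same outcomes.
    Terms with p_i == 0 contribute zero; if q_i == 0 where p_i > 0,
    the divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")

    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if q_i == 0.0:
            return math.inf
        total += p_i * math.log(p_i / q_i)
    return total
```

The divergence is zero when the learned distribution matches the true one and grows as they differ, so plotting it against the number of observed examples gives a learning curve of the kind the abstract mentions. Note that it is not symmetric in its two arguments.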
Abstract:
Improving bit error rates in optical communication systems is a difficult and important problem. The error correction must take place at high speed and be extremely accurate. We show the feasibility of using hardware implementable machine learning techniques. This may enable some error correction at the speed required.
Abstract:
The loss of habitat and biodiversity worldwide has led to considerable resources being spent for conservation purposes on actions such as the acquisition and management of land, the rehabilitation of degraded habitats, and the purchase of easements from private landowners. Prioritising these actions is challenging due to the complexity of the problem and because there can be multiple actors undertaking conservation actions, often with divergent or partially overlapping objectives. We use a modelling framework to explore this issue with a study involving two agents sequentially purchasing land for conservation. We apply our model to simulated data using distributions taken from real data to simulate the cost of patches and the rarity and co-occurrence of species. In our model each agent attempted to implement a conservation network that met its target for the minimum cost using the conservation planning software Marxan. We examine three scenarios where the conservation targets of the agents differ. The first scenario (called NGO-NGO) models the situation where two NGOs are both targeting different sets of threatened species. The second and third scenarios (called NGO-Gov and Gov-NGO, respectively) represent a case where a government agency attempts to implement a complementary conservation network representing all species, while an NGO is focused on achieving additional protection for the most endangered species. For each of these scenarios we examined three types of interactions between agents: (i) acting in isolation, where the agents attempt to achieve their targets solely through their own actions; (ii) sharing information, where each agent is aware of the species representation achieved within the other agent's conservation network; and (iii) pooling resources, where agents combine their resources and undertake conservation actions as a single entity.
The latter two interactions represent different types of collaboration, and in each scenario we determine the cost savings from sharing information or pooling resources. In each case we examined the utility of these interactions from the viewpoint of the combined conservation network resulting from both agents' actions, as well as from each agent's individual perspective. The costs for each agent to achieve its objectives varied depending on the order in which the agents acted, the type of interaction between agents, and the specific goals of each agent. There were significant cost savings from increased collaboration via sharing information in the NGO-NGO scenario, where the agents' representation goals were mutually exclusive (in terms of species targeted). In the NGO-Gov and Gov-NGO scenarios, collaboration generated much smaller savings. If the two agents collaborate by pooling resources, there are multiple ways the total cost could be shared between them. For each scenario we investigate the costs and benefits for all possible cost-sharing proportions. We find that there is a range of cost-sharing proportions where both agents can benefit in the NGO-NGO scenario, while the NGO-Gov and Gov-NGO scenarios again showed little benefit. Although the model presented here has a range of simplifying assumptions, it demonstrates that the value of collaboration can vary significantly in different situations. In most cases, collaborating would have associated costs, and these costs need to be weighed against the potential benefits from collaboration. The model demonstrates a method for determining the range of collaboration costs that would result in collaboration providing an efficient use of scarce conservation resources.
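The range of mutually beneficial cost-sharing proportions mentioned in this abstract can be written down under simple assumptions. The symbols below are not from the paper: $C_1$ and $C_2$ denote the costs the two agents would incur acting in isolation, $C_{\text{pool}}$ the total cost when they pool resources, and $p$ the share of the pooled cost paid by agent 1. An agent is assumed to benefit when it pays less than it would alone and still has its targets met.

```latex
% Agent 1 pays p * C_pool, agent 2 pays (1 - p) * C_pool.
% Both benefit when each pays less than its stand-alone cost:
\begin{align}
  p\,C_{\text{pool}} &< C_1, &
  (1 - p)\,C_{\text{pool}} &< C_2 ,
\end{align}
% which together give the admissible range of proportions
\begin{equation}
  1 - \frac{C_2}{C_{\text{pool}}} \;<\; p \;<\; \frac{C_1}{C_{\text{pool}}} .
\end{equation}
% The range is non-empty exactly when pooling is cheaper overall:
\begin{equation}
  C_{\text{pool}} < C_1 + C_2 .
\end{equation}
```

Under these assumptions, a scenario in which pooling saves little leaves only a narrow interval for $p$, which is consistent with the small benefit the abstract reports for the NGO-Gov and Gov-NGO scenarios.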
Abstract:
This dissertation investigates the important and current problem of modelling human expertise. This is an apparent issue in any computer system emulating human decision making. It is prominent in Clinical Decision Support Systems (CDSS) due to the complexity of the induction process and the vast number of parameters in most cases. Other issues such as human error and missing or incomplete data present further challenges. In this thesis, the Galatean Risk Screening Tool (GRiST) is used as an example of modelling clinical expertise and parameter elicitation. The tool is a mental health clinical record management system with a top layer of decision support capabilities. It is currently being deployed by several NHS mental health trusts across the UK. The aim of the research is to investigate the problem of parameter elicitation by inducing the parameters from real clinical data rather than from the human experts who provided the decision model. The induced parameters provide an insight into both the data relationships and how experts make decisions themselves. The outcomes help further understand human decision making and, in particular, help GRiST provide more accurate emulations of risk judgements. Although the algorithms and methods presented in this dissertation are applied to GRiST, they can be adopted for other human knowledge engineering domains.
Abstract:
Designers of self-adaptive systems often formulate adaptive design decisions, making unrealistic or myopic assumptions about the system's requirements and environment. The decisions taken during this formulation are crucial for satisfying requirements. In environments which are characterized by uncertainty and dynamism, deviation from these assumptions is the norm and may trigger 'surprises'. Our method allows designers to make explicit links between the possible emergence of surprises, risks and design trade-offs. The method can be used to explore the design decisions for self-adaptive systems and choose among decisions that better fulfil (or rather partially fulfil) non-functional requirements and address their trade-offs. The analysis can also provide designers with valuable input for refining the adaptation decisions to balance, for example, resilience (i.e. satisfiability of non-functional requirements and their trade-offs) and stability (i.e. minimizing the frequency of adaptation). The objective is to provide designers of self-adaptive systems with a basis for multi-dimensional what-if analysis to revise and improve the understanding of the environment and its effect on non-functional requirements and thereafter decision-making. We have applied the method to a wireless sensor network for flood prediction. The application shows that the method gives rise to questions that were not explicitly asked before at design-time and assists designers in the process of risk-aware, what-if and trade-off analysis.
Abstract:
The loss of habitat and biodiversity worldwide has led to considerable resources being spent on conservation interventions. Prioritising these actions is challenging due to the complexity of the problem and because there can be multiple actors undertaking conservation actions, often with divergent or partially overlapping objectives. We explore this issue with a simulation study involving two agents sequentially purchasing land for the conservation of multiple species using three scenarios comprising either divergent or partially overlapping objectives between the agents. The first scenario investigates the situation where both agents are targeting different sets of threatened species. The second and third scenarios represent a case where a government agency attempts to implement a complementary conservation network representing 200 species, while a non-government organisation is focused on achieving additional protection for the ten rarest species. Simulated input data was generated using distributions taken from real data to model the cost of parcels, and the rarity and co-occurrence of species. We investigated three types of collaborative interactions between agents: acting in isolation, sharing information and pooling resources with the third option resulting in the agents combining their resources and effectively acting as a single entity. In each scenario we determine the cost savings when an agent moves from acting in isolation to either sharing information or pooling resources with the other agent. The model demonstrates how the value of collaboration can vary significantly in different situations. In most cases, collaborating would have associated costs and these costs need to be weighed against the potential benefits from collaboration. Our model demonstrates a method for determining the range of costs that would result in collaboration providing an efficient use of scarce conservation resources.