42 results for Reinforcement Learning, resource-constrained devices, iOS devices, on-device machine learning


Relevance:

100.00% 100.00%

Publisher:

Abstract:

To solve multi-objective problems, multiple reward signals are often scalarized into a single value and further processed using established single-objective problem-solving techniques. While the field of multi-objective optimization has made many advances in applying scalarization techniques to obtain good solution trade-offs, the utility of applying these techniques in the multi-objective multi-agent learning domain has not yet been thoroughly investigated. Agents learn the value of their decisions by linearly scalarizing their reward signals at the local level, while acceptable system-wide behaviour results. However, the non-linear relationship between the weighting parameters of the scalarization function and the learned policy makes the discovery of system-wide trade-offs time-consuming. Our first contribution is a thorough analysis of well-known scalarization schemes within the multi-objective multi-agent reinforcement learning setup. The analysed approaches intelligently explore the weight space in order to find a wider range of system trade-offs. In our second contribution, we propose a novel adaptive weight algorithm which interacts with the underlying local multi-objective solvers and allows for a better coverage of the Pareto front. Our third contribution is the experimental validation of our approach by learning bi-objective policies in self-organising smart camera networks. We note that our algorithm (i) explores the objective space faster on many problem instances, (ii) obtains solutions that exhibit a larger hypervolume, and (iii) acquires a greater spread in the objective space.
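The two quantities this abstract relies on, a linearly scalarized reward and the hypervolume of a bi-objective solution set, can be illustrated with a minimal sketch. It is not the authors' adaptive weight algorithm. The function names, the reference point and the assumption that both objectives are maximized are illustrative choices, not details taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def scalarize(rewards: Sequence[float], weights: Sequence[float]) -> float:
    """Linear scalarization: collapse a reward vector into one weighted sum."""
    if len(rewards) != len(weights):
        raise ValueError("rewards and weights must have the same length")
    return sum(w * r for w, r in zip(weights, rewards))


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Return the non-dominated points (assumption: both objectives maximized)."""
    unique = set(points)
    front = [
        p for p in unique
        if not any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in unique)
    ]
    return sorted(front)


def hypervolume_2d(points: Sequence[Point], ref: Point = (0.0, 0.0)) -> float:
    """Area dominated by the front, measured from a reference point.

    Assumes every point is at least as good as `ref` in both objectives.
    """
    area = 0.0
    best_second = ref[1]
    # Sweep from the largest first objective downwards; on a Pareto front
    # the second objective then increases, so each step adds one new strip.
    for first, second in sorted(pareto_front(points), reverse=True):
        area += (first - ref[0]) * (second - best_second)
        best_second = second
    return area
```

Sweeping the weight vector passed to `scalarize` and collecting the resulting policy returns would give a point set whose `hypervolume_2d` can be compared across weight-selection strategies.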

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The re-entrant flow shop scheduling problem (RFSP) is regarded as an NP-hard problem and has attracted the attention of both researchers and industry. Current approaches attempt to minimize the makespan of the RFSP without considering the interdependency between the resource constraints and the re-entrant probability. This paper proposes a multi-level genetic algorithm (GA) that includes the correlated re-entrant possibility and production mode in a multi-level chromosome encoding. A repair operator is incorporated in the multi-level GA to revise infeasible solutions by resolving resource conflicts. The objective is to minimize the makespan, and ANOVA is used to fine-tune the parameter settings of the GA. The experiments show that the proposed approach is more effective at finding a near-optimal schedule than the simulated annealing algorithm for both small-size and large-size problems. © 2013 Published by Elsevier Ltd.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Using data from the 2004 wave of the Afrobarometer survey, this study examines correlates of household hardship in three countries of sub-Saharan Africa: Tanzania, Zambia, and Zimbabwe. Findings provide partial support for the hypothesized relationship. Specifically, poverty reduction initiatives and informal assistance are associated with reduced hardship while civic engagement is related to an increase in household hardship. We also note that certain demographic characteristics are linked to hardship. Policy and practice implications are suggested. © The Author(s) 2011.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper reviews some basic issues and methods involved in using neural networks to respond in a desired fashion to a temporally-varying environment. Some popular network models and training methods are introduced. A speech recognition example is then used to illustrate the central difficulty of temporal data processing: learning to notice and remember relevant contextual information. Feedforward network methods are applicable to cases where this problem is not severe. The application of these methods is explained, and applications are discussed in the areas of pure mathematics, chemical and physical systems, and economic systems. A more powerful but less practical algorithm for temporal problems, the moving targets algorithm, is sketched and discussed. For completeness, a few remarks are made on reinforcement learning.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Foreign direct investment (FDI) plays a key role in development, particularly in resource-constrained transition economies of Central and Eastern Europe with relatively low savings rates. Gains from technology transfer play a critical role in motivating FDI, yet the potential for it may be hampered by a large technology gap between the source and host country. While the extent of this gap has traditionally been attributed to education, skills and capital intensity, recent literature has also emphasized the possible role of the institutional environment in this respect. Despite tremendous interest among policy-makers and academics in understanding the factors attracting FDI (Bevan and Estrin, 2000; Globerman and Shapiro, 2003), our knowledge about the effects of institutions on the location choice and ownership structure of foreign firms remains limited. This paper attempts to fill this gap in the literature by examining the link between institutions and foreign ownership structures. To the best of our knowledge, Javorcik (2004) is the only paper that uses firm-level data to analyse the role of institutional quality in an outward investor’s entry mode in transition countries. Our paper extends Javorcik (2004) in a number of ways: (a) rather than a cross-section, we use panel data for the period 1997-2006; (b) rather than a binary variable, we use the percentage of foreign ownership as a continuous variable; (c) we consider multi-dimensional institutional variables, such as corruption, intellectual property rights protection and government stability, and we use factor analysis to generate a composite index of institutional quality and see how a stronger institutional environment could affect foreign ownership; (d) we explore how the distance between the institutional environments in source and host countries affects foreign ownership in a host country.
The firm-level data used includes both domestic and foreign firms for the period 1997-2006 and is drawn from ORBIS, a commercially available dataset provided by Bureau van Dijk. In order to examine the link between institutions and foreign ownership structures, we estimate four log-linear ownership equations/specifications augmented by institutional and other control variables. We find evidence that the decision of a foreign firm to either locate its subsidiary or acquire an existing domestic firm depends not only on factor cost differences but also on differences in institutional environment between the host and source countries.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This study examined whether the effectiveness of human resource management (HRM) practices is contingent on organizational climate and competitive strategy. The concepts of internal and external fit suggest that the positive relationship between HRM and subsequent productivity will be stronger for firms with a positive organizational climate and for firms using differentiation strategies. Resource allocation theories of motivation, on the other hand, predict that the relationship between HRM and productivity will be stronger for firms with a poor climate because employees working in these firms should have the greatest amount of spare capacity. The results supported the resource allocation argument. © 2005 Southern Management Association. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

By utilizing the resource theory of social exchange (Foa & Foa, 1974), we attempted to cast light on the dynamics of the relationship between transformational-transactional leadership and employees' upward influence tactics. Using data collected at two time points (N=200, 1 year apart), we found perceptions of transformational leadership (Time 1) to be positively related to the use of soft and rational upward influence tactics (Time 2), whereas transactional leadership (Time 1) was positively related to the use of soft and hard upward influence tactics (Time 2). We also found support for a 3-way interaction between transformational-transactional leadership, relative Leader-Member Exchange (RLMX) and Perceived Organizational Support (POS) on employees' upward influence tactics. Specifically, in resource-constrained conditions (low RLMX and low POS), employees were likely to use soft tactics to influence a manager they perceived as transformational to a greater extent than in resource-munificent conditions. They were also likely to employ higher levels of soft and hard tactics to influence a transactional manager in resource-constrained rather than in resource-munificent conditions. © 2012 Elsevier Inc.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Globally, more than 1000 tonnes of titanium (Ti) is implanted into patients in the form of biomedical devices on an annual basis. Ti is perceived to be ‘biocompatible’ owing to the presence of a robust passive oxide film (approx. 4 nm thick) at the metal surface. However, surface deterioration can lead to the release of Ti ions, and particles can arise as the result of wear and/or corrosion processes. This surface deterioration can result in peri-implant inflammation, leading to the premature loss of the implanted device or the requirement for surgical revision. Soft tissues surrounding commercially pure cranial anchorage devices (bone-anchored hearing aid) were investigated using synchrotron X-ray micro-fluorescence spectroscopy and X-ray absorption near edge structure. Here, we present the first experimental evidence that minimal load-bearing Ti implants, which are not subjected to macroscopic wear processes, can release Ti debris into the surrounding soft tissue. As such debris has been shown to be pro-inflammatory, we propose that such distributions of Ti are likely to affect the service life of the device.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The internal quantum efficiency (IQE) of a two-colour monolithic white light-emitting diode (LED) was measured by temperature-dependent electroluminescence (TDEL) and analysed with a modified rate equation based on the ABC model. The external, internal and injection efficiencies of the blue and green quantum wells (QWs) were analysed separately. The monolithic white LED contained one green InGaN QW and two blue QWs separated by a GaN barrier. This paper also reports the tunable behaviour of the correlated colour temperature (CCT) in pulsed operation mode and the effect of self-heating on device performance. © 2014 SPIE.
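As background, the standard (unmodified) ABC model expresses IQE as the radiative share of the total recombination rate at carrier density n: IQE = B n^2 / (A n + B n^2 + C n^3). The sketch below evaluates that textbook form only. The paper's modified rate equation is not given in the abstract, and the coefficient values used in the usage note are hypothetical placeholders, not measured data.

```python
import math


def iqe_abc(n: float, a: float, b: float, c: float) -> float:
    """IQE from the standard ABC model at carrier density n.

    a: non-radiative (Shockley-Read-Hall) coefficient
    b: radiative coefficient
    c: third-order coefficient, often attributed to Auger recombination
    """
    nonradiative = a * n
    radiative = b * n ** 2
    third_order = c * n ** 3
    return radiative / (nonradiative + radiative + third_order)


def peak_carrier_density(a: float, c: float) -> float:
    """Carrier density at which the standard ABC IQE is maximal: sqrt(A/C)."""
    return math.sqrt(a / c)


def peak_iqe(a: float, b: float, c: float) -> float:
    """Closed-form maximum of the standard ABC IQE: B / (B + 2*sqrt(A*C))."""
    return b / (b + 2.0 * math.sqrt(a * c))
```

For example, with placeholder values A = 1e7 s^-1, B = 1e-11 cm^3/s and C = 1e-30 cm^6/s, `iqe_abc(peak_carrier_density(A, C), A, B, C)` agrees with `peak_iqe(A, B, C)`, and the efficiency drops on either side of that density.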

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Dynamic Optimization Problems (DOPs) have been widely studied using Evolutionary Algorithms (EAs). Yet, a clear and rigorous definition of DOPs is lacking in the Evolutionary Dynamic Optimization (EDO) community. In this paper, we propose a unified definition of DOPs based on the idea of multiple-decision-making discussed in the Reinforcement Learning (RL) community. We draw a connection between EDO and RL by arguing that both of them study DOPs according to our definition. We point out that existing EDO and RL research has mainly focused on some types of DOPs. A conceptualized benchmark problem, which is aimed at the systematic study of various DOPs, is then developed. Experimental studies on the benchmark reveal that EDO and RL methods are specialized in certain types of DOPs and, more importantly, that new algorithms for DOPs can be developed by combining the strengths of both EDO and RL methods.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper introduces a new technique for optimizing the trading strategy of brokers that autonomously trade in retail and wholesale markets. Simultaneous optimization of retail and wholesale strategies has been considered by existing studies as intractable. Therefore, each of these strategies is optimized separately and their interdependence is generally ignored, with resulting broker agents not aiming for a globally optimal retail and wholesale strategy. In this paper, we propose a novel formalization, based on a semi-Markov decision process (SMDP), which globally and simultaneously optimizes retail and wholesale strategies. The SMDP is solved using hierarchical reinforcement learning (HRL) in multi-agent environments. To address the curse of dimensionality, which arises when applying SMDP and HRL to complex decision problems, we propose an efficient knowledge transfer approach. This enables the reuse of learned trading skills in order to speed up the learning in new markets, at the same time as making the broker transportable across market environments. The proposed SMDP-broker has been thoroughly evaluated in two well-established multi-agent simulation environments within the Trading Agent Competition (TAC) community. Analysis of controlled experiments shows that this broker can outperform the top TAC-brokers. Moreover, our broker is able to perform well in a wide range of environments by re-using knowledge acquired in previously experienced settings.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Smart grid technologies have given rise to a liberalised and decentralised electricity market, enabling energy providers and retailers to have a better understanding of the demand side and its response to pricing signals. This paper puts forward a reinforcement-learning-powered tool aiding an electricity retailer to define the tariff prices it offers, in a bid to optimise its retail strategy. In a competitive market, an energy retailer aims to simultaneously increase the number of contracted customers and its profit margin. We have abstracted the problem of deciding on a tariff price as faced by a retailer, as a semi-Markov decision problem (SMDP). A hierarchical reinforcement learning approach, MaxQ value function decomposition, is applied to solve the SMDP through interactions with the market. To evaluate our trading strategy, we developed a retailer agent (termed AstonTAC) that uses the proposed SMDP framework to act in an open multi-agent simulation environment, the Power Trading Agent Competition (Power TAC). An evaluation and analysis of the 2013 Power TAC finals show that AstonTAC successfully selects sell prices that attract as many customers as necessary to maximise the profit margin. Moreover, during the competition, AstonTAC was the only retailer agent performing well across all retail market settings.
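To make the SMDP formulation concrete, the sketch below shows a plain tabular SMDP Q-learning update, in which a temporally extended action (for example, holding a tariff for several time steps) is backed up with a discount raised to the action's duration. This is a simpler stand-in, not the MaxQ value function decomposition used by AstonTAC. The state and action labels and the default learning rate and discount are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Sequence, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def make_q_table() -> QTable:
    """Q-values default to 0.0 for unseen (state, action) pairs."""
    return defaultdict(float)


def smdp_q_update(
    q: QTable,
    state: Hashable,
    action: Hashable,
    cumulative_reward: float,
    duration: int,
    next_state: Hashable,
    actions: Sequence[Hashable],
    alpha: float = 0.1,
    gamma: float = 0.95,
) -> float:
    """One SMDP Q-learning backup.

    cumulative_reward: discounted reward accumulated while the action ran.
    duration: number of primitive time steps the action lasted.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    # The bootstrap term is discounted by gamma**duration, not gamma,
    # because `duration` time steps elapsed before next_state was reached.
    target = cumulative_reward + (gamma ** duration) * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]
```

A hierarchical method such as MaxQ would apply backups of this kind at several levels of a task hierarchy rather than to a single flat table.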

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Adaptive critic methods have common roots as generalizations of dynamic programming for neural reinforcement learning approaches. Since they approximate the dynamic programming solutions, they are potentially suitable for learning in noisy, nonlinear and nonstationary environments. In this study, a novel probabilistic dual heuristic programming (DHP) based adaptive critic controller is proposed. Distinct from current approaches, the proposed probabilistic DHP adaptive critic method takes the uncertainties of the forward model and inverse controller into consideration. Therefore, it is suitable for deterministic and stochastic control problems characterized by functional uncertainty. The theoretical development of the proposed method is validated by analytically evaluating the correct value of the cost function which satisfies the Bellman equation in a linear quadratic control problem. The target value of the critic network is then calculated and shown to be equal to the analytically derived correct value.
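The kind of check described in the last two sentences can be illustrated on a deterministic scalar linear quadratic problem, which is a much simpler setting than the probabilistic one treated in the paper. Assume dynamics x' = a x + b u and stage cost q x^2 + r u^2, so that the optimal cost-to-go is J(x) = p x^2 with p solving the discrete Riccati equation. A DHP critic estimates the derivative dJ/dx, whose analytic value is 2 p x. All symbols and function names below are assumptions made for this sketch.

```python
def solve_scalar_riccati(a: float, b: float, q: float, r: float,
                         tol: float = 1e-12, max_iter: int = 10_000) -> float:
    """Fixed-point iteration on p = q + a^2 p - (a b p)^2 / (r + b^2 p)."""
    p = q
    for _ in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("Riccati iteration did not converge")


def optimal_gain(a: float, b: float, r: float, p: float) -> float:
    """Gain k of the optimal state feedback u = -k x."""
    return a * b * p / (r + b * b * p)


def bellman_residual(x: float, a: float, b: float, q: float, r: float,
                     p: float) -> float:
    """J(x) - [stage cost + J(x')] under the optimal control; zero if p is exact."""
    u = -optimal_gain(a, b, r, p) * x
    x_next = a * x + b * u
    return p * x * x - (q * x * x + r * u * u + p * x_next * x_next)


def dhp_critic_target(x: float, a: float, b: float, q: float, r: float,
                      p: float) -> float:
    """DHP target for dJ/dx, propagated through the model and the controller."""
    k = optimal_gain(a, b, r, p)
    u = -k * x
    x_next = a * x + b * u
    next_costate = 2.0 * p * x_next           # analytic dJ/dx at the next state
    return 2.0 * q * x + 2.0 * r * u * (-k) + next_costate * (a - b * k)
```

With p taken from `solve_scalar_riccati`, `bellman_residual` should be numerically zero and `dhp_critic_target(x, ...)` should match `2 * p * x`, mirroring the equality the abstract reports for its own probabilistic setting.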

Relevance:

100.00% 100.00%

Publisher:

Abstract:

When visual sensor networks are composed of cameras which can adjust the zoom factor of their own lens, one must determine the optimal zoom levels for the cameras for a given task. This gives rise to an important trade-off between the overlap of the different cameras’ fields of view, which provides redundancy, and image quality. In an object tracking task, having multiple cameras observe the same area allows for quicker recovery when a camera fails. In contrast, narrow zooms allow for a higher pixel count on regions of interest, leading to increased tracking confidence. In this paper we propose an approach for the self-organisation of redundancy in a distributed visual sensor network, based on decentralised multi-objective online learning using only local information to approximate the global state. We explore the impact of different zoom levels on these trade-offs when tasking omnidirectional cameras, which have a perfect 360-degree view, with keeping track of a varying number of moving objects. We further show how employing decentralised reinforcement learning enables zoom configurations to be achieved dynamically at runtime according to an operator’s preference for maximising either the proportion of objects tracked, the confidence associated with tracking, or redundancy in expectation of camera failure. We show that explicitly taking account of the level of overlap, even based only on local knowledge, improves resilience when cameras fail. Our results illustrate the trade-off between maintaining high confidence and object coverage, and maintaining redundancy in anticipation of future failure. Our approach provides a fully tunable decentralised method for the self-organisation of redundancy in a changing environment, according to an operator’s preferences.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a novel real-time power-device temperature estimation method that monitors the power MOSFET's junction temperature shift arising from thermal aging effects and incorporates the updated electrothermal models of power modules into digital controllers. Currently, the real-time estimator is emerging as an important tool for active control of device junction temperature as well as online health monitoring for power electronic systems, but its thermal model fails to address the device's ongoing degradation. Because of a mismatch of coefficients of thermal expansion between layers of power devices, repetitive thermal cycling will cause cracks, voids, and even delamination within the device components, particularly in the solder and thermal grease layers. Consequently, the thermal resistance of power devices will increase, making it possible to use thermal resistance (and junction temperature) as key indicators for condition monitoring and control purposes. In this paper, the device temperature predicted via threshold voltage measurements is compared with the real-time estimate, and the difference is attributed to the aging of the device. The thermal models in digital controllers are frequently updated to correct the shift caused by thermal aging effects. Experimental results on three power MOSFETs confirm that the proposed methodologies are effective at incorporating the thermal aging effects in the power-device temperature estimator with good accuracy. The developed adaptive technologies can be applied to other power devices such as IGBTs and SiC MOSFETs, and have significant economic implications.
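The comparison described above can be sketched with a single lumped steady-state thermal resistance, which is a strong simplification of the electrothermal models the abstract refers to. The linear threshold-voltage calibration, the function names and all numeric values in the usage note are hypothetical and only illustrate the bookkeeping.

```python
def junction_temp_from_vth(vth: float, vth_ref: float, t_ref: float,
                           k_v_per_c: float) -> float:
    """Junction temperature inferred from threshold voltage.

    Assumes a linear calibration vth = vth_ref + k_v_per_c * (T - t_ref),
    with k_v_per_c the (typically negative) temperature coefficient in V/°C.
    """
    return t_ref + (vth - vth_ref) / k_v_per_c


def estimate_junction_temp(t_case: float, power: float, r_th: float) -> float:
    """Model-based estimate: case temperature plus power loss times R_th."""
    return t_case + power * r_th


def updated_thermal_resistance(t_junction_measured: float, t_case: float,
                               power: float) -> float:
    """Re-identify R_th so the model matches the measured junction temperature."""
    if power <= 0:
        raise ValueError("power loss must be positive to identify R_th")
    return (t_junction_measured - t_case) / power
```

For instance, with a hypothetical calibration of 3.0 V at 25 °C and -5 mV/°C, a reading of 2.8 V maps to about 65 °C. If the model with R_th = 1.0 °C/W, a 40 °C case and 20 W of loss predicts 60 °C, the 5 °C gap would be attributed to aging and R_th re-identified as about 1.25 °C/W.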