905 resultados para Reinforcement Learning,resource-constrained devices,iOS devices,on-device machine learning


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Securing group communication in wireless sensor networks has recently been extensively investigated. Many works have addressed this issue, and they have considered the grouping concept differently. In this paper, we consider a group as being a set of nodes sensing the same data type, and we alternatively propose an efficient secure group communication scheme guaranteeing secure group management and secure group key distribution. The proposed scheme (RiSeG) is based on a logical ring architecture, which permits to alleviate the group controller’s task in updating the group key. The proposed scheme also provides backward and forward secrecy, addresses the node compromise attack, and gives a solution to detect and eliminate the compromised nodes. The security analysis and performance evaluation show that the proposed scheme is secure, highly efficient, and lightweight. A comparison with the logical key hierarchy is preformed to prove the rekeying process efficiency of RiSeG. Finally, we present the implementation details of RiSeG on top of TelosB sensor nodes to demonstrate its feasibility.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a genetic algorithm for the resource constrained multi-project scheduling problem. The chromosome representation of the problem is based on random keys. The schedules are constructed using a heuristic that builds parameterized active schedules based on priorities, delay times, and release dates defined by the genetic algorithm. The approach is tested on a set of randomly generated problems. The computational results validate the effectiveness of the proposed algorithm.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a biased random-key genetic algorithm for the resource constrained project scheduling problem. The chromosome representation of the problem is based on random keys. Active schedules are constructed using a priority-rule heuristic in which the priorities of the activities are defined by the genetic algorithm. A forward-backward improvement procedure is applied to all solutions. The chromosomes supplied by the genetic algorithm are adjusted to reflect the solutions obtained by the improvement procedure. The heuristic is tested on a set of standard problems taken from the literature and compared with other approaches. The computational results validate the effectiveness of the proposed algorithm.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a genetic algorithm for the Resource Constrained Project Scheduling Problem (RCPSP). The chromosome representation of the problem is based on random keys. The schedule is constructed using a heuristic priority rule in which the priorities of the activities are defined by the genetic algorithm. The heuristic generates parameterized active schedules. The approach was tested on a set of standard problems taken from the literature and compared with other approaches. The computational results validate the effectiveness of the proposed algorithm.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a hybrid coordination method for behavior-based control architectures. The hybrid method takes advantages of the robustness and modularity in competitive approaches as well as optimized trajectories in cooperative ones. This paper shows the feasibility of applying this hybrid method with a 3D-navigation to an autonomous underwater vehicle (AUV). The behaviors are learnt online by means of reinforcement learning. A continuous Q-learning implemented with a feed-forward neural network is employed. Realistic simulations were carried out. The results obtained show the good performance of the hybrid method on behavior coordination as well as the convergence of the behaviors

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a hybrid behavior-based scheme using reinforcement learning for high-level control of autonomous underwater vehicles (AUVs). Two main features of the presented approach are hybrid behavior coordination and semi on-line neural-Q_learning (SONQL). Hybrid behavior coordination takes advantages of robustness and modularity in the competitive approach as well as efficient trajectories in the cooperative approach. SONQL, a new continuous approach of the Q_learning algorithm with a multilayer neural network is used to learn behavior state/action mapping online. Experimental results show the feasibility of the presented approach for AUVs

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a field application of a high-level reinforcement learning (RL) control system for solving the action selection problem of an autonomous robot in cable tracking task. The learning system is characterized by using a direct policy search method for learning the internal state/action mapping. Policy only algorithms may suffer from long convergence times when dealing with real robotics. In order to speed up the process, the learning phase has been carried out in a simulated environment and, in a second step, the policy has been transferred and tested successfully on a real robot. Future steps plan to continue the learning process on-line while on the real robot while performing the mentioned task. We demonstrate its feasibility with real experiments on the underwater robot ICTINEU AUV

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abstract This thesis proposes a set of adaptive broadcast solutions and an adaptive data replication solution to support the deployment of P2P applications. P2P applications are an emerging type of distributed applications that are running on top of P2P networks. Typical P2P applications are video streaming, file sharing, etc. While interesting because they are fully distributed, P2P applications suffer from several deployment problems, due to the nature of the environment on which they perform. Indeed, defining an application on top of a P2P network often means defining an application where peers contribute resources in exchange for their ability to use the P2P application. For example, in P2P file sharing application, while the user is downloading some file, the P2P application is in parallel serving that file to other users. Such peers could have limited hardware resources, e.g., CPU, bandwidth and memory or the end-user could decide to limit the resources it dedicates to the P2P application a priori. In addition, a P2P network is typically emerged into an unreliable environment, where communication links and processes are subject to message losses and crashes, respectively. To support P2P applications, this thesis proposes a set of services that address some underlying constraints related to the nature of P2P networks. The proposed services include a set of adaptive broadcast solutions and an adaptive data replication solution that can be used as the basis of several P2P applications. Our data replication solution permits to increase availability and to reduce the communication overhead. The broadcast solutions aim, at providing a communication substrate encapsulating one of the key communication paradigms used by P2P applications: broadcast. Our broadcast solutions typically aim at offering reliability and scalability to some upper layer, be it an end-to-end P2P application or another system-level layer, such as a data replication layer. Our contributions are organized in a protocol stack made of three layers. In each layer, we propose a set of adaptive protocols that address specific constraints imposed by the environment. Each protocol is evaluated through a set of simulations. The adaptiveness aspect of our solutions relies on the fact that they take into account the constraints of the underlying system in a proactive manner. To model these constraints, we define an environment approximation algorithm allowing us to obtain an approximated view about the system or part of it. This approximated view includes the topology and the components reliability expressed in probabilistic terms. To adapt to the underlying system constraints, the proposed broadcast solutions route messages through tree overlays permitting to maximize the broadcast reliability. Here, the broadcast reliability is expressed as a function of the selected paths reliability and of the use of available resources. These resources are modeled in terms of quotas of messages translating the receiving and sending capacities at each node. To allow a deployment in a large-scale system, we take into account the available memory at processes by limiting the view they have to maintain about the system. Using this partial view, we propose three scalable broadcast algorithms, which are based on a propagation overlay that tends to the global tree overlay and adapts to some constraints of the underlying system. At a higher level, this thesis also proposes a data replication solution that is adaptive both in terms of replica placement and in terms of request routing. At the routing level, this solution takes the unreliability of the environment into account, in order to maximize reliable delivery of requests. At the replica placement level, the dynamically changing origin and frequency of read/write requests are analyzed, in order to define a set of replica that minimizes communication cost.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A reinforcement learning (RL) method was used to train a virtual character to move participants to a specified location. The virtual environment depicted an alleyway displayed through a wide field-of-view head-tracked stereo head-mounted display. Based on proxemics theory, we predicted that when the character approached within a personal or intimate distance to the participants, they would be inclined to move backwards out of the way. We carried out a between-groups experiment with 30 female participants, with 10 assigned arbitrarily to each of the following three groups: In the Intimate condition the character could approach within 0.38m and in the Social condition no nearer than 1.2m. In the Random condition the actions of the virtual character were chosen randomly from among the same set as in the RL method, and the virtual character could approach within 0.38m. The experiment continued in each case until the participant either reached the target or 7 minutes had elapsed. The distributions of the times taken to reach the target showed significant differences between the three groups, with 9 out of 10 in the Intimate condition reaching the target significantly faster than the 6 out of 10 who reached the target in the Social condition. Only 1 out of 10 in the Random condition reached the target. The experiment is an example of applied presence theory: we rely on the many findings that people tend to respond realistically in immersive virtual environments, and use this to get people to achieve a task of which they had been unaware. This method opens up the door for many such applications where the virtual environment adapts to the responses of the human participants with the aim of achieving particular goals.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A reinforcement learning (RL) method was used to train a virtual character to move participants to a specified location. The virtual environment depicted an alleyway displayed through a wide field-of-view head-tracked stereo head-mounted display. Based on proxemics theory, we predicted that when the character approached within a personal or intimate distance to the participants, they would be inclined to move backwards out of the way. We carried out a between-groups experiment with 30 female participants, with 10 assigned arbitrarily to each of the following three groups: In the Intimate condition the character could approach within 0.38m and in the Social condition no nearer than 1.2m. In the Random condition the actions of the virtual character were chosen randomly from among the same set as in the RL method, and the virtual character could approach within 0.38m. The experiment continued in each case until the participant either reached the target or 7 minutes had elapsed. The distributions of the times taken to reach the target showed significant differences between the three groups, with 9 out of 10 in the Intimate condition reaching the target significantly faster than the 6 out of 10 who reached the target in the Social condition. Only 1 out of 10 in the Random condition reached the target. The experiment is an example of applied presence theory: we rely on the many findings that people tend to respond realistically in immersive virtual environments, and use this to get people to achieve a task of which they had been unaware. This method opens up the door for many such applications where the virtual environment adapts to the responses of the human participants with the aim of achieving particular goals.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many species are able to learn to associate behaviours with rewards as this gives fitness advantages in changing environments. Social interactions between population members may, however, require more cognitive abilities than simple trial-and-error learning, in particular the capacity to make accurate hypotheses about the material payoff consequences of alternative action combinations. It is unclear in this context whether natural selection necessarily favours individuals to use information about payoffs associated with nontried actions (hypothetical payoffs), as opposed to simple reinforcement of realized payoff. Here, we develop an evolutionary model in which individuals are genetically determined to use either trial-and-error learning or learning based on hypothetical reinforcements, and ask what is the evolutionarily stable learning rule under pairwise symmetric two-action stochastic repeated games played over the individual's lifetime. We analyse through stochastic approximation theory and simulations the learning dynamics on the behavioural timescale, and derive conditions where trial-and-error learning outcompetes hypothetical reinforcement learning on the evolutionary timescale. This occurs in particular under repeated cooperative interactions with the same partner. By contrast, we find that hypothetical reinforcement learners tend to be favoured under random interactions, but stable polymorphisms can also obtain where trial-and-error learners are maintained at a low frequency. We conclude that specific game structures can select for trial-and-error learning even in the absence of costs of cognition, which illustrates that cost-free increased cognition can be counterselected under social interactions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

One major component of power system operation is generation scheduling. The objective of the work is to develop efficient control strategies to the power scheduling problems through Reinforcement Learning approaches. The three important active power scheduling problems are Unit Commitment, Economic Dispatch and Automatic Generation Control. Numerical solution methods proposed for solution of power scheduling are insufficient in handling large and complex systems. Soft Computing methods like Simulated Annealing, Evolutionary Programming etc., are efficient in handling complex cost functions, but find limitation in handling stochastic data existing in a practical system. Also the learning steps are to be repeated for each load demand which increases the computation time.Reinforcement Learning (RL) is a method of learning through interactions with environment. The main advantage of this approach is it does not require a precise mathematical formulation. It can learn either by interacting with the environment or interacting with a simulation model. Several optimization and control problems have been solved through Reinforcement Learning approach. The application of Reinforcement Learning in the field of Power system has been a few. The objective is to introduce and extend Reinforcement Learning approaches for the active power scheduling problems in an implementable manner. The main objectives can be enumerated as:(i) Evolve Reinforcement Learning based solutions to the Unit Commitment Problem.(ii) Find suitable solution strategies through Reinforcement Learning approach for Economic Dispatch. (iii) Extend the Reinforcement Learning solution to Automatic Generation Control with a different perspective. (iv) Check the suitability of the scheduling solutions to one of the existing power systems.First part of the thesis is concerned with the Reinforcement Learning approach to Unit Commitment problem. Unit Commitment Problem is formulated as a multi stage decision process. Q learning solution is developed to obtain the optimwn commitment schedule. Method of state aggregation is used to formulate an efficient solution considering the minimwn up time I down time constraints. The performance of the algorithms are evaluated for different systems and compared with other stochastic methods like Genetic Algorithm.Second stage of the work is concerned with solving Economic Dispatch problem. A simple and straight forward decision making strategy is first proposed in the Learning Automata algorithm. Then to solve the scheduling task of systems with large number of generating units, the problem is formulated as a multi stage decision making task. The solution obtained is extended in order to incorporate the transmission losses in the system. To make the Reinforcement Learning solution more efficient and to handle continuous state space, a fimction approximation strategy is proposed. The performance of the developed algorithms are tested for several standard test cases. Proposed method is compared with other recent methods like Partition Approach Algorithm, Simulated Annealing etc.As the final step of implementing the active power control loops in power system, Automatic Generation Control is also taken into consideration.Reinforcement Learning has already been applied to solve Automatic Generation Control loop. The RL solution is extended to take up the approach of common frequency for all the interconnected areas, more similar to practical systems. Performance of the RL controller is also compared with that of the conventional integral controller.In order to prove the suitability of the proposed methods to practical systems, second plant ofNeyveli Thennal Power Station (NTPS IT) is taken for case study. The perfonnance of the Reinforcement Learning solution is found to be better than the other existing methods, which provide the promising step towards RL based control schemes for practical power industry.Reinforcement Learning is applied to solve the scheduling problems in the power industry and found to give satisfactory perfonnance. Proposed solution provides a scope for getting more profit as the economic schedule is obtained instantaneously. Since Reinforcement Learning method can take the stochastic cost data obtained time to time from a plant, it gives an implementable method. As a further step, with suitable methods to interface with on line data, economic scheduling can be achieved instantaneously in a generation control center. Also power scheduling of systems with different sources such as hydro, thermal etc. can be looked into and Reinforcement Learning solutions can be achieved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Reinforcement Learning (RL) refers to a class of learning algorithms in which learning system learns which action to take in different situations by using a scalar evaluation received from the environment on performing an action. RL has been successfully applied to many multi stage decision making problem (MDP) where in each stage the learning systems decides which action has to be taken. Economic Dispatch (ED) problem is an important scheduling problem in power systems, which decides the amount of generation to be allocated to each generating unit so that the total cost of generation is minimized without violating system constraints. In this paper we formulate economic dispatch problem as a multi stage decision making problem. In this paper, we also develop RL based algorithm to solve the ED problem. The performance of our algorithm is compared with other recent methods. The main advantage of our method is it can learn the schedule for all possible demands simultaneously.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Unit commitment is an optimization task in electric power generation control sector. It involves scheduling the ON/OFF status of the generating units to meet the load demand with minimum generation cost satisfying the different constraints existing in the system. Numerical solutions developed are limited for small systems and heuristic methodologies find difficulty in handling stochastic cost functions associated with practical systems. This paper models Unit Commitment as a multi stage decision task and Reinforcement Learning solution is formulated through one efficient exploration strategy: Pursuit method. The correctness and efficiency of the developed solutions are verified for standard test systems

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the vision of Mark Weiser on ubiquitous computing, computers are disappearing from the focus of the users and are seamlessly interacting with other computers and users in order to provide information and services. This shift of computers away from direct computer interaction requires another way of applications to interact without bothering the user. Context is the information which can be used to characterize the situation of persons, locations, or other objects relevant for the applications. Context-aware applications are capable of monitoring and exploiting knowledge about external operating conditions. These applications can adapt their behaviour based on the retrieved information and thus to replace (at least a certain amount) the missing user interactions. Context awareness can be assumed to be an important ingredient for applications in ubiquitous computing environments. However, context management in ubiquitous computing environments must reflect the specific characteristics of these environments, for example distribution, mobility, resource-constrained devices, and heterogeneity of context sources. Modern mobile devices are equipped with fast processors, sufficient memory, and with several sensors, like Global Positioning System (GPS) sensor, light sensor, or accelerometer. Since many applications in ubiquitous computing environments can exploit context information for enhancing their service to the user, these devices are highly useful for context-aware applications in ubiquitous computing environments. Additionally, context reasoners and external context providers can be incorporated. It is possible that several context sensors, reasoners and context providers offer the same type of information. However, the information providers can differ in quality levels (e.g. accuracy), representations (e.g. position represented in coordinates and as an address) of the offered information, and costs (like battery consumption) for providing the information. In order to simplify the development of context-aware applications, the developers should be able to transparently access context information without bothering with underlying context accessing techniques and distribution aspects. They should rather be able to express which kind of information they require, which quality criteria this information should fulfil, and how much the provision of this information should cost (not only monetary cost but also energy or performance usage). For this purpose, application developers as well as developers of context providers need a common language and vocabulary to specify which information they require respectively they provide. These descriptions respectively criteria have to be matched. For a matching of these descriptions, it is likely that a transformation of the provided information is needed to fulfil the criteria of the context-aware application. As it is possible that more than one provider fulfils the criteria, a selection process is required. In this process the system has to trade off the provided quality of context and required costs of the context provider against the quality of context requested by the context consumer. This selection allows to turn on context sources only if required. Explicitly selecting context services and thereby dynamically activating and deactivating the local context provider has the advantage that also the resource consumption is reduced as especially unused context sensors are deactivated. One promising solution is a middleware providing appropriate support in consideration of the principles of service-oriented computing like loose coupling, abstraction, reusability, or discoverability of context providers. This allows us to abstract context sensors, context reasoners and also external context providers as context services. In this thesis we present our solution consisting of a context model and ontology, a context offer and query language, a comprehensive matching and mediation process and a selection service. Especially the matching and mediation process and the selection service differ from the existing works. The matching and mediation process allows an autonomous establishment of mediation processes in order to transfer information from an offered representation into a requested representation. In difference to other approaches, the selection service selects not only a service for a service request, it rather selects a set of services in order to fulfil all requests which also facilitates the sharing of services. The approach is extensively reviewed regarding the different requirements and a set of demonstrators shows its usability in real-world scenarios.