2 results for Reinforcement Learning, resource-constrained devices, iOS devices, on-device machine learning

in Repositório Institucional da Universidade de Aveiro - Portugal


Relevance:

100.00%

Publisher:

Abstract:

The digital revolution of the 21st century contributed to the emergence of the Internet of Things (IoT). Trillions of embedded devices using the Internet Protocol (IP), also called smart objects, will be an integral part of the Internet. To support such an extremely large address space, a new Internet Protocol, Internet Protocol version 6 (IPv6), is being adopted. IPv6 over Low-Power Wireless Personal Area Networks (6LoWPAN) has accelerated the integration of Wireless Sensor Networks (WSNs) into the Internet. At the same time, the Constrained Application Protocol (CoAP) has made it possible to provide resource-constrained devices with RESTful Web service functionality. This work builds upon previous experience in street lighting networks, for which a proprietary protocol, devised by the Lighting Living Lab, was implemented and used for several years. The proprietary protocol runs on a broad range of lighting control boards. To support heterogeneous applications with more demanding communication requirements and to improve the application development process, it was decided to port the Contiki OS to the four-channel LED driver (4LD) board from Globaltronic. This thesis describes the work done to adapt the Contiki OS to the Microchip PIC24FJ128GA308 microcontroller and presents an IP-based solution to integrate sensors and actuators in smart lighting applications. Besides detailing the system's architecture and implementation, the thesis presents results showing that the performance of CoAP-based resource retrieval on constrained nodes is adequate for supporting networking services in street lighting networks.
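As an illustrative aside, the RESTful retrieval pattern this abstract describes can be sketched in Python with the aiocoap library; the IPv6 node address and the /sensors/light resource path below are hypothetical stand-ins for whatever a real lighting node would expose.

    import asyncio
    from aiocoap import Context, Message, GET

    async def main():
        # Client context handling the UDP transport for CoAP.
        protocol = await Context.create_client_context()
        # Hypothetical IPv6 node address and resource path.
        request = Message(code=GET, uri='coap://[2001:db8::1]/sensors/light')
        try:
            response = await protocol.request(request).response
        except Exception as e:
            print('Request failed:', e)
        else:
            print('Result:', response.code, response.payload)

    asyncio.run(main())

A constrained node running a CoAP server such as Contiki's would answer this GET with the current representation of the resource, which is the kind of retrieval whose performance the thesis measures.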

Relevance:

100.00%

Publisher:

Abstract:

This thesis addresses Batch Reinforcement Learning methods in Robotics. This sub-class of Reinforcement Learning has shown promising results and has been the focus of recent research. Three contributions are proposed that aim to extend state-of-the-art methods, allowing for the faster and more stable learning process required for learning in Robotics. The Q-learning update rule is widely applied, since it makes it possible to learn without a model of the environment. However, this update rule is transition-based and does not take advantage of the underlying episodic structure of the collected batch of interactions. The Q-Batch update rule is proposed in this thesis to process experiences along the trajectories collected in the interaction phase. This allows faster propagation of the obtained rewards and penalties, resulting in faster and more robust learning. Non-parametric function approximators, such as Gaussian Processes, are also explored. This type of approximator makes it possible to encode prior knowledge about the latent function in the form of kernels, providing greater flexibility and accuracy. The application of Gaussian Processes in Batch Reinforcement Learning yielded higher performance on learning tasks than other function approximators used in the literature. Lastly, in order to extract more information from the experiences collected by the agent, model-learning techniques are incorporated to learn the system dynamics. In this way, the set of collected experiences can be augmented with experiences generated through planning with the learned models. Experiments were carried out mainly in simulation, with some tests on a physical robotic platform. The results show that the proposed approaches outperform classical Fitted Q Iteration.
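To make the contrast concrete, here is a minimal tabular sketch, in Python, of a transition-based Q-learning sweep next to a trajectory-ordered sweep that processes each episode backwards. The second function only conveys the episodic-structure idea behind Q-Batch; the exact Q-Batch update rule is defined in the thesis.

    import numpy as np

    def q_learning_sweep(Q, transitions, alpha=0.1, gamma=0.99):
        # Transition-based rule: each (s, a, r, s') is processed in isolation.
        for s, a, r, s_next in transitions:
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        return Q

    def trajectory_sweep(Q, episodes, alpha=0.1, gamma=0.99):
        # Trajectory-ordered rule (illustrative): walking each episode from its
        # last transition to its first lets a terminal reward reach early
        # states within a single pass over the batch.
        for episode in episodes:
            for s, a, r, s_next in reversed(episode):
                Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        return Q

    # Toy usage: 3 states, 2 actions, one episode rewarding only at the end.
    Q = np.zeros((3, 2))
    episode = [(0, 1, 0.0, 1), (1, 1, 0.0, 2), (2, 0, 1.0, 2)]
    trajectory_sweep(Q, [episode])
    print(Q)  # the reward has already propagated back toward state 0

With the transition-based sweep, a reward earned at the end of a T-step episode can take up to T passes over the batch to influence the value of the initial state; ordering the updates along the trajectory shortens that propagation, which matches the abstract's claim of faster reward propagation.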