3 results for Policy Learning

in Repositório Institucional UNESP - Universidade Estadual Paulista "Julio de Mesquita Filho"


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

30.00% 30.00%

Publisher:

Abstract:

On-line learning methods have been applied successfully in multi-agent systems to achieve coordination among agents. Learning in multi-agent systems implies a non-stationary scenario as perceived by the agents, since the behavior of other agents may change as they simultaneously learn how to improve their actions. Non-stationary scenarios can be modeled as Markov Games, which can be solved using the Minimax-Q algorithm, a combination of Q-learning (a Reinforcement Learning (RL) algorithm that directly learns an optimal control policy) and the Minimax algorithm. However, finding optimal control policies using any RL algorithm (Q-learning and Minimax-Q included) can be very time consuming. To improve the learning time of Q-learning, we considered the QS-algorithm, in which a single experience can update more than a single action value by using a spreading function. In this paper, we contribute a Minimax-QS algorithm, which combines the Minimax-Q algorithm and the QS-algorithm. We conduct a series of empirical evaluations of the algorithm in a simplified simulator of the soccer domain. We show that even with a very simple domain-dependent spreading function, the performance of the learning algorithm can be improved.
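
As a rough illustration of the idea in the abstract above, the following Python sketch shows a tabular update in which a single experience adjusts several action values through a spreading function. It is not the authors' implementation. The function names, the one-dimensional state layout, the `line_spread` function, and the parameters `alpha` and `gamma` are all assumptions made for illustration. The state value also uses a pure-strategy maximin as a simplification, whereas Minimax-Q is usually described as solving a linear program over mixed strategies.

```python
from collections import defaultdict


def maximin_value(Q, state, actions, opp_actions):
    """Pure-strategy maximin value of a state (a simplification, see lead-in):
    the best own action evaluated against the worst-case opponent action."""
    return max(min(Q[(state, a, o)] for o in opp_actions) for a in actions)


def minimax_qs_update(Q, s, a, o, r, s_next, states, actions, opp_actions,
                      spread, alpha=0.1, gamma=0.9):
    """Spread one experience (s, a, o, r, s_next) over similar (state, action) pairs.

    spread(s, a, x, u) is a hypothetical similarity weight in [0, 1];
    it should return 1.0 for the pair that was actually experienced.
    """
    # One-step target built from the maximin value of the next state.
    target = r + gamma * maximin_value(Q, s_next, actions, opp_actions)
    for x in states:
        for u in actions:
            w = spread(s, a, x, u)
            if w > 0.0:
                # Each similar entry moves toward the target, scaled by its weight.
                Q[(x, u, o)] += alpha * w * (target - Q[(x, u, o)])


def line_spread(s, a, x, u):
    """Assumed domain-dependent spreading: states lie on a line, and the
    weight halves with each step of distance, for the same action only."""
    return 0.5 ** abs(s - x) if u == a else 0.0


# Tiny demonstration on five states arranged in a line.
states = range(5)
actions = ["left", "right"]
opp_actions = ["block", "wait"]
Q = defaultdict(float)  # unseen entries default to 0.0
minimax_qs_update(Q, 2, "right", "block", 1.0, 3,
                  states, actions, opp_actions, line_spread)
print(sorted((k, round(v, 4)) for k, v in Q.items() if v != 0.0))
```

With a spreading function that returns 1.0 only for the experienced pair and 0.0 elsewhere, the update reduces to a single-entry update, so the spreading weights are the only thing that distinguishes the two variants in this sketch.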

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The aim of this article is to discuss whether public procurement policy can promote innovation by firms located in developing countries. The literature on technological learning is used to create a typology for assessing the impact of public procurement in developing countries from the standpoint of innovation. Petrobras, a Brazilian state-owned enterprise, was chosen as a case study. Petrobras is a global leader in the field of deepwater oil production technology and so offers an interesting opportunity to investigate whether government procurement in developing countries is used to promote the capability of domestic firms to develop innovations. The article presents the findings of a field survey on P-51, a platform that was ordered by the Brazilian state-owned enterprise and began producing in 2009. The case study is based on information collected from interviews with managers of Petrobras, EPC contractors and some of the firms subcontracted to work on P-51.