7 results for REWARD
in Repositório Científico do Instituto Politécnico de Lisboa - Portugal
Abstract:
We study the connections between different measures of risk, such as the standard deviation, the W-ruin probability and the p-V@R, in particular when the random variables involved are normally distributed. We discuss conditions that guarantee the equivalence of these measures with respect to risk preference relations, and the equivalence of dominance and efficiency for risk-reward criteria involving these measures. We then apply these concepts to give a rigorous treatment of the problem of finding the efficient set of de Finetti’s variable quota share proportional reinsurance.
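As an illustration of why these measures can rank risks identically under normality, the following minimal sketch compares them for normally distributed gains with a common mean. The definitions used here are assumptions made for the example, not taken from the paper: the W-ruin probability is read as P(G < -W) for a gain G and capital W, and the p-V@R as the capital level v for which P(G < -v) = p. The function names and numeric values are hypothetical.

```python
from statistics import NormalDist

def ruin_probability(mu, sigma, W):
    """Assumed definition: P(G < -W) for a gain G ~ N(mu, sigma^2)."""
    return NormalDist(mu, sigma).cdf(-W)

def value_at_risk(mu, sigma, p):
    """Assumed definition: capital v with P(G < -v) = p, i.e. minus the p-quantile of G."""
    return -NormalDist(mu, sigma).inv_cdf(p)

# Hypothetical parameters: same expected gain, increasing standard deviation.
mu, W, p = 1.0, 5.0, 0.01
sigmas = [1.0, 2.0, 3.0]

ruins = [ruin_probability(mu, s, W) for s in sigmas]
vars_ = [value_at_risk(mu, s, p) for s in sigmas]

for s, r, v in zip(sigmas, ruins, vars_):
    print(f"sigma={s:.1f}  ruin probability={r:.3e}  V@R={v:.3f}")
```

With these particular values (p below 1/2 and -W below the mean), both the ruin probability and the V@R increase with the standard deviation, so the three measures order the risks the same way. For other choices of W or p the ordering can differ, which is why conditions for equivalence are needed.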
Abstract:
Reinforcement Learning is an area of Machine Learning concerned with how an agent should act in an environment so as to maximize some notion of cumulative reward. This type of learning is inspired by the way humans learn and has led to the creation of various reinforcement learning algorithms. These algorithms focus on how an agent’s behaviour can be improved, assuming independence from their surroundings. The current work studies the application of reinforcement learning methods to the inverted pendulum problem. The importance of the variability of the environment (factors external to the agent) for the performance of reinforcement learning agents is studied using a model that seeks to obtain equilibrium (stability) through dynamism: a Cart-Pole system, or inverted pendulum. Instead of following the classical approach of tuning the agent’s internal parameters, we sought to improve the behaviour of the autonomous agents by changing the information passed to them while keeping those internal parameters (learning rate, discount factors, decay rate, etc.) constant. We studied how changes to the state set and the action set influence an agent’s ability to solve the Cart-Pole problem. We studied the typical behaviour of reinforcement learning agents applied to the classic BOXES model, and we propose a new way of characterizing the environment based on the notion of convergence towards a reference value. We demonstrate the performance gain of this new method when applied to a Q-Learning agent.
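To make the setting concrete, here is a minimal sketch of a standard tabular Q-Learning step over a BOXES-style discretization of the Cart-Pole state. It is not the method proposed in the work (the reference-value characterization is not reproduced here); the bin edges, the parameter values and all function names are hypothetical choices made for illustration.

```python
import random
from bisect import bisect_right
from collections import defaultdict

# Hypothetical bin edges for (cart position, cart velocity,
# pole angle, pole angular velocity); each state maps to one "box".
BIN_EDGES = [
    [-0.8, 0.8],
    [-0.5, 0.5],
    [-0.1, -0.02, 0.0, 0.02, 0.1],
    [-0.87, 0.87],
]
ACTIONS = (0, 1)  # 0: push left, 1: push right

def to_box(observation):
    """Map a continuous observation to a tuple of bin indices."""
    return tuple(bisect_right(edges, x) for edges, x in zip(BIN_EDGES, observation))

def choose_action(q, state, epsilon, rng=random):
    """Epsilon-greedy selection over the tabular Q-values."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """One Q-Learning step: move Q(s, a) towards r + gamma * max_a' Q(s', a')."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])

# Single illustrative transition with made-up observations.
q = defaultdict(float)
s = to_box((0.0, 0.0, 0.01, 0.0))
s2 = to_box((0.0, 0.1, 0.03, 0.2))
q_update(q, s, 1, 1.0, s2, done=False)
```

In this sketch, changing `BIN_EDGES` or `ACTIONS` alters the information passed to the agent while the learning rate `alpha` and discount factor `gamma` stay fixed, which is the kind of variation the abstract describes.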
Abstract:
Master's in Business Control and Management
Abstract:
Master's in Accounting and Management of Financial Institutions
Abstract:
Master's in Business Control and Management
Abstract:
Dissertation submitted to the Escola Superior de Comunicação Social in partial fulfilment of the requirements for the degree of Master in Strategic Management of Public Relations.
Abstract:
Master's in Accounting and Management of Financial Institutions