10 results for Regret.
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
We construct an uncoupled randomized strategy of repeated play such that, if every player follows such a strategy, then the joint mixed strategy profiles converge, almost surely, to a Nash equilibrium of the one-shot game. The procedure requires very little in terms of players' information about the game. In fact, players' actions are based only on their own past payoffs and, in a variant of the strategy, players need not even know that their payoffs are determined through other players' actions. The procedure works for general finite games and is based on appropriate modifications of a simple stochastic learning rule introduced by Foster and Young.
Abstract:
This paper introduces a new solution concept, a minimax regret equilibrium, which allows for the possibility that players are uncertain about the rationality and conjectures of their opponents. We provide several applications of our concept. In particular, we consider price-setting environments and show that the optimal pricing policy follows a non-degenerate distribution. The induced price dispersion is consistent with experimental and empirical observations (Baye and Morgan (2004)).
Abstract:
In this article we investigate the factors that lead Spanish and Dutch university graduates to regret the course of study they completed. Spain and the Netherlands have very different education systems in terms of the rigidity of secondary education and the link between education and the labour market. Comparing Spain and the Netherlands allows us to learn about the consequences of two very different education systems for the probability of regretting one's studies. Drawing on the psychological literature on regret, we derive a set of hypotheses that we test empirically. The results show that both the rigidity of secondary education and the mismatch between education and employment are important factors in explaining regret over university studies. The article concludes with recommendations for the university education system.
Abstract:
This paper determines the effects of post-trade opaqueness on market performance. We find that the degree of market transparency has important effects on market equilibria. In particular, we show that dealers operating in a transparent structure set regret-free prices at each period making zero expected profits in each of the two trading rounds, whereas in the opaque market dealers invest in acquiring information at the beginning of the trading day. Moreover, we obtain that if there is no trading activity in the first period, then market makers only change their quotes in the opaque market. Additionally, we show that trade disclosure increases the informational efficiency of transaction prices and reduces volatility. Finally, concerning welfare of market participants, we obtain ambiguous results. Keywords: Market microstructure, Post-trade transparency, Price experimentation, Price dispersion.
Abstract:
We obtain minimax lower bounds on the regret for the classical two-armed bandit problem. We provide a finite-sample minimax version of the well-known $\log n$ asymptotic lower bound of Lai and Robbins. Also, in contrast to the $\log n$ asymptotic results on the regret, we show that the minimax regret is achieved by mere random guessing under fairly mild conditions on the set of allowable configurations of the two arms. That is, we show that for every allocation rule and for every $n$, there is a configuration such that the regret at time $n$ is at least $1 - \epsilon$ times the regret of random guessing, where $\epsilon$ is any small positive constant.
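For reference, the comparison with random guessing can be written out under one common definition of regret. The symbols $\mu_1, \mu_2$ (mean rewards of the two arms), $I_t$ (the arm chosen at time $t$), $R_n$ and $R_n^{\mathrm{guess}}$ are notation assumed here, not taken from the abstract, and the paper's own definition may differ in detail.

```latex
\[
R_n \;=\; n \max(\mu_1, \mu_2) \;-\; \mathbb{E}\Big[\sum_{t=1}^{n} \mu_{I_t}\Big],
\qquad
R_n^{\mathrm{guess}} \;=\; \frac{n}{2}\,\lvert \mu_1 - \mu_2 \rvert .
\]
% Stated result, in this notation: for every allocation rule and every n
% there is a configuration (\mu_1, \mu_2) with
\[
R_n \;\ge\; (1 - \epsilon)\, R_n^{\mathrm{guess}} .
\]
```

Here $R_n^{\mathrm{guess}}$ is the regret of a rule that picks each arm with probability 1/2 in every period, so it pulls the worse arm $n/2$ times in expectation.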
Abstract:
We exhibit and characterize an entire class of simple adaptive strategies, in the repeated play of a game, having the Hannan-consistency property: In the long run, the player is guaranteed an average payoff as large as the best-reply payoff to the empirical distribution of play of the other players; i.e., there is no "regret." Smooth fictitious play (Fudenberg and Levine [1995]) and regret-matching (Hart and Mas-Colell [1998]) are particular cases. The motivation and application of this work come from the study of procedures whose empirical distribution of play is, in the long run, (almost) a correlated equilibrium. The basic tool for the analysis is a generalization of Blackwell's [1956a] approachability strategy for games with vector payoffs.
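The Hannan-consistency property described in words above is usually stated as the following condition. The notation is assumed for illustration: $u$ is the player's payoff function, $A$ the player's finite action set, $a_t$ the player's action and $s_t$ the other players' action profile in period $t$.

```latex
\[
\limsup_{T \to \infty}
\left(
  \max_{a \in A} \frac{1}{T} \sum_{t=1}^{T} u(a, s_t)
  \;-\;
  \frac{1}{T} \sum_{t=1}^{T} u(a_t, s_t)
\right)
\;\le\; 0
\qquad \text{almost surely.}
\]
```

The first term is the best-reply payoff against the empirical distribution of the others' play up to time $T$; the second is the average payoff actually obtained.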
Abstract:
We propose a simple adaptive procedure for playing a game. In this procedure, players depart from their current play with probabilities that are proportional to measures of regret for not having used other strategies (these measures are updated every period). It is shown that our adaptive procedure guarantees that, with probability one, the sample distributions of play converge to the set of correlated equilibria of the game. To compute these regret measures, a player needs to know his payoff function and the history of play. We also offer a variation where every player knows only his own realized payoff history (but not his payoff function).
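The switching rule described above can be sketched for a single player as follows. This is a minimal illustration, not the authors' code: the function names, the table `cum_diff`, and the normalizing constant `mu` are assumptions, and `mu` is assumed to be chosen large enough that the probability of staying with the current action remains positive.

```python
def update_regrets(cum_diff, payoff, played, opp_action):
    """Record one period of play.

    cum_diff[j][k] accumulates the payoff the player would have gained
    by using k instead of j in every past period where j was played.
    payoff[a][b] is the player's payoff for own action a against
    opponent action b (the player is assumed to know this table).
    """
    for k in range(len(cum_diff)):
        cum_diff[played][k] += payoff[k][opp_action] - payoff[played][opp_action]


def next_play_probs(cum_diff, current, t, mu):
    """Probabilities for next period's action after t periods of play.

    The probability of switching from `current` to k is the positive
    part of the average regret for not having used k, divided by mu.
    The remaining probability mass stays on `current`.
    """
    n = len(cum_diff)
    probs = [0.0] * n
    for k in range(n):
        if k != current:
            avg_regret = max(cum_diff[current][k] / t, 0.0)
            probs[k] = avg_regret / mu
    probs[current] = 1.0 - sum(probs)
    return probs


# Example: matching pennies from the row player's point of view.
payoff = [[1, -1], [-1, 1]]
cum_diff = [[0.0, 0.0], [0.0, 0.0]]
update_regrets(cum_diff, payoff, played=0, opp_action=1)
probs = next_play_probs(cum_diff, current=0, t=1, mu=4.0)
```

In the example the player lost with action 0 and would have gained 2 by playing action 1, so with `mu = 4.0` the player switches with probability 0.5.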
Abstract:
We investigate on-line prediction of individual sequences. Given a class of predictors, the goal is to predict as well as the best predictor in the class, where the loss is measured by the self-information (logarithmic) loss function. The excess loss (regret) is closely related to the redundancy of the associated lossless universal code. Using Shtarkov's theorem and tools from empirical process theory, we prove a general upper bound on the best possible (minimax) regret. The bound depends on certain metric properties of the class of predictors. We apply the bound to both parametric and nonparametric classes of predictors. Finally, we point out a suboptimal behavior of the popular Bayesian weighted average algorithm.
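As a reminder of the result the abstract builds on, Shtarkov's theorem gives the minimax regret under logarithmic loss in closed form. The notation is assumed here: $\mathcal{F}$ is the class of predictors, each viewed as a probability distribution $f$ over sequences $y^n$ from a finite alphabet $\mathcal{Y}$, $q$ ranges over all such distributions, and $V_n$ denotes the minimax regret.

```latex
\[
V_n(\mathcal{F})
\;=\;
\min_{q} \, \max_{y^n \in \mathcal{Y}^n}
\left(
  \log \frac{1}{q(y^n)}
  \;-\;
  \inf_{f \in \mathcal{F}} \log \frac{1}{f(y^n)}
\right)
\;=\;
\log \sum_{y^n \in \mathcal{Y}^n} \, \sup_{f \in \mathcal{F}} f(y^n).
\]
```

The minimum is attained by the normalized maximum likelihood distribution, which assigns each sequence a probability proportional to $\sup_{f \in \mathcal{F}} f(y^n)$.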
Abstract:
We consider an agent who has to repeatedly make choices in an uncertain and changing environment, who has full information of the past, who discounts future payoffs, but who has no prior. We provide a learning algorithm that performs almost as well as the best of a given finite number of experts or benchmark strategies and does so at any point in time, provided the agent is sufficiently patient. The key is to find the appropriate degree of forgetting the distant past. Standard learning algorithms that treat recent and distant past equally do not have the sequential epsilon optimality property.
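The idea of forgetting the distant past can be illustrated with a generic exponentially weighted forecaster that discounts old losses. This is a standard textbook-style sketch, not the algorithm of the paper: the function name and the parameters `beta` (forgetting factor) and `eta` (learning rate) are assumptions, and losses are used in place of payoffs.

```python
import math


def discounted_hedge(loss_rounds, beta, eta):
    """Return the final weights over experts after all rounds.

    loss_rounds: list of rounds, each a list with one loss per expert.
    beta: forgetting factor in [0, 1]; beta = 1 treats recent and
          distant past equally, smaller values forget faster.
    eta:  learning rate of the exponential weighting.
    """
    n = len(loss_rounds[0])
    disc_loss = [0.0] * n
    weights = [1.0 / n] * n
    for losses in loss_rounds:
        # Shrink the old cumulative loss, then add the new loss.
        disc_loss = [beta * d + l for d, l in zip(disc_loss, losses)]
        # Subtract the minimum before exponentiating for numerical stability.
        low = min(disc_loss)
        raw = [math.exp(-eta * (d - low)) for d in disc_loss]
        total = sum(raw)
        weights = [r / total for r in raw]
    return weights


# Expert 0 was bad long ago but good recently; expert 1 the reverse.
history = [[1.0, 0.0]] * 5 + [[0.0, 1.0]] * 5
w_equal = discounted_hedge(history, beta=1.0, eta=1.0)
w_forget = discounted_hedge(history, beta=0.5, eta=1.0)
```

With `beta = 1.0` both experts have the same cumulative loss and receive equal weight, while with `beta = 0.5` the recently better expert 0 receives more weight.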