137 results for Regret


Relevance:

20.00% 20.00%

Publisher:

Abstract:

We study the regret of optimal strategies for online convex optimization games. Using von Neumann's minimax theorem, we show that the optimal regret in this adversarial setting is closely related to the behavior of the empirical minimization algorithm in a stochastic process setting: it is equal to the maximum, over joint distributions of the adversary's action sequence, of the difference between a sum of minimal expected losses and the minimal empirical loss. We show that the optimal regret has a natural geometric interpretation, since it can be viewed as the gap in Jensen's inequality for a concave functional--the minimizer over the player's actions of expected loss--defined on a set of probability distributions. We use this expression to obtain upper and lower bounds on the regret of an optimal strategy for a variety of online learning problems. Our method provides upper bounds without the need to construct a learning algorithm; the lower bounds provide explicit optimal strategies for the adversary. Peter L. Bartlett, Alexander Rakhlin
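The identity described in this abstract can be sketched in symbols. The notation is an assumption, not taken from the listing: $\ell(f, x)$ is the loss of player action $f \in \mathcal{F}$ against adversary action $x$, $T$ is the number of rounds, and $p$ ranges over joint distributions of the adversary's sequence $(x_1, \dots, x_T)$. Reading "minimal expected losses" as conditional on the past is also an assumption.

```latex
R_T \;=\; \sup_{p}\; \mathbb{E}_{(x_1,\dots,x_T)\sim p}
\left[
  \sum_{t=1}^{T} \inf_{f_t \in \mathcal{F}}
    \mathbb{E}\bigl[\ell(f_t, x_t) \,\big|\, x_1,\dots,x_{t-1}\bigr]
  \;-\;
  \inf_{f \in \mathcal{F}} \sum_{t=1}^{T} \ell(f, x_t)
\right]
```

The first term is the sum of minimal expected losses and the second is the minimal empirical loss. Writing $\Phi(q) = \inf_{f \in \mathcal{F}} \mathbb{E}_{x \sim q}\,\ell(f, x)$, which is concave in $q$ as an infimum of linear functionals, the bracketed difference is the kind of Jensen gap the abstract refers to.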

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We demonstrate a modification of the algorithm of Dani et al. for the online linear optimization problem in the bandit setting, which allows us to achieve an O(\sqrt{T \ln T}) regret bound with high probability against an adaptive adversary, as opposed to the in-expectation result against an oblivious adversary of Dani et al. We obtain the same dependence on the dimension as that exhibited by Dani et al. The results of this paper rest firmly on those of Dani et al. and the remarkable technique of Auer et al. for obtaining high-probability bounds via optimistic estimates. This paper answers an open question: it eliminates the gap between the high-probability bounds obtained in the full-information vs bandit settings.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). OLP uses its experience so far to estimate the MDP. It chooses actions by optimistically maximizing estimated future rewards over a set of next-state transition probabilities that are close to the estimates, a computation that corresponds to solving linear programs. We show that the total expected reward obtained by OLP up to time T is within C(P) log T of the reward obtained by the optimal policy, where C(P) is an explicit, MDP-dependent constant. OLP is closely related to an algorithm proposed by Burnetas and Katehakis with four key differences: OLP is simpler, it does not require knowledge of the supports of transition probabilities, the proof of the regret bound is simpler, but our regret bound is a constant factor larger than the regret of their algorithm. OLP is also similar in flavor to an algorithm recently proposed by Auer and Ortner. But OLP is simpler and its regret bound has a better dependence on the size of the MDP.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present a modification of the algorithm of Dani et al. [8] for the online linear optimization problem in the bandit setting, which with high probability has regret at most O*(√T) against an adaptive adversary. This improves on the previous algorithm [8] whose regret is bounded in expectation against an oblivious adversary. We obtain the same dependence on the dimension (n^{3/2}) as that exhibited by Dani et al. The results of this paper rest firmly on those of [8] and the remarkable technique of Auer et al. [2] for obtaining high-probability bounds via optimistic estimates. This paper answers an open question: it eliminates the gap between the high-probability bounds obtained in the full-information vs bandit settings.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Purpose: Self-gifting is a performative process in which consumers purchase products for themselves. The literature to date remains silent on a determination and connection between the extents of post-purchase regret resulting from self-gifting behavior. The purpose of this paper is to examine identification and connection of self-gifting antecedents, self-gifting and the effect on post-purchase regret.
Design/methodology/approach: This study claims the two antecedents of hedonistic shopping and indulgence drive self-gifting behaviors and the attendant regret. A total of 307 shoppers responded to a series of statements concerning the relationships between antecedents of self-gifting behavior and the effect on post-purchase regret. Self-gifting is a multi-dimensional construct, consisting of therapeutic, celebratory, reward and hedonistic imports. Confirmatory factor analysis and AMOS path modeling enabled examination of relationships between the consumer traits of hedonistic shopping and indulgence and the four self-gifting concepts.
Findings: Hedonic and indulgent shoppers engage in self-gifting for different reasons. A strong and positive relationship was identified between hedonic shoppers and reward, hedonic, therapeutic and celebratory self-gift motivations. Hedonic shoppers aligned with indulgent shoppers who also engaged the four self-gifting concepts. The only regret concerning purchase of self-gifts was evident in the therapeutic and celebratory self-gift motivations.
Research limitations/implications: A major limitation was the age range specification of 18 to 45 years which meant the omission of older generations of regular and experienced shoppers. This study emphasizes the importance of variations in self-gift behaviors and of post-purchase consumer regret.
Originality/value: This research is the first examination of a hedonic attitude to shopping and indulgent antecedents to self-gift purchasing, the concepts of self-gift motivations and their effect on post-purchase regret.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Performance guarantees for online learning algorithms typically take the form of regret bounds, which express that the cumulative loss overhead compared to the best expert in hindsight is small. In the common case of large but structured expert sets we typically wish to keep the regret especially small compared to simple experts, at the cost of modest additional overhead compared to more complex others. We study which such regret trade-offs can be achieved, and how. We analyse regret w.r.t. each individual expert as a multi-objective criterion in the simple but fundamental case of absolute loss. We characterise the achievable and Pareto optimal trade-offs, and the corresponding optimal strategies for each sample size both exactly for each finite horizon and asymptotically.
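As a minimal sketch of the quantity this abstract studies, the snippet below computes the regret of a learner with respect to each individual expert under absolute loss. The function names, the two constant experts, and the fixed prediction of 0.5 are hypothetical choices for illustration; the abstract does not specify a strategy.

```python
from typing import List, Sequence


def cumulative_absolute_loss(predictions: Sequence[float],
                             outcomes: Sequence[int]) -> float:
    """Sum of |prediction - outcome| over all rounds."""
    return sum(abs(p - y) for p, y in zip(predictions, outcomes))


def regret_per_expert(learner: Sequence[float],
                      experts: Sequence[Sequence[float]],
                      outcomes: Sequence[int]) -> List[float]:
    """Regret w.r.t. each expert: learner's loss minus that expert's loss."""
    learner_loss = cumulative_absolute_loss(learner, outcomes)
    return [learner_loss - cumulative_absolute_loss(expert, outcomes)
            for expert in experts]


# Hypothetical toy data: binary outcomes and two constant experts.
outcomes = [1, 0, 1, 1]
experts = [[0.0] * 4,   # expert that always predicts 0
           [1.0] * 4]   # expert that always predicts 1
learner = [0.5] * 4     # assumed learner: always predicts 0.5

regrets = regret_per_expert(learner, experts, outcomes)
worst_case_regret = max(regrets)  # the usual single-number regret
```

Keeping the whole vector `regrets`, rather than only its maximum, is what makes it possible to treat regret as a multi-objective criterion and trade a smaller regret against one expert for a larger regret against another.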

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The problem of bipartite ranking, where instances are labeled positive or negative and the goal is to learn a scoring function that minimizes the probability of mis-ranking a pair of positive and negative instances (or equivalently, that maximizes the area under the ROC curve), has been widely studied in recent years. A dominant theoretical and algorithmic framework for the problem has been to reduce bipartite ranking to pairwise classification; in particular, it is well known that the bipartite ranking regret can be formulated as a pairwise classification regret, which in turn can be upper bounded using usual regret bounds for classification problems. Recently, Kotlowski et al. (2011) showed regret bounds for bipartite ranking in terms of the regret associated with balanced versions of the standard (non-pairwise) logistic and exponential losses. In this paper, we show that such (non-pairwise) surrogate regret bounds for bipartite ranking can be obtained in terms of a broad class of proper (composite) losses that we term as strongly proper. Our proof technique is much simpler than that of Kotlowski et al. (2011), and relies on properties of proper (composite) losses as elucidated recently by Reid and Williamson (2010, 2011) and others. Our result yields explicit surrogate bounds (with no hidden balancing terms) in terms of a variety of strongly proper losses, including for example logistic, exponential, squared and squared hinge losses as special cases. An important consequence is that standard algorithms minimizing a (non-pairwise) strongly proper loss, such as logistic regression and boosting algorithms (assuming a universal function class and appropriate regularization), are in fact consistent for bipartite ranking; moreover, our results allow us to quantify the bipartite ranking regret in terms of the corresponding surrogate regret. We also obtain tighter surrogate bounds under certain low-noise conditions via a recent result of Clemencon and Robbiano (2011).
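To make the ranking objective in this abstract concrete, here is a small sketch that computes the empirical probability of mis-ranking a positive-negative pair; one minus that value is the empirical area under the ROC curve. The function name is hypothetical, and counting a tie as half an error is an assumed convention rather than something the abstract specifies.

```python
from itertools import product
from typing import Sequence


def misranking_rate(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (positive, negative) pairs ranked in the wrong order.

    A pair is mis-ranked when the positive instance scores below the
    negative one. Ties count as half an error (assumed convention).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative instance")

    errors = 0.0
    for s_pos, s_neg in product(positives, negatives):
        if s_pos < s_neg:
            errors += 1.0
        elif s_pos == s_neg:
            errors += 0.5
    return errors / (len(positives) * len(negatives))


# Hypothetical scores from some scoring function, with binary labels.
scores = [0.9, 0.8, 0.3, 0.6, 0.2]
labels = [1, 0, 1, 1, 0]

rate = misranking_rate(scores, labels)
auc = 1.0 - rate  # empirical area under the ROC curve
```

This pairwise loop costs time proportional to the number of positive-negative pairs, which is fine for illustration; the non-pairwise surrogate losses discussed in the abstract avoid forming pairs during training.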

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We apply an autobiographical memory framework to the study of regret. Focusing on the distinction between regrets for specific and general events we argue that the temporal profile of regret, usually explained in terms of the action-inaction distinction, is predicted by models of autobiographical memory. In two studies involving participants in their sixties we demonstrate a reminiscence bump for general, but not for specific regrets. Recent regrets were more likely to be specific than general in nature. Coding regrets as actions/inactions revealed that general regrets were significantly more likely to be due to inaction while specific regrets were as likely to be due to action as to inaction. In Study 2 we also generalised all of these findings to a group of participants in their 40s. We re-interpret existing accounts of the temporal profile of regret within the autobiographical memory framework, and outline the practical and theoretical advantages of our memory-based distinction over traditional decision-making approaches to the study of regret. (C) 2008 Elsevier Inc. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

People tend to attribute more regret to a character who has decided to take action and experienced a negative outcome than to one who has decided not to act and experienced a negative outcome. For some decisions, however, this finding is not observed in a between-participants design and thus appears to rely on comparisons between people's representations of action and their representations of inaction. In this article, we outline a mental models account that explains findings from studies that have used within- and between-participants designs, and we suggest that, for decisions with uncertain counterfactual outcomes, information about the consequences of a decision to act causes people to flesh out their representation of the counterfactual states of affairs for inaction. In three experiments, we confirm our predictions about participants' fleshing out of representations, demonstrating that an action effect occurs only when information about the consequences of action is available to participants as they rate the nonactor and when this information about action is informative with respect to judgments about inaction. It is important to note that the action effect always occurs when the decision scenario specifies certain counterfactual outcomes. These results suggest that people sometimes base their attributions of regret on comparisons among different sets of mental models.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Previous accounts of regret suggest that people report greater regret for inaction than for action because the former is longer lasting and more painful than the latter. We suggest instead that the tendency for people's greatest regrets to concern inaction more than action may be due to the relatively self-enhancing nature of regrets for inaction. In Study 1 we asked people to think about their greatest recent regret and to code it as being due to action or inaction. In Study 2 participants described their greatest regret from across their entire life. In both studies we observed an inaction effect only amongst individuals high in self-esteem (HSE). In Study 2 we found that the inaction effect was confined to HSE people whose greatest regret was personal in nature. These results support the claim that regret for inaction is relatively self-enhancing and suggest that the inaction effect found in real-life regrets may be due, in part at least, to the self-enhancement goals of HSE individuals. Copyright (c) 2005 John Wiley & Sons, Ltd.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In two experiments, 4- to 9-year-olds played a game in which they selected one of two boxes to win a prize. On regret trials the unchosen box contained a better prize than the prize children actually won, and on baseline trials the other box contained a prize of the same value. Children rated their feelings about their prize before and after seeing what they could have won if they had chosen the other box and were asked to provide an explanation if their feelings had changed. Patterns of responding suggested that regret was experienced by 6 or 7 years of age; children of this age could also explain why they felt worse in regret trials by referring to the counterfactual situation in which the prize was better. No evidence of regret was found in 4- and 5-year-olds. Additional findings suggested that by 6 or 7 years, children’s emotions were determined by a consideration of two different counterfactual scenarios.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper compares the Random Regret Minimization and the Random Utility Maximization models for determining recreational choice. The Random Regret approach is based on the idea that, when choosing, individuals aim to minimize their regret, regret being defined as what one experiences when a non-chosen alternative in a choice set performs better than a chosen one in relation to one or more attributes. The Random Regret paradigm, recently developed in transport economics, presents a tractable, regret-based alternative to the dominant choice paradigm based on Random Utility. Using data from a travel cost study exploring factors that influence kayakers’ site-choice decisions in the Republic of Ireland, we estimate both the traditional Random Utility multinomial logit model (RU-MNL) and the Random Regret multinomial logit model (RR-MNL) to gain more insights into site choice decisions. We further explore whether choices are driven by a utility maximization or a regret minimization paradigm by running a binary logit model to examine the likelihood of the two decision choice paradigms using site visits and respondents' characteristics as explanatory variables. In addition to being one of the first studies to apply the RR-MNL to an environmental good, this paper also represents the first application of the RR-MNL to compute the Logsum to test and strengthen conclusions on welfare impacts of potential alternative policy scenarios.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper introduces the discrete choice model-paradigm of Random Regret Minimization (RRM) to the field of environmental and resource economics. The RRM-approach has been very recently developed in the context of travel demand modelling and presents a tractable, regret-based alternative to the dominant choice-modelling paradigm based on Random Utility Maximization-theory (RUM-theory). We highlight how RRM-based models provide closed form, logit-type formulations for choice probabilities that allow for capturing semi-compensatory behaviour and choice set-composition effects while being equally parsimonious as their utilitarian counterparts. Using data from a Stated Choice-experiment aimed at identifying valuations of characteristics of nature parks, we compare RRM-based models and RUM-based models in terms of parameter estimates, goodness of fit, elasticities and consequential policy implications.
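The closed-form, logit-type choice probabilities mentioned in this abstract can be illustrated with a short sketch. The regret function used here, a sum of ln(1 + exp(beta_m * (x_jm - x_im))) over competing alternatives j and attributes m, is the logarithmic form commonly associated with RRM models; the abstract does not state it, so treat it as an assumption. The attribute values and taste parameters are hypothetical.

```python
import math
from typing import List, Sequence


def _softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    shift = max(values)
    weights = [math.exp(v - shift) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def rum_probabilities(X: Sequence[Sequence[float]],
                      beta: Sequence[float]) -> List[float]:
    """Linear-in-attributes utility with multinomial logit probabilities."""
    utilities = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return _softmax(utilities)


def rrm_regrets(X: Sequence[Sequence[float]],
                beta: Sequence[float]) -> List[float]:
    """Regret of each alternative, summed over rivals and attributes (assumed form)."""
    regrets = []
    for i, x_i in enumerate(X):
        regret = 0.0
        for j, x_j in enumerate(X):
            if i == j:
                continue
            for b, a_i, a_j in zip(beta, x_i, x_j):
                regret += math.log1p(math.exp(b * (a_j - a_i)))
        regrets.append(regret)
    return regrets


def rrm_probabilities(X: Sequence[Sequence[float]],
                      beta: Sequence[float]) -> List[float]:
    """Logit probabilities over negative regret."""
    return _softmax([-r for r in rrm_regrets(X, beta)])


# Hypothetical choice set: two extreme alternatives and one compromise.
X = [[3.0, 1.0],   # strong on attribute 1, weak on attribute 2
     [1.0, 3.0],   # weak on attribute 1, strong on attribute 2
     [2.0, 2.0]]   # compromise alternative
beta = [1.0, 1.0]  # assumed taste parameters

p_rum = rum_probabilities(X, beta)
p_rrm = rrm_probabilities(X, beta)
```

In this toy choice set all three alternatives have the same linear utility, so the utility-based model assigns them equal probability, while the regret-based model favours the compromise alternative. This is one example of the choice set-composition effects the abstract mentions, with both models using the same number of parameters.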

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper introduces the discrete choice model-paradigm of Random Regret Minimisation (RRM) to the field of health economics. The RRM is a regret-based model that explores a driver of choice different from the traditional utility-based Random Utility Maximisation (RUM). The RRM approach is based on the idea that, when choosing, individuals aim to minimise their regret, regret being defined as what one experiences when a non-chosen alternative in a choice set performs better than a chosen one in relation to one or more attributes. Analysing data from a discrete choice experiment on diet, physical activity and risk of a fatal heart attack in the next ten years administered to a sample of the Northern Ireland population, we find that the combined use of RUM and RRM models offers additional information, providing useful behavioural insights for better informed policy appraisal.