174 results for Minimax-regret


Relevance: 100.00%

Abstract:

Although many studies find that voting in Africa approximates an ethnic census, in that voting is primarily along ethnic lines, hardly any have sought to explain ethnic voting within a rational choice framework. Using voter-opinion data from a survey conducted two weeks before the December 2007 Kenyan elections, we find that the expected benefits associated with a win by each presidential candidate varied significantly across voters from different ethnic groups. We hypothesize that the decision to participate in the elections was influenced by these expected benefits, as posited by the minimax-regret voting model. We test the model's predictions using voter turnout data from the December 2007 elections and find that turnout across ethnic groups varied systematically with expected benefits. The results suggest that individuals participated in the elections primarily to avoid the maximum regret should a candidate from another ethnic group win. The results therefore lend credence to the minimax regret model proposed by Ferejohn and Fiorina (1974) and refute the Downsian expected-utility model.
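The turnout logic can be made concrete with a stylized regret table. The following is our own two-state simplification for illustration (the full Ferejohn-Fiorina model distinguishes more election outcomes), with invented benefit B and voting cost c:

```python
# Stylized sketch of minimax-regret turnout (our own simplification, not
# the paper's estimation): two actions, two states of nature.
B, c = 100.0, 10.0  # hypothetical benefit if the preferred candidate wins, cost of voting

# payoff[action][state]; states: your vote is decisive / not decisive
payoff = {
    "vote":    {"decisive": B - c, "not_decisive": -c},
    "abstain": {"decisive": 0.0,   "not_decisive": 0.0},
}

states = ["decisive", "not_decisive"]
best = {s: max(payoff[a][s] for a in payoff) for s in states}
max_regret = {a: max(best[s] - payoff[a][s] for s in states) for a in payoff}

# Minimax regret compares the worst regret of abstaining (forgoing B - c)
# with the worst regret of voting (the wasted cost c), ignoring how
# unlikely the decisive state is -- unlike an expected-utility voter.
choice = min(max_regret, key=max_regret.get)
print(choice)  # vote, since B - c = 90 exceeds c = 10
```

Under expected utility the tiny probability of being decisive multiplies B, so voting rarely pays; minimax regret drops the probabilities entirely, which is why large expected benefits can drive turnout.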

Relevance: 60.00%

Abstract:

We aim to design strategies for sequential decision making that adjust to the difficulty of the learning problem. We study this question both in the setting of prediction with expert advice and for more general combinatorial decision tasks. We are not satisfied with just guaranteeing minimax regret rates: we want our algorithms to perform significantly better on easy data. Two popular ways to formalize such adaptivity are second-order regret bounds and quantile bounds. The underlying notions of 'easy data', which may be paraphrased as "the learning problem has small variance" and "multiple decisions are useful", are synergetic. But even though there are sophisticated algorithms that exploit one of the two, no existing algorithm is able to adapt to both. In this paper we outline a new method for obtaining such adaptive algorithms, based on a potential function that aggregates a range of learning rates (which are essential tuning parameters). By choosing the right prior we construct efficient algorithms, and we show that they reap both benefits by proving the first bounds that are both second-order and incorporate quantiles.

Relevance: 60.00%

Abstract:

Security is a critical concern around the world. Since resources for security are always limited, much interest has arisen in using game theory to handle security resource allocation problems. However, most existing work does not adequately address how a defender chooses an optimal strategy in a game whose strategy-profile payoffs are absent, inaccurate, uncertain, or even ambiguous. To address this issue, we propose a general framework for security games under ambiguity based on Dempster-Shafer theory and the ambiguity-aversion principle of minimax regret. We then reveal some properties of this framework and present two methods to reduce the influence of complete ignorance. Our investigation shows that this new framework is better at handling security resource allocation problems under ambiguity.
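The minimax-regret principle under ambiguous payoffs can be sketched with interval-valued payoffs, a toy stand-in for the paper's Dempster-Shafer machinery; the targets and all numbers below are invented:

```python
# Toy sketch (our own numbers, not the paper's framework): the defender's
# payoff for covering each target is ambiguous, known only up to an interval.
intervals = {"cover_A": (4.0, 6.0), "cover_B": (2.0, 9.0), "cover_C": (3.0, 5.0)}

def max_regret(action):
    lo_a, _ = intervals[action]
    # Worst case for `action`: it realizes its lowest payoff while some
    # alternative realizes its highest.
    return max(hi - lo_a for b, (lo, hi) in intervals.items() if b != action)

# Ambiguity-averse choice: minimize the worst-case regret.
choice = min(intervals, key=max_regret)
print(choice, max_regret(choice))  # cover_B 4.0
```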

Relevance: 60.00%

Abstract:

Game-theoretic security resource allocation problems have generated significant interest in the area of designing and developing security systems. These approaches traditionally utilize the Stackelberg game model for security resource scheduling in order to improve the protection of critical assets. The basic assumption in Stackelberg games is that a defender will act first, then an attacker will choose their best response after observing the defender’s strategy commitment (e.g., protecting a specific asset). Thus, it requires an attacker’s full or partial observation of a defender’s strategy. This assumption is unrealistic in real-time threat recognition and prevention. In this paper, we propose a new solution concept (i.e., a method to predict how a game will be played) for deriving the defender’s optimal strategy based on the principle of acceptable costs of minimax regret. Moreover, we demonstrate the advantages of this solution concept by analyzing its properties.

Relevance: 60.00%

Abstract:

The games-against-nature approach to the analysis of uncertainty in decision-making relies on the assumption that the behaviour of a decision-maker can be explained by a concept such as maximin, minimax regret, or a similarly defined criterion. In reality, however, these criteria represent a spectrum, and the actual behaviour of a decision-maker is most likely to embody a mixture of such idealisations. This paper proposes that, in a game-theoretic approach to decision-making under uncertainty, a more realistic representation of a decision-maker's behaviour can be achieved by synthesising games against nature with goal programming in a single framework. The proposed formulation is illustrated using a well-known example from the literature on mathematical programming models for agricultural decision-making. (c) 2005 Elsevier Inc. All rights reserved.
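The criteria at issue can be compared on a small payoff matrix; a minimal sketch with invented numbers (rows are actions, columns are states of nature):

```python
# Invented game against nature: payoffs[action][state].
payoffs = [
    [30, 10, 0],    # action 0
    [20, 20, 5],    # action 1
    [50, 5, -10],   # action 2
]

# Maximin: pick the action whose worst-case payoff is largest.
maximin = max(range(len(payoffs)), key=lambda a: min(payoffs[a]))

# Minimax regret: regret = best payoff attainable in a state minus the
# payoff received; pick the action whose maximum regret is smallest.
best_per_state = [max(col) for col in zip(*payoffs)]
regrets = [[b - p for b, p in zip(best_per_state, row)] for row in payoffs]
minimax_regret = min(range(len(regrets)), key=lambda a: max(regrets[a]))

print(maximin, minimax_regret)  # 1 2 -- the two criteria pick different actions
```

That the two criteria already disagree on a 3x3 matrix is the paper's point of departure: observed behaviour typically mixes such idealisations, which the goal-programming synthesis is designed to capture.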

Relevance: 40.00%

Abstract:

We study the regret of optimal strategies for online convex optimization games. Using von Neumann's minimax theorem, we show that the optimal regret in this adversarial setting is closely related to the behavior of the empirical minimization algorithm in a stochastic process setting: it equals the maximum, over joint distributions of the adversary's action sequence, of the difference between a sum of minimal expected losses and the minimal empirical loss. We show that the optimal regret has a natural geometric interpretation, since it can be viewed as the gap in Jensen's inequality for a concave functional (the minimizer, over the player's actions, of the expected loss) defined on a set of probability distributions. We use this expression to obtain upper and lower bounds on the regret of an optimal strategy for a variety of online learning problems. Our method provides upper bounds without the need to construct a learning algorithm; the lower bounds provide explicit optimal strategies for the adversary.
Peter L. Bartlett, Alexander Rakhlin

Relevance: 30.00%

Abstract:

A number of learning problems can be cast as an Online Convex Game: on each round, a learner makes a prediction x from a convex set, the environment plays a loss function f, and the learner’s long-term goal is to minimize regret. Algorithms have been proposed by Zinkevich, when f is assumed to be convex, and Hazan et al., when f is assumed to be strongly convex, that have provably low regret. We consider these two settings and analyze such games from a minimax perspective, proving minimax strategies and lower bounds in each case. These results prove that the existing algorithms are essentially optimal.
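The setting is easy to simulate; a minimal sketch (the losses, domain, and constants are our own) of Zinkevich-style online gradient descent, whose regret grows like √T on convex losses:

```python
import numpy as np

# Online convex game sketch (our own instance): losses f_t(x) = |x - z_t|
# on [-1, 1]; the learner runs gradient descent with step size ~ 1/sqrt(t).
rng = np.random.default_rng(0)
T = 10_000
z = rng.uniform(-1.0, 1.0, size=T)      # the environment's plays
x, total_loss = 0.0, 0.0
for t in range(1, T + 1):
    zt = z[t - 1]
    total_loss += abs(x - zt)
    grad = np.sign(x - zt)              # a subgradient of |x - z_t| at x
    x = float(np.clip(x - grad / np.sqrt(t), -1.0, 1.0))  # step, then project

# Loss of the best fixed prediction in hindsight (a coarse grid suffices here).
best_fixed = min(float(np.abs(u - z).sum()) for u in np.linspace(-1.0, 1.0, 201))
regret = total_loss - best_fixed        # grows like sqrt(T), not T
```

For strongly convex losses the step sizes of Hazan et al. improve the rate to O(log T); the minimax analysis in the abstract shows both guarantees are essentially unimprovable.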

Relevance: 20.00%

Abstract:

We demonstrate a modification of the algorithm of Dani et al. for the online linear optimization problem in the bandit setting, which allows us to achieve an O(√(T ln T)) regret bound in high probability against an adaptive adversary, as opposed to Dani et al.'s in-expectation result against an oblivious adversary. We obtain the same dependence on the dimension as that exhibited by Dani et al. The results of this paper rest firmly on those of Dani et al. and the remarkable technique of Auer et al. for obtaining high-probability bounds via optimistic estimates. This paper answers an open question: it eliminates the gap between the high-probability bounds obtained in the full-information and bandit settings.

Relevance: 20.00%

Abstract:

We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). OLP uses its experience so far to estimate the MDP. It chooses actions by optimistically maximizing estimated future rewards over a set of next-state transition probabilities that are close to the estimates, a computation that corresponds to solving linear programs. We show that the total expected reward obtained by OLP up to time T is within C(P) log T of the reward obtained by the optimal policy, where C(P) is an explicit, MDP-dependent constant. OLP is closely related to an algorithm proposed by Burnetas and Katehakis, with four key differences: OLP is simpler; it does not require knowledge of the supports of the transition probabilities; the proof of its regret bound is simpler; but its regret bound is a constant factor larger than the regret of their algorithm. OLP is also similar in flavor to an algorithm recently proposed by Auer and Ortner, but OLP is simpler and its regret bound has a better dependence on the size of the MDP.

Relevance: 20.00%

Abstract:

We present a modification of the algorithm of Dani et al. [8] for the online linear optimization problem in the bandit setting, which with high probability has regret at most O*(√T) against an adaptive adversary. This improves on the previous algorithm [8], whose regret is bounded in expectation against an oblivious adversary. We obtain the same dependence on the dimension (n^{3/2}) as that exhibited by Dani et al. The results of this paper rest firmly on those of [8] and the remarkable technique of Auer et al. [2] for obtaining high-probability bounds via optimistic estimates. This paper answers an open question: it eliminates the gap between the high-probability bounds obtained in the full-information and bandit settings.

Relevance: 20.00%

Abstract:

Purpose: Self-gifting is a performative process in which consumers purchase products for themselves. The literature to date remains silent on the link between self-gifting behavior and the extent of post-purchase regret it produces. The purpose of this paper is to identify antecedents of self-gifting and connect them to self-gifting behavior and its effect on post-purchase regret.

Design/methodology/approach: This study claims that the two antecedents of hedonistic shopping and indulgence drive self-gifting behaviors and the attendant regret. A total of 307 shoppers responded to a series of statements concerning the relationships between antecedents of self-gifting behavior and their effect on post-purchase regret. Self-gifting is a multi-dimensional construct consisting of therapeutic, celebratory, reward and hedonistic imports. Confirmatory factor analysis and AMOS path modeling enabled examination of relationships between the consumer traits of hedonistic shopping and indulgence and the four self-gifting concepts.

Findings: Hedonic and indulgent shoppers engage in self-gifting for different reasons. A strong, positive relationship was identified between hedonic shoppers and the reward, hedonic, therapeutic and celebratory self-gift motivations. Hedonic shoppers aligned with indulgent shoppers, who also engaged the four self-gifting concepts. Regret over the purchase of self-gifts was evident only in the therapeutic and celebratory self-gift motivations.

Research limitations/implications: A major limitation was the age-range specification of 18 to 45 years, which meant the omission of older generations of regular and experienced shoppers. This study emphasizes the importance of variations in self-gift behaviors and of post-purchase consumer regret.

Originality/value: This research is the first examination of a hedonic attitude to shopping and indulgent antecedents to self-gift purchasing, the concepts of self-gift motivations and their effect on post-purchase regret.

Relevance: 20.00%

Abstract:

Performance guarantees for online learning algorithms typically take the form of regret bounds, which express that the cumulative loss overhead compared to the best expert in hindsight is small. In the common case of large but structured expert sets we typically wish to keep the regret especially small compared to simple experts, at the cost of modest additional overhead compared to more complex others. We study which such regret trade-offs can be achieved, and how. We analyse regret w.r.t. each individual expert as a multi-objective criterion in the simple but fundamental case of absolute loss. We characterise the achievable and Pareto optimal trade-offs, and the corresponding optimal strategies for each sample size both exactly for each finite horizon and asymptotically.

Relevance: 20.00%

Abstract:

The quick detection of an abrupt, unknown change in the conditional distribution of a dependent stochastic process has numerous applications. In this paper, we pose a minimax robust quickest change detection problem for cases where there is uncertainty about the post-change conditional distribution. Our minimax robust formulation is based on the popular Lorden criterion of optimal quickest change detection. Under a condition on the set of possible post-change distributions, we show that the widely known cumulative sum (CUSUM) rule is asymptotically minimax robust under our Lorden minimax robust formulation as the false alarm constraint becomes more strict. We also establish general asymptotic bounds on the detection delay of misspecified CUSUM rules (i.e. CUSUM rules designed with post-change distributions that differ from those of the observed sequence). We exploit these bounds to compare the delay performance of asymptotically minimax robust, asymptotically optimal, and other misspecified CUSUM rules. In simulation examples, we illustrate that asymptotically minimax robust CUSUM rules can provide better detection delay performance at greatly reduced computational effort compared to competing generalised likelihood ratio procedures.
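The CUSUM rule the abstract refers to is simple to state; a minimal sketch for an i.i.d. Gaussian mean shift (all parameters invented), where a misspecified rule would simply plug in a different design mean mu1:

```python
import random

# CUSUM sketch (our own parameters): detect a shift in the mean of i.i.d.
# Gaussian observations from mu0 to a designed post-change mean mu1.
random.seed(1)
mu0, mu1, sigma = 0.0, 1.0, 1.0
threshold = 8.0               # a stricter false-alarm constraint -> larger threshold

def llr(x):
    # log-likelihood ratio of N(mu1, sigma^2) against N(mu0, sigma^2)
    return ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2)

change_point = 200
obs = [random.gauss(mu0, sigma) for _ in range(change_point)] \
    + [random.gauss(mu1, sigma) for _ in range(800)]

S, alarm = 0.0, None
for n, x in enumerate(obs, start=1):
    S = max(0.0, S + llr(x))  # the CUSUM recursion
    if S >= threshold:
        alarm = n             # declare a change at time n
        break
```

The paper's contribution concerns dependent processes and ambiguity about the post-change law; the sketch above is only the classical i.i.d. baseline on which the CUSUM recursion is defined.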