911 results for Scoring rules


Relevance: 100.00%

Abstract:

In this paper, we study individual incentives to report preferences truthfully for the special case in which individuals have dichotomous preferences on the set of alternatives and preferences are aggregated by means of scoring rules. In particular, we show that (a) the Borda Count coincides with Approval Voting on the dichotomous preference domain, (b) the Borda Count is the only strategy-proof scoring rule on the dichotomous preference domain, and (c) if at least three individuals participate in the election, then the dichotomous preference domain is the unique maximal rich domain under which the Borda Count is strategy-proof.
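Claim (a) can be illustrated with a minimal Python sketch. It assumes one common convention for ties, namely that tied alternatives share the average of the Borda points of the positions they jointly occupy; the abstract does not state which convention the paper uses. The function names and the example ballots are hypothetical. Under this assumption, each voter gives every approved alternative exactly m/2 points more than every disapproved one, so the Borda totals are an increasing affine function of the approval counts and the two rankings agree.

```python
from fractions import Fraction

def approval_scores(ballots, alternatives):
    """Count how many voters approve each alternative."""
    return {a: sum(a in approved for approved in ballots) for a in alternatives}

def borda_scores_with_ties(ballots, alternatives):
    """Borda count in which tied alternatives share the average of the
    points (m-1, ..., 0) of the positions they jointly occupy (assumed convention)."""
    m = len(alternatives)
    totals = {a: Fraction(0) for a in alternatives}
    for approved in ballots:
        k = len(approved)
        # Approved alternatives fill the top k positions, the rest the bottom m-k.
        top = Fraction((m - 1) + (m - k), 2) if k else Fraction(0)
        bottom = Fraction(m - k - 1, 2) if k < m else Fraction(0)
        for a in alternatives:
            totals[a] += top if a in approved else bottom
    return totals

def ranking(scores):
    """Alternatives from highest to lowest score, ties broken by name."""
    return sorted(scores, key=lambda a: (-scores[a], a))

# Hypothetical dichotomous ballots: each set lists the approved alternatives.
alternatives = ["a", "b", "c", "d"]
ballots = [{"a", "b"}, {"b"}, {"b", "c", "d"}, {"a"}]

approval = approval_scores(ballots, alternatives)
borda = borda_scores_with_ties(ballots, alternatives)
print(ranking(approval), ranking(borda))
```

Exact fractions are used so that the comparison of totals is not affected by floating-point rounding.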

Relevance: 100.00%

Abstract:

Scoring rules that elicit an entire belief distribution through the elicitation of point beliefs are time-consuming and demand considerable cognitive effort. Moreover, the results are valid only when agents are risk-neutral or when one uses probabilistic rules. We investigate a class of rules in which the agent has to choose an interval and is rewarded (deterministically) on the basis of the chosen interval and the realization of the random variable. We formulate an efficiency criterion for such rules and present a specific interval scoring rule. For single-peaked beliefs, our rule gives information about both the location and the dispersion of the belief distribution. These results hold for all concave utility functions.

Relevance: 100.00%

Abstract:

There are several scoring rules that one can choose from in order to score probabilistic forecasting models or estimate model parameters. Whilst it is generally agreed that proper scoring rules are preferable, there is no clear criterion for preferring one proper scoring rule above another. This manuscript compares and contrasts some commonly used proper scoring rules and provides guidance on scoring rule selection. In particular, it is shown that the logarithmic scoring rule prefers erring with more uncertainty, the spherical scoring rule prefers erring with lower uncertainty, whereas the other scoring rules are indifferent to either option.
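For readers unfamiliar with the rules being compared, the following Python sketch writes out the textbook forms of the logarithmic, quadratic (Brier-type) and spherical scores for a categorical forecast, all positively oriented (higher is better). It is not taken from the manuscript; the function names and example distributions are assumptions made for illustration. The final check shows what "proper" means in practice: under a true distribution q, the expected score is highest when the forecast equals q.

```python
import math

def log_score(p, j):
    """Logarithmic score: log of the probability assigned to the outcome j."""
    return math.log(p[j])

def quadratic_score(p, j):
    """Quadratic (Brier-type) score in positively oriented form."""
    return 2 * p[j] - sum(x * x for x in p)

def spherical_score(p, j):
    """Spherical score: outcome probability divided by the forecast's Euclidean norm."""
    return p[j] / math.sqrt(sum(x * x for x in p))

def expected_score(rule, p, q):
    """Expected score of forecast p when outcomes are drawn from q."""
    return sum(qj * rule(p, j) for j, qj in enumerate(q))

# Hypothetical true distribution and two competing forecasts.
q = [0.7, 0.2, 0.1]
underconfident = [0.5, 0.3, 0.2]
overconfident = [0.9, 0.05, 0.05]

for rule in (log_score, quadratic_score, spherical_score):
    honest = expected_score(rule, q, q)
    print(rule.__name__, honest,
          expected_score(rule, underconfident, q),
          expected_score(rule, overconfident, q))
```

Comparing the last two columns across rules gives a feel for how differently each rule penalises under- and over-confident errors, although this single example does not by itself establish the general result stated in the abstract.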

Relevance: 100.00%

Abstract:

In the global construction context, the best value or most economically advantageous tender is becoming a widespread approach for contractor selection, as an alternative to other traditional awarding criteria such as the lowest price. In these multi-attribute tenders, the owner or auctioneer solicits proposals containing both a price bid and additional technical features. Once the proposals are received, each bidder’s price bid is given an economic score according to a scoring rule, generally called an economic scoring formula (ESF) and a technical score according to pre-specified criteria. Eventually, the contract is awarded to the bidder with the highest weighted overall score (economic + technical). However, economic scoring formula selection by auctioneers is invariably and paradoxically a highly intuitive process in practice, involving few theoretical or empirical considerations, despite having been considered traditionally and mistakenly as objective, due to its mathematical nature. This paper provides a taxonomic classification of a wide variety of ESFs and abnormally low bids criteria (ALBC) gathered in several countries with different tendering approaches. Practical implications concern the optimal design of price scoring rules in construction contract tenders, as well as future analyses of the effects of the ESF and ALBC on competitive bidding behaviour.

Relevance: 100.00%

Abstract:

This paper examines the extent to which engineers can influence the competitive behavior of bidders in Best Value or multi-attribute construction auctions, where both the (dollar) bid and technical non-price criteria are scored according to a scoring rule. From a sample of Spanish construction auctions with a variety of bid scoring rules, it is found that bidders are influenced by the auction rules in significant and predictable ways. The bid score weighting, bid scoring formula and abnormally low bid criterion are variables likely to influence the competitiveness of bidders in terms of both their aggressive/conservative bidding and concentration/dispersion of bids. Revealing the influence of the bid scoring rules and their magnitude on bidders’ competitive behavior opens the door for the engineer to condition bidder competitive behavior in such a way as to provide the balance needed to achieve the owner’s desired strategic outcomes.

Relevance: 60.00%

Abstract:

Data are reported on the background and performance of the K6 screening scale for serious mental illness (SMI) in the World Health Organization (WHO) World Mental Health (WMH) surveys. The K6 is a six-item scale developed to provide a brief valid screen for Diagnostic and Statistical Manual of Mental Disorders 4th edition (DSM-IV) SMI based on the criteria in the US ADAMHA Reorganization Act. Although methodological studies have documented good K6 validity in a number of countries, optimal scoring rules have never been proposed. Such rules are presented here based on analysis of K6 data in nationally or regionally representative WMH surveys in 14 countries (combined N = 41,770 respondents). Twelve-month prevalence of DSM-IV SMI was assessed with the fully-structured WHO Composite International Diagnostic Interview. Nested logistic regression analysis was used to generate estimates of the predicted probability of SMI for each respondent from K6 scores, taking into consideration the possibility of variable concordance as a function of respondent age, gender, education, and country. Concordance, assessed by calculating the area under the receiver operating characteristic curve, was generally substantial (median 0.83; range 0.76-0.89; inter-quartile range 0.81-0.85). Based on this result, optimal scaling rules are presented for use by investigators working with the K6 scale in the countries studied. Copyright (c) 2010 John Wiley & Sons, Ltd.

Relevance: 60.00%

Abstract:

This paper presents and discusses further aspects of the subjectivist interpretation of probability (also known as the 'personalist' view of probabilities) as initiated in earlier forensic and legal literature. It shows that operational devices to elicit subjective probabilities - in particular the so-called scoring rules - provide additional arguments in support of the standpoint according to which categorical claims of forensic individualisation do not follow from a formal analysis under that view of probability theory.

Relevance: 60.00%

Abstract:

What are the best voting systems in terms of utilitarianism? Or in terms of maximin, or maximax? We study these questions for the case of three alternatives and a class of structurally equivalent voting rules. We show that plurality, arguably the most widely used voting system, performs very poorly in terms of remarkable ideals of justice, such as utilitarianism or maximin, and yet is optimal in terms of maximax. Utilitarianism is best approached by a voting system converging to the Borda count, while the best way to achieve maximin is by means of a voting system converging to negative voting. We study the robustness of our results across different social cultures, measures of performance, and population sizes.
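The three voting systems named here can all be written as positional scoring rules for three alternatives. The Python sketch below assumes the usual normalised parameterisation (1, s, 0), where s is the weight given to a voter's second choice: s = 0 gives plurality, s = 1/2 the Borda count, and s = 1 negative voting. The abstract does not say that the paper uses this parameterisation, and the voter profile is invented for illustration.

```python
def positional_scores(profile, s):
    """Scores under the vector (1, s, 0); profile maps a ranking tuple to a voter count."""
    weights = (1.0, s, 0.0)
    scores = {}
    for ranking, count in profile.items():
        for position, alternative in enumerate(ranking):
            scores[alternative] = scores.get(alternative, 0.0) + count * weights[position]
    return scores

def winner(profile, s):
    """Alternative with the highest score (ties broken alphabetically)."""
    scores = positional_scores(profile, s)
    return min(scores, key=lambda a: (-scores[a], a))

# Hypothetical profile: 4 voters rank a > b > c, 3 rank b > c > a, 2 rank c > b > a.
profile = {("a", "b", "c"): 4, ("b", "c", "a"): 3, ("c", "b", "a"): 2}

for name, s in [("plurality", 0.0), ("Borda", 0.5), ("negative voting", 1.0)]:
    print(name, positional_scores(profile, s), winner(profile, s))
```

In this profile plurality elects a, which most voters rank last, while the other two rules elect b, showing how the choice of s alone can change the outcome.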

Relevance: 60.00%

Abstract:

Scoring rules are an important tool for evaluating the performance of probabilistic forecasting schemes. A scoring rule is called strictly proper if its expectation is optimal if and only if the forecast probability represents the true distribution of the target. In the binary case, strictly proper scoring rules allow for a decomposition into terms related to the resolution and the reliability of a forecast. This fact is particularly well known for the Brier Score. In this article, this result is extended to forecasts for finite-valued targets. Both resolution and reliability are shown to have a positive effect on the score. It is demonstrated that resolution and reliability are directly related to forecast attributes that are desirable on grounds independent of the notion of scores. This finding can be considered an epistemological justification of measuring forecast quality by proper scoring rules. A link is provided to the original work of DeGroot and Fienberg, extending their concepts of sufficiency and refinement. The relation to the conjectured sharpness principle of Gneiting, et al., is elucidated.
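The binary-case decomposition mentioned for the Brier score can be checked numerically. The Python sketch below uses the classical form, Brier score = reliability − resolution + uncertainty, with forecasts grouped by their distinct values; here the Brier score is negatively oriented (lower is better), so a small reliability term and a large resolution term both improve it. The function names and the toy data are assumptions, and the article's extension to finite-valued targets is not reproduced.

```python
from collections import defaultdict

def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probability and binary outcome."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

def brier_decomposition(forecasts, outcomes):
    """Return (reliability, resolution, uncertainty), grouping by distinct forecast value."""
    n = len(forecasts)
    groups = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        groups[f].append(o)
    base_rate = sum(outcomes) / n
    reliability = 0.0
    resolution = 0.0
    for f, obs in groups.items():
        freq = sum(obs) / len(obs)  # observed frequency when f was issued
        reliability += len(obs) * (f - freq) ** 2
        resolution += len(obs) * (freq - base_rate) ** 2
    uncertainty = base_rate * (1 - base_rate)
    return reliability / n, resolution / n, uncertainty

# Hypothetical archive of forecast-observation pairs.
forecasts = [0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.9, 0.9, 0.9, 0.9]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1, 1, 0]

bs = brier_score(forecasts, outcomes)
rel, res, unc = brier_decomposition(forecasts, outcomes)
print(bs, rel - res + unc)
```

The identity is exact only when forecasts are grouped by their exact values; binning continuous forecasts into intervals introduces additional within-bin terms.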

Relevance: 60.00%

Abstract:

Proper scoring rules provide a useful means to evaluate probabilistic forecasts. Independent from scoring rules, it has been argued that reliability and resolution are desirable forecast attributes. The mathematical expectation value of the score allows for a decomposition into reliability and resolution related terms, demonstrating a relationship between scoring rules and reliability/resolution. A similar decomposition holds for the empirical (i.e. sample average) score over an archive of forecast–observation pairs. This empirical decomposition, though, provides an overly optimistic estimate of the potential score (i.e. the optimum score which could be obtained through recalibration), showing that a forecast assessment based solely on the empirical resolution and reliability terms will be misleading. The differences between the theoretical and empirical decomposition are investigated, and specific recommendations are given on how to obtain better estimators of reliability and resolution in the case of the Brier and Ignorance scoring rules.

Relevance: 60.00%

Abstract:

The continuous ranked probability score (CRPS) is a frequently used scoring rule. In contrast with many other scoring rules, the CRPS evaluates cumulative distribution functions. An ensemble of forecasts can easily be converted into a piecewise constant cumulative distribution function with steps at the ensemble members. This renders the CRPS a convenient scoring rule for the evaluation of ‘raw’ ensembles, obviating the need for sophisticated ensemble model output statistics or dressing methods prior to evaluation. In this article, a relation between the CRPS score and the quantile score is established. The evaluation of ‘raw’ ensembles using the CRPS is discussed in this light. It is shown that latent in this evaluation is an interpretation of the ensemble as quantiles but with non-uniform levels. This needs to be taken into account if the ensemble is evaluated further, for example with rank histograms.
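The link between the CRPS of a raw ensemble and the quantile score can be made concrete with a short Python sketch. It evaluates the CRPS of the piecewise constant CDF in two ways: through the well-known pairwise-distance form, and as twice the average quantile (pinball) score of the sorted members taken at levels (i − 0.5)/n. That second form is a standard identity for the empirical CDF; whether it matches the article's exact formulation is an assumption, and the function names and example ensemble are hypothetical.

```python
def crps_ensemble(members, y):
    """CRPS of the empirical CDF with equal steps at the ensemble members."""
    n = len(members)
    mean_abs_error = sum(abs(x - y) for x in members) / n
    mean_pair_distance = sum(abs(a - b) for a in members for b in members) / (n * n)
    return mean_abs_error - 0.5 * mean_pair_distance

def pinball_loss(level, quantile, y):
    """Quantile score of a quantile forecast at the given probability level."""
    indicator = 1.0 if y < quantile else 0.0
    return (y - quantile) * (level - indicator)

def crps_as_quantile_scores(members, y):
    """Same CRPS, written as twice the mean pinball loss of the sorted members."""
    n = len(members)
    ordered = sorted(members)
    total = sum(pinball_loss((i - 0.5) / n, x, y)
                for i, x in enumerate(ordered, start=1))
    return 2.0 * total / n

# Hypothetical three-member ensemble and verifying observation.
ensemble = [3.0, 1.0, 2.0]
observation = 2.0
print(crps_ensemble(ensemble, observation),
      crps_as_quantile_scores(ensemble, observation))
```

The pairwise form costs O(n²) operations, whereas the sorted quantile form costs O(n log n), which matters for large ensembles.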

Relevance: 60.00%

Abstract:

We consider tests of forecast encompassing for probability forecasts, for both quadratic and logarithmic scoring rules. We propose test statistics for the null of forecast encompassing, present the limiting distributions of the test statistics, and investigate the impact of estimating the forecasting models' parameters on these distributions. The small-sample performance is investigated, in terms of small numbers of forecasts and model estimation sample sizes. We show the usefulness of the tests for the evaluation of recession probability forecasts from logit models with different leading indicators as explanatory variables, and for evaluating survey-based probability forecasts.

Relevance: 60.00%

Abstract:

We have considered a Bayesian approach for the nonlinear regression model by replacing the normal distribution on the error term by some skewed distributions, which account for both skewness and heavy tails or skewness alone. The type of data considered in this paper concerns repeated measurements taken in time on a set of individuals. Such multiple observations on the same individual generally produce serially correlated outcomes. Thus, additionally, our model allows for a correlation between observations made on the same individual. We have illustrated the procedure using a data set to study the growth curves of a clinical measurement of a group of pregnant women from an obstetrics clinic in Santiago, Chile. Parameter estimation and prediction were carried out using appropriate posterior simulation schemes based on Markov Chain Monte Carlo methods. Besides the deviance information criterion (DIC) and the conditional predictive ordinate (CPO), we suggest the use of proper scoring rules based on the posterior predictive distribution for comparing models. For our data set, all these criteria chose the skew-t model as the best model for the errors. The DIC and CPO criteria are also validated, for the model proposed here, through a simulation study. As a conclusion of this study, the DIC criterion is not trustworthy for this kind of complex model.

Relevance: 60.00%

Abstract:

Spatial prediction of hourly rainfall via radar calibration is addressed. The change of support problem (COSP), arising when the spatial supports of different data sources do not coincide, is faced in a non-Gaussian setting; in fact, hourly rainfall in the Emilia-Romagna region, in Italy, is characterized by an abundance of zero values and right-skewness of the distribution of positive amounts. Rain gauge direct measurements at sparsely distributed locations and hourly cumulated radar grids are provided by ARPA-SIMC Emilia-Romagna. We propose a three-stage Bayesian hierarchical model for radar calibration, exploiting rain gauges as the reference measure. Rain probability and amounts are modeled via linear relationships with radar in the log scale; spatially correlated Gaussian effects capture the residual information. We employ a probit link for rainfall probability and a Gamma distribution for positive rainfall amounts; the two steps are joined via a two-part semicontinuous model. Three model specifications differently addressing COSP are presented; in particular, a stochastic weighting of all radar pixels, driven by a latent Gaussian process defined on the grid, is employed. Estimation is performed via MCMC procedures implemented in C, linked to R software. Communication and evaluation of probabilistic, point and interval predictions are investigated. A non-randomized PIT histogram is proposed for correctly assessing calibration and coverage of two-part semicontinuous models. Predictions obtained with the different model specifications are evaluated via graphical tools (Reliability Plot, Sharpness Histogram, PIT Histogram, Brier Score Plot and Quantile Decomposition Plot), proper scoring rules (Brier Score, Continuous Ranked Probability Score) and consistent scoring functions (Root Mean Square Error and Mean Absolute Error addressing the predictive mean and median, respectively). Calibration is reached and the inclusion of neighbouring information slightly improves predictions. All specifications outperform a benchmark model with uncorrelated effects, confirming the relevance of spatial correlation for modeling rainfall probability and accumulation.

Relevance: 60.00%

Abstract:

When making predictions with complex simulators it can be important to quantify the various sources of uncertainty. Errors in the structural specification of the simulator, for example due to missing processes or incorrect mathematical specification, can be a major source of uncertainty, but are often ignored. We introduce a methodology for inferring the discrepancy between the simulator and the system in discrete-time dynamical simulators. We assume a structural form for the discrepancy function, and show how to infer the maximum-likelihood parameter estimates using a particle filter embedded within a Monte Carlo expectation maximization (MCEM) algorithm. We illustrate the method on a conceptual rainfall-runoff simulator (logSPM) used to model the Abercrombie catchment in Australia. We assess the simulator and discrepancy model on the basis of their predictive performance using proper scoring rules. This article has supplementary material online. © 2011 International Biometric Society.