909 resultados para scoring rules


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the global construction context, the Best Value or Most Economically Advantageous Tender is becoming a widespread approach for contractor selection, as an alternative to other traditional awarding criteria such as the Lowest Price. In these multi-attribute tenders, the owner or auctioneer solicits proposals containing both a price bid and additional technical features. Once the proposals are received, each bidder's price bid is given an economic score according to a scoring rule, generally called an Economic Scoring Formula (ESF) and a technical score according to pre-specified criteria. Eventually, the contract is awarded to the bidder with the highest weighted overall score (economic + technical). However, Economic Scoring Formula selection by auctioneers is invariably and paradoxically a highly intuitive process in practice, involving few theoretical or empirical considerations, despite having being considered traditionally and mistakenly as objective, due to its mathematical nature. This paper provides a taxonomic classification of a wide variety of ESF and Abnormally Low Bid Criteria (ALBC) gathered in several countries with different tendering approaches. Practical implications concern the optimal design of price scoring rules in construction contract tenders, as well as future analyses of the effects of ESF and ALBC on competitive bidding behaviour.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A disadvantage of multiple-choice tests is that students have incentives to guess. To discourage guessing, it is common to use scoring rules that either penalize wrong answers or reward omissions. These scoring rules are considered equivalent in psychometrics, although experimental evidence has not always been consistent with this claim. We model students' decisions and show, first, that equivalence holds only under risk neutrality and, second, that the two rules can be modified so that they become equivalent even under risk aversion. This paper presents the results of a field experiment in which we analyze the decisions of subjects taking multiple-choice exams. The evidence suggests that differences between scoring rules are due to risk aversion as theory predicts. We also find that the number of omitted items depends on the scoring rule, knowledge, gender and other covariates.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

There are several scoring rules that one can choose from in order to score probabilistic forecasting models or estimate model parameters. Whilst it is generally agreed that proper scoring rules are preferable, there is no clear criterion for preferring one proper scoring rule above another. This manuscript compares and contrasts some commonly used proper scoring rules and provides guidance on scoring rule selection. In particular, it is shown that the logarithmic scoring rule prefers erring with more uncertainty, the spherical scoring rule prefers erring with lower uncertainty, whereas the other scoring rules are indifferent to either option.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the global construction context, the best value or most economically advantageous tender is becoming a widespread approach for contractor selection, as an alternative to other traditional awarding criteria such as the lowest price. In these multi-attribute tenders, the owner or auctioneer solicits proposals containing both a price bid and additional technical features. Once the proposals are received, each bidder’s price bid is given an economic score according to a scoring rule, generally called an economic scoring formula (ESF) and a technical score according to pre-specified criteria. Eventually, the contract is awarded to the bidder with the highest weighted overall score (economic + technical). However, economic scoring formula selection by auctioneers is invariably and paradoxically a highly intuitive process in practice, involving few theoretical or empirical considerations, despite having been considered traditionally and mistakenly as objective, due to its mathematical nature. This paper provides a taxonomic classification of a wide variety of ESFs and abnormally low bids criteria (ALBC) gathered in several countries with different tendering approaches. Practical implications concern the optimal design of price scoring rules in construction contract tenders, as well as future analyses of the effects of the ESF and ALBC on competitive bidding behaviour.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper examines the extent to which engineers can influence the competitive behavior of bidders in Best Value or multi-attribute construction auctions, where both the (dollar) bid and technical non-price criteria are scored according to a scoring rule. From a sample of Spanish construction auctions with a variety of bid scoring rules, it is found that bidders are influenced by the auction rules in significant and predictable ways. The bid score weighting, bid scoring formula and abnormally low bid criterion are variables likely to influence the competitiveness of bidders in terms of both their aggressive/conservative bidding and concentration/dispersion of bids. Revealing the influence of the bid scoring rules and their magnitude on bidders’ competitive behavior opens the door for the engineer to condition bidder competitive behavior in such a way as to provide the balance needed to achieve the owner’s desired strategic outcomes.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We compare three alternative methods for eliciting retrospective confidence in the context of a simple perceptual task: the Simple Confidence Rating (a direct report on a numerical scale), the Quadratic Scoring Rule (a post-wagering procedure), and the Matching Probability (MP; a generalization of the no-loss gambling method). We systematically compare the results obtained with these three rules to the theoretical confidence levels that can be inferred from performance in the perceptual task using Signal Detection Theory (SDT). We find that the MP provides better results in that respect. We conclude that MP is particularly well suited for studies of confidence that use SDT as a theoretical framework.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Query incentive networks capture the role of incentives in extracting information from decentralized information networks such as a social network. Several game theoretic tilt:Kids of query incentive networks have been proposed in the literature to study and characterize the dependence, of the monetary reward required to extract the answer for a query, on various factors such as the structure of the network, the level of difficulty of the query, and the required success probability.None of the existing models, however, captures the practical andimportant factor of quality of answers. In this paper, we develop a complete mechanism design based framework to incorporate the quality of answers, in the monetization of query incentive networks. First, we extend the model of Kleinberg and Raghavan [2] to allow the nodes to modulate the incentive on the basis of the quality of the answer they receive. For this qualify conscious model. we show are existence of a unique Nash equilibrium and study the impact of quality of answers on the growth rate of the initial reward, with respect to the branching factor of the network. Next, we present two mechanisms; the direct comparison mechanism and the peer prediction mechanism, for truthful elicitation of quality from the agents. These mechanisms are based on scoring rules and cover different; scenarios which may arise in query incentive networks. We show that the proposed quality elicitation mechanisms are incentive compatible and ex-ante budget balanced. We also derive conditions under which ex-post budget balance can beachieved by these mechanisms.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In the POSSIBLE WINNER problem in computational social choice theory, we are given a set of partial preferences and the question is whether a distinguished candidate could be made winner by extending the partial preferences to linear preferences. Previous work has provided, for many common voting rules, fixed parameter tractable algorithms for the POSSIBLE WINNER problem, with number of candidates as the parameter. However, the corresponding kernelization question is still open and in fact, has been mentioned as a key research challenge 10]. In this paper, we settle this open question for many common voting rules. We show that the POSSIBLE WINNER problem for maximin, Copeland, Bucklin, ranked pairs, and a class of scoring rules that includes the Borda voting rule does not admit a polynomial kernel with the number of candidates as the parameter. We show however that the COALITIONAL MANIPULATION problem which is an important special case of the POSSIBLE WINNER problem does admit a polynomial kernel for maximin, Copeland, ranked pairs, and a class of scoring rules that includes the Borda voting rule, when the number of manipulators is polynomial in the number of candidates. A significant conclusion of our work is that the POSSIBLE WINNER problem is harder than the COALITIONAL MANIPULATION problem since the COALITIONAL MANIPULATION problem admits a polynomial kernel whereas the POSSIBLE WINNER problem does not admit a polynomial kernel. (C) 2015 Elsevier B.V. All rights reserved.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Scoring rules are an important tool for evaluating the performance of probabilistic forecasting schemes. A scoring rule is called strictly proper if its expectation is optimal if and only if the forecast probability represents the true distribution of the target. In the binary case, strictly proper scoring rules allow for a decomposition into terms related to the resolution and the reliability of a forecast. This fact is particularly well known for the Brier Score. In this article, this result is extended to forecasts for finite-valued targets. Both resolution and reliability are shown to have a positive effect on the score. It is demonstrated that resolution and reliability are directly related to forecast attributes that are desirable on grounds independent of the notion of scores. This finding can be considered an epistemological justification of measuring forecast quality by proper scoring rules. A link is provided to the original work of DeGroot and Fienberg, extending their concepts of sufficiency and refinement. The relation to the conjectured sharpness principle of Gneiting, et al., is elucidated.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

References (20)Cited By (1)Export CitationAboutAbstract Proper scoring rules provide a useful means to evaluate probabilistic forecasts. Independent from scoring rules, it has been argued that reliability and resolution are desirable forecast attributes. The mathematical expectation value of the score allows for a decomposition into reliability and resolution related terms, demonstrating a relationship between scoring rules and reliability/resolution. A similar decomposition holds for the empirical (i.e. sample average) score over an archive of forecast–observation pairs. This empirical decomposition though provides a too optimistic estimate of the potential score (i.e. the optimum score which could be obtained through recalibration), showing that a forecast assessment based solely on the empirical resolution and reliability terms will be misleading. The differences between the theoretical and empirical decomposition are investigated, and specific recommendations are given how to obtain better estimators of reliability and resolution in the case of the Brier and Ignorance scoring rule.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The continuous ranked probability score (CRPS) is a frequently used scoring rule. In contrast with many other scoring rules, the CRPS evaluates cumulative distribution functions. An ensemble of forecasts can easily be converted into a piecewise constant cumulative distribution function with steps at the ensemble members. This renders the CRPS a convenient scoring rule for the evaluation of ‘raw’ ensembles, obviating the need for sophisticated ensemble model output statistics or dressing methods prior to evaluation. In this article, a relation between the CRPS score and the quantile score is established. The evaluation of ‘raw’ ensembles using the CRPS is discussed in this light. It is shown that latent in this evaluation is an interpretation of the ensemble as quantiles but with non-uniform levels. This needs to be taken into account if the ensemble is evaluated further, for example with rank histograms.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We consider tests of forecast encompassing for probability forecasts, for both quadratic and logarithmic scoring rules. We propose test statistics for the null of forecast encompassing, present the limiting distributions of the test statistics, and investigate the impact of estimating the forecasting models' parameters on these distributions. The small-sample performance is investigated, in terms of small numbers of forecasts and model estimation sample sizes. We show the usefulness of the tests for the evaluation of recession probability forecasts from logit models with different leading indicators as explanatory variables, and for evaluating survey-based probability forecasts.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We have considered a Bayesian approach for the nonlinear regression model by replacing the normal distribution on the error term by some skewed distributions, which account for both skewness and heavy tails or skewness alone. The type of data considered in this paper concerns repeated measurements taken in time on a set of individuals. Such multiple observations on the same individual generally produce serially correlated outcomes. Thus, additionally, our model does allow for a correlation between observations made from the same individual. We have illustrated the procedure using a data set to study the growth curves of a clinic measurement of a group of pregnant women from an obstetrics clinic in Santiago, Chile. Parameter estimation and prediction were carried out using appropriate posterior simulation schemes based in Markov Chain Monte Carlo methods. Besides the deviance information criterion (DIC) and the conditional predictive ordinate (CPO), we suggest the use of proper scoring rules based on the posterior predictive distribution for comparing models. For our data set, all these criteria chose the skew-t model as the best model for the errors. These DIC and CPO criteria are also validated, for the model proposed here, through a simulation study. As a conclusion of this study, the DIC criterion is not trustful for this kind of complex model.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Spatial prediction of hourly rainfall via radar calibration is addressed. The change of support problem (COSP), arising when the spatial supports of different data sources do not coincide, is faced in a non-Gaussian setting; in fact, hourly rainfall in Emilia-Romagna region, in Italy, is characterized by abundance of zero values and right-skeweness of the distribution of positive amounts. Rain gauge direct measurements on sparsely distributed locations and hourly cumulated radar grids are provided by the ARPA-SIMC Emilia-Romagna. We propose a three-stage Bayesian hierarchical model for radar calibration, exploiting rain gauges as reference measure. Rain probability and amounts are modeled via linear relationships with radar in the log scale; spatial correlated Gaussian effects capture the residual information. We employ a probit link for rainfall probability and Gamma distribution for rainfall positive amounts; the two steps are joined via a two-part semicontinuous model. Three model specifications differently addressing COSP are presented; in particular, a stochastic weighting of all radar pixels, driven by a latent Gaussian process defined on the grid, is employed. Estimation is performed via MCMC procedures implemented in C, linked to R software. Communication and evaluation of probabilistic, point and interval predictions is investigated. A non-randomized PIT histogram is proposed for correctly assessing calibration and coverage of two-part semicontinuous models. Predictions obtained with the different model specifications are evaluated via graphical tools (Reliability Plot, Sharpness Histogram, PIT Histogram, Brier Score Plot and Quantile Decomposition Plot), proper scoring rules (Brier Score, Continuous Rank Probability Score) and consistent scoring functions (Root Mean Square Error and Mean Absolute Error addressing the predictive mean and median, respectively). Calibration is reached and the inclusion of neighbouring information slightly improves predictions. All specifications outperform a benchmark model with incorrelated effects, confirming the relevance of spatial correlation for modeling rainfall probability and accumulation.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

When making predictions with complex simulators it can be important to quantify the various sources of uncertainty. Errors in the structural specification of the simulator, for example due to missing processes or incorrect mathematical specification, can be a major source of uncertainty, but are often ignored. We introduce a methodology for inferring the discrepancy between the simulator and the system in discrete-time dynamical simulators. We assume a structural form for the discrepancy function, and show how to infer the maximum-likelihood parameter estimates using a particle filter embedded within a Monte Carlo expectation maximization (MCEM) algorithm. We illustrate the method on a conceptual rainfall-runoff simulator (logSPM) used to model the Abercrombie catchment in Australia. We assess the simulator and discrepancy model on the basis of their predictive performance using proper scoring rules. This article has supplementary material online. © 2011 International Biometric Society.