32 results for probability distribution
Abstract:
This paper proposes and demonstrates an approach, Skilloscopy, to the assessment of decision makers. In an increasingly sophisticated, connected and information-rich world, decision making is becoming both more important and more difficult. At the same time, modelling decision-making on computers is becoming more feasible and of interest, partly because the information-input to those decisions is increasingly on record. The aims of Skilloscopy are to rate and rank decision makers in a domain relative to each other: the aims do not include an analysis of why a decision is wrong or suboptimal, nor the modelling of the underlying cognitive process of making the decisions. In the proposed method a decision-maker is characterised by a probability distribution of their competence in choosing among quantifiable alternatives. This probability distribution is derived by classic Bayesian inference from a combination of prior belief and the evidence of the decisions. Thus, decision-makers’ skills may be better compared, rated and ranked. The proposed method is applied and evaluated in the game domain of Chess. A large set of games by players across a broad range of the World Chess Federation (FIDE) Elo ratings has been used to infer the distribution of players’ rating directly from the moves they play rather than from game outcomes. Demonstration applications address questions frequently asked by the Chess community regarding the stability of the Elo rating scale, the comparison of players of different eras and/or leagues, and controversial incidents possibly involving fraud. The method of Skilloscopy may be applied in any decision domain where the value of the decision-options can be quantified.
Abstract:
The evidence provided by modelled assessments of future climate impact on flooding is fundamental to water resources and flood risk decision making. Impact models usually rely on climate projections from global and regional climate models (GCM/RCMs). However, challenges in representing precipitation events at catchment-scale resolution mean that decisions must be made on how to appropriately pre-process the meteorological variables from GCM/RCMs. Here the impacts on projected high flows of differing ensemble approaches and application of Model Output Statistics to RCM precipitation are evaluated while assessing climate change impact on flood hazard in the Upper Severn catchment in the UK. Various ensemble projections are used together with the HBV hydrological model with direct forcing and also compared to a response surface technique. We consider an ensemble of single-model RCM projections from the current UK Climate Projections (UKCP09); multi-model ensemble RCM projections from the European Union's FP6 ‘ENSEMBLES’ project; and a joint probability distribution of precipitation and temperature from a GCM-based perturbed physics ensemble. The ensemble distribution of results shows that flood hazard in the Upper Severn is likely to increase compared to present conditions, but the study highlights the differences between the results from different ensemble methods and the strong assumptions made in using Model Output Statistics to produce the estimates of future river discharge. The results underline the challenges in using the current generation of RCMs for local climate impact studies on flooding. Copyright © 2012 Royal Meteorological Society
Abstract:
A method is presented to calculate economic optimum fungicide doses accounting for the risk-aversion of growers responding to variability in disease severity between crops. Simple dose-response and disease-yield loss functions are used to estimate net disease-related costs (fungicide cost, plus disease-induced yield loss) as a function of dose and untreated severity. With fairly general assumptions about the shapes of the probability distribution of disease severity and the other functions involved, we show that a choice of fungicide dose which minimises net costs on average across seasons results in occasional large net costs caused by inadequate control in high disease seasons. This may be unacceptable to a grower with limited capital. A risk-averse grower can choose to reduce the size and frequency of such losses by applying a higher dose as insurance. For example, a grower may decide to accept ‘high loss’ years one year in ten or one year in twenty (i.e. specifying a proportion of years in which disease severity and net costs will be above a specified level). Our analysis shows that taking into account disease severity variation and risk-aversion will usually increase the dose applied by an economically rational grower. The analysis is illustrated with data on septoria tritici leaf blotch of wheat caused by Mycosphaerella graminicola. Observations from untreated field plots at sites across England over three years were used to estimate the probability distribution of disease severities at mid-grain filling. In the absence of a fully reliable disease forecasting scheme, reducing the frequency of ‘high loss’ years requires substantially higher doses to be applied to all crops. Disease resistant cultivars reduce both the optimal dose at all levels of risk and the disease-related costs at all doses.
Abstract:
Airborne high resolution in situ measurements of a large set of trace gases including ozone (O3) and total water (H2O) in the upper troposphere and the lowermost stratosphere (UT/LMS) have been performed above Europe within the SPURT project. SPURT provides an extensive data coverage of the UT/LMS in each season within the time period between November 2001 and July 2003. In the LMS a distinct spring maximum and autumn minimum is observed in O3, whereas its annual cycle in the UT is shifted by 2–3 months later towards the end of the year. The more variable H2O measurements reveal a maximum during summer and a minimum during autumn/winter with no phase shift between the two atmospheric compartments. For a comprehensive insight into trace gas composition and variability in the UT/LMS several statistical methods are applied using chemical, thermal and dynamical vertical coordinates. In particular, 2-dimensional probability distribution functions serve as a tool to transform localised aircraft data to a more comprehensive view of the probed atmospheric region. It appears that both trace gases, O3 and H2O, reveal the most compact arrangement and are best correlated in the view of potential vorticity (PV) and distance to the local tropopause, indicating an advanced mixing state on these surfaces. Thus, strong gradients of PV seem to act as a transport barrier both in the vertical and the horizontal direction. The alignment of trace gas isopleths reflects the existence of a year-round extra-tropical tropopause transition layer. The SPURT measurements reveal that this layer is mainly affected by stratospheric air during winter/spring and by tropospheric air during autumn/summer. Normalised mixing entropy values for O3 and H2O in the LMS appear to be maximal during spring and summer, respectively, indicating highest variability of these trace gases during the respective seasons.
Abstract:
Particle filters are fully non-linear data assimilation techniques that aim to represent the probability distribution of the model state given the observations (the posterior) by a number of particles. In high-dimensional geophysical applications the number of particles required by the sequential importance resampling (SIR) particle filter in order to capture the high probability region of the posterior, is too large to make them usable. However particle filters can be formulated using proposal densities, which gives greater freedom in how particles are sampled and allows for a much smaller number of particles. Here a particle filter is presented which uses the proposal density to ensure that all particles end up in the high probability region of the posterior probability density function. This gives rise to the possibility of non-linear data assimilation in large dimensional systems. The particle filter formulation is compared to the optimal proposal density particle filter and the implicit particle filter, both of which also utilise a proposal density. We show that when observations are available every time step, both schemes will be degenerate when the number of independent observations is large, unlike the new scheme. The sensitivity of the new scheme to its parameter values is explored theoretically and demonstrated using the Lorenz (1963) model.
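The sequential importance resampling (SIR) filter mentioned in this abstract can be sketched in a few lines. This is only a minimal illustration of the standard SIR step, not the proposal-density scheme the paper introduces. It assumes a scalar state that is observed directly with Gaussian error; the names `sir_step`, `propagate` and `obs_std` are hypothetical.

```python
import math
import random


def sir_step(particles, observation, propagate, obs_std, rng):
    """One forecast/weight/resample cycle of a basic SIR particle filter.

    Assumptions (not from the paper): scalar state, direct observation of
    the state, Gaussian observation error with standard deviation obs_std.
    """
    # 1. Forecast: move every particle forward with the (stochastic) model.
    forecast = [propagate(x, rng) for x in particles]

    # 2. Weight: Gaussian likelihood of the observation given each particle.
    #    Work with log-weights and subtract the maximum for numerical safety.
    log_w = [-0.5 * ((observation - x) / obs_std) ** 2 for x in forecast]
    max_log_w = max(log_w)
    weights = [math.exp(lw - max_log_w) for lw in log_w]
    total = sum(weights)
    weights = [w / total for w in weights]

    # 3. Diagnose degeneracy: the effective sample size collapses towards 1
    #    when a few particles carry almost all of the weight.
    ess = 1.0 / sum(w * w for w in weights)

    # 4. Resample with replacement in proportion to the weights.
    resampled = rng.choices(forecast, weights=weights, k=len(forecast))
    return resampled, ess


# Usage with a hypothetical random-walk model.
rng = random.Random(0)
prior = [rng.gauss(0.0, 1.0) for _ in range(200)]
random_walk = lambda x, r: x + r.gauss(0.0, 0.1)
posterior, ess = sir_step(prior, 1.0, random_walk, 0.5, rng)
```

The effective sample size returned here is one common way to see the degeneracy the abstract describes: as the number of independent observations grows, it tends to fall towards one unless the number of particles is very large.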
Abstract:
In nature, living creatures are affected by several stimuli simultaneously. The response of living creatures to stimuli is called taxis. In order to reveal the principles of taxis behavior in response to complex stimuli, we simultaneously applied photostimulation and electric stimulation perpendicularly to a Volvox algae solution. The probability distribution of the swimming direction showed that a large population of swimming cells moved in a direction that was the result of the composition of phototaxis and electrotaxis. More surprisingly, we uncovered the coupling of signs of taxis, i.e., coupling of phototaxis and electrotaxis induced positive electrotaxis, which did not emerge in the single stimulation experiments. We qualitatively explained the coupling of taxis based on the polarization of the swimming cells induced by the simultaneous photo- and electric stimulation.
Abstract:
We apply a new parameterisation of the Greenland ice sheet (GrIS) feedback between surface mass balance (SMB: the sum of surface accumulation and surface ablation) and surface elevation in the MAR regional climate model (Edwards et al., 2014) to projections of future climate change using five ice sheet models (ISMs). The MAR (Modèle Atmosphérique Régional: Fettweis, 2007) climate projections are for 2000–2199, forced by the ECHAM5 and HadCM3 global climate models (GCMs) under the SRES A1B emissions scenario. The additional sea level contribution due to the SMB–elevation feedback averaged over five ISM projections for ECHAM5 and three for HadCM3 is 4.3% (best estimate; 95% credibility interval 1.8–6.9%) at 2100, and 9.6% (best estimate; 95% credibility interval 3.6–16.0%) at 2200. In all results the elevation feedback is significantly positive, amplifying the GrIS sea level contribution relative to the MAR projections in which the ice sheet topography is fixed: the lower bounds of our 95% credibility intervals (CIs) for sea level contributions are larger than the “no feedback” case for all ISMs and GCMs. Our method is novel in sea level projections because we propagate three types of modelling uncertainty – GCM and ISM structural uncertainties, and elevation feedback parameterisation uncertainty – along the causal chain, from SRES scenario to sea level, within a coherent experimental design and statistical framework. The relative contributions to uncertainty depend on the timescale of interest. At 2100, the GCM uncertainty is largest, but by 2200 both the ISM and parameterisation uncertainties are larger. We also perform a perturbed parameter ensemble with one ISM to estimate the shape of the projected sea level probability distribution; our results indicate that the probability density is slightly skewed towards higher sea level contributions.
Abstract:
Advanced forecasting of space weather requires simulation of the whole Sun-to-Earth system, which necessitates driving magnetospheric models with the outputs from solar wind models. This presents a fundamental difficulty, as the magnetosphere is sensitive to both large-scale solar wind structures, which can be captured by solar wind models, and small-scale solar wind “noise,” which is far below typical solar wind model resolution and results primarily from stochastic processes. Following similar approaches in terrestrial climate modeling, we propose statistical “downscaling” of solar wind model results prior to their use as input to a magnetospheric model. As magnetospheric response can be highly nonlinear, this is preferable to downscaling the results of magnetospheric modeling. To demonstrate the benefit of this approach, we first approximate solar wind model output by smoothing solar wind observations with an 8 h filter, then add small-scale structure back in through the addition of random noise with the observed spectral characteristics. Here we use a very simple parameterization of noise based upon the observed probability distribution functions of solar wind parameters, but more sophisticated methods will be developed in the future. An ensemble of results from the simple downscaling scheme is tested using a model-independent method and shown to add value to the magnetospheric forecast, both improving the best estimate and quantifying the uncertainty. We suggest a number of features desirable in an operational solar wind downscaling scheme.
Abstract:
Large waves pose risks to ships, offshore structures, coastal infrastructure and ecosystems. This paper analyses 10 years of in-situ measurements of significant wave height (Hs) and maximum wave height (Hmax) from the ocean weather ship Polarfront in the Norwegian Sea. During the period 2000 to 2009, surface elevation was recorded every 0.59 s during sampling periods of 30 min. The Hmax observations scale linearly with Hs on average. A widely-used empirical Weibull distribution is found to estimate average values of Hmax/Hs and Hmax better than a Rayleigh distribution, but tends to underestimate both for all but the smallest waves. In this paper we propose a modified Rayleigh distribution which compensates for the heterogeneity of the observed dataset: the distribution is fitted to the whole dataset and improves the estimate of the largest waves. Over the 10-year period, the Weibull distribution approximates the observed Hs and Hmax well, and an exponential function can be used to predict the probability distribution function of the ratio Hmax/Hs. However, the Weibull distribution tends to underestimate the occurrence of extremely large values of Hs and Hmax. The persistence of Hs and Hmax in winter is also examined. Wave fields with Hs>12 m and Hmax>16 m do not last longer than 3 h. Low-to-moderate wave heights that persist for more than 12 h dominate the relationship of the wave field with the winter NAO index over 2000–2009. In contrast, the inter-annual variability of wave fields with Hs>5.5 m or Hmax>8.5 m and wave fields persisting over ~2.5 days is not associated with the winter NAO index.
Abstract:
Wind generation's contribution to supporting peak electricity demand is one of the key questions in wind integration studies. Unlike conventional units, the available outputs of different wind farms cannot be approximated as being statistically independent, and hence near-zero wind output is possible across an entire power system. This paper will review the risk model structures currently used to assess wind's capacity value, along with discussion of the resulting data requirements. A central theme is the benefits from performing statistical estimation of the joint distribution for demand and available wind capacity, focusing attention on uncertainties due to limited histories of wind and demand data; examination of Great Britain data from the last 25 years shows that the data requirements are greater than generally thought. A discussion is therefore presented of how analysis of the types of weather system which have historically driven extreme electricity demands can help to deliver robust insights into wind's contribution to supporting demand, even in the face of such data limitations. The role of the form of the probability distribution for available conventional capacity in driving wind capacity credit results is also discussed.
Abstract:
This paper describes a methodology for providing multiprobability predictions for proteomic mass spectrometry data. The methodology is based on a newly developed machine learning framework called Venn machines, which allows a valid probability interval to be output. The methodology is designed for mass spectrometry data. For demonstrative purposes, we applied this methodology to MALDI-TOF data sets in order to predict the diagnosis of heart disease and early diagnoses of ovarian cancer and breast cancer. The experiments showed that the probability intervals are narrow, that is, the output of the multiprobability predictor is similar to a single probability distribution. In addition, the probability intervals produced for heart disease and ovarian cancer data were more accurate than the output of the corresponding probability predictor. When Venn machines were forced to make point predictions, the accuracy of such predictions was, for most of the data, better than the accuracy of the underlying algorithm that outputs a single probability distribution of a label. Application of this methodology to MALDI-TOF data sets empirically demonstrates its validity. The accuracy of the proposed method on ovarian cancer data rises from 66.7% 11 months in advance of the moment of diagnosis to up to 90.2% at the moment of diagnosis. The same approach has been applied to heart disease data without time dependency, although the achieved accuracy was not as high (up to 69.9%). The methodology allowed us to confirm mass spectrometry peaks previously identified as carrying statistically significant information for discrimination between controls and cases.
Abstract:
The co-polar correlation coefficient (ρhv) has many applications, including hydrometeor classification, ground clutter and melting layer identification, interpretation of ice microphysics and the retrieval of rain drop size distributions (DSDs). However, we currently lack the quantitative error estimates that are necessary if these applications are to be fully exploited. Previous error estimates of ρhv rely on knowledge of the unknown "true" ρhv and implicitly assume a Gaussian probability distribution function of ρhv samples. We show that frequency distributions of ρhv estimates are in fact highly negatively skewed. A new variable: L = -log10(1 - ρhv) is defined, which does have Gaussian error statistics, and a standard deviation depending only on the number of independent radar pulses. This is verified using observations of spherical drizzle drops, allowing, for the first time, the construction of rigorous confidence intervals in estimates of ρhv. In addition, we demonstrate how the imperfect co-location of the horizontal and vertical polarisation sample volumes may be accounted for. The possibility of using L to estimate the dispersion parameter (µ) in the gamma drop size distribution is investigated. We find that including drop oscillations is essential for this application, otherwise there could be biases in retrieved µ of up to ~8. Preliminary results in rainfall are presented. In a convective rain case study, our estimates show µ to be substantially larger than 0 (an exponential DSD). In this particular rain event, rain rate would be overestimated by up to 50% if a simple exponential DSD is assumed.
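The transformation L = -log10(1 - ρhv) defined in this abstract can be illustrated with a short sketch of how a confidence interval built in L space maps back to ρhv. The abstract states that the standard deviation of L depends only on the number of independent radar pulses but does not give the formula, so `sigma_L` is taken here as a user-supplied input. The function names and the normal quantile `z = 1.96` (a two-sided 95% interval) are assumptions made for illustration.

```python
import math


def rho_to_L(rho_hv):
    """Transform a correlation coefficient to L = -log10(1 - rho_hv)."""
    if not 0.0 <= rho_hv < 1.0:
        raise ValueError("rho_hv must lie in [0, 1)")
    return -math.log10(1.0 - rho_hv)


def L_to_rho(L):
    """Invert the transformation: rho_hv = 1 - 10**(-L)."""
    return 1.0 - 10.0 ** (-L)


def rho_confidence_interval(rho_hv, sigma_L, z=1.96):
    """Interval for rho_hv from a symmetric Gaussian interval in L.

    sigma_L is an input here; the abstract says it depends only on the
    number of independent pulses, but the relation is not reproduced.
    """
    L = rho_to_L(rho_hv)
    return L_to_rho(L - z * sigma_L), L_to_rho(L + z * sigma_L)


# Usage with a hypothetical sigma_L of 0.1.
lower, upper = rho_confidence_interval(0.99, 0.1)
```

Because the interval is symmetric in L, it is asymmetric in ρhv, with the longer tail below the estimate, which is consistent with the negative skew of ρhv estimates that the abstract reports.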
Abstract:
Previous field work on the spatial distribution of CO2 levels in classrooms has provided some evidence of variations in CO2 concentration within a classroom space. Significant fluctuations in CO2 concentration were found at different sampling points depending on the ventilation strategies and environmental conditions prevailing in individual classrooms. However, how these variations are affected by the emitting sources and the room air movement remains unknown. Hence, it was concluded that detailed investigation of the CO2 distribution needs to be performed on a smaller scale. As a result, it was decided to use an environmental chamber with various methods and rates of ventilation, for the same internal temperature and heat loads, to study the effect of ventilation strategy and air movement on the distribution of CO2 concentration in a room. The role of human exhalation and its interaction with the plume induced by the body's convective flow and room air movement due to different ventilation strategies were studied in a chamber at the University of Reading. These phenomena are considered to be important in understanding and predicting the flow patterns in a space and how these impact on the distribution of contaminants. This paper attempts to study the CO2 dispersion and distribution at the exhalation zone of two people sitting in a chamber as well as throughout the occupied zone of the chamber. The horizontal and vertical distributions of CO2 were sampled at locations where the probability of CO2 variation was considered high. Although the room size, source location, ventilation rate and location of air supply and extract devices can all influence the CO2 distribution, this article gives general guidelines on the optimum positioning of a CO2 sensor in a room.
Abstract:
This paper explores a new technique to calculate and plot the distribution of instantaneous transmit envelope power of OFDMA and SC-FDMA signals from the equation of the Probability Density Function (PDF), solved numerically. The Complementary Cumulative Distribution Function (CCDF) of the Instantaneous Power to Average Power Ratio (IPAPR) is computed from the structure of the transmit system matrix. This helps to build an intuitive understanding of the distribution of output signal power when the structure of the transmit system matrix and the constellation used are known. The distribution obtained for the OFDMA signal matches a complex normal distribution. The results indicate why the CCDF of IPAPR for SC-FDMA is better than that for OFDMA for a given constellation. Finally, with this method it is shown again that a cyclic-prefixed DS-CDMA system is one case with optimum IPAPR. The insight that this technique provides may be useful in designing area-optimised digital and power-efficient analogue modules.
Abstract:
The continuous ranked probability score (CRPS) is a frequently used scoring rule. In contrast with many other scoring rules, the CRPS evaluates cumulative distribution functions. An ensemble of forecasts can easily be converted into a piecewise constant cumulative distribution function with steps at the ensemble members. This renders the CRPS a convenient scoring rule for the evaluation of ‘raw’ ensembles, obviating the need for sophisticated ensemble model output statistics or dressing methods prior to evaluation. In this article, a relation between the CRPS score and the quantile score is established. The evaluation of ‘raw’ ensembles using the CRPS is discussed in this light. It is shown that latent in this evaluation is an interpretation of the ensemble as quantiles but with non-uniform levels. This needs to be taken into account if the ensemble is evaluated further, for example with rank histograms.
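As an illustration of scoring a 'raw' ensemble, the CRPS of the piecewise constant cumulative distribution function with a step of 1/n at each of the n ensemble members can be computed directly from the members. The sketch below uses the standard identity CRPS = mean|x_i - y| - (1/2) mean|x_i - x_j| for this step-function CDF; the function name `ensemble_crps` is hypothetical, and the relation to the quantile score established in the article is not reproduced here.

```python
def ensemble_crps(members, observation):
    """CRPS of the step-function CDF that puts weight 1/n on each member."""
    n = len(members)
    if n == 0:
        raise ValueError("need at least one ensemble member")

    # Mean absolute difference between the members and the observation.
    term_obs = sum(abs(x - observation) for x in members) / n

    # Mean absolute difference over all ordered pairs of members,
    # including i == j (which contributes zero).
    term_spread = sum(abs(a - b) for a in members for b in members) / (n * n)

    return term_obs - 0.5 * term_spread


# Usage: a two-member ensemble verified against an observation of 1.0.
score = ensemble_crps([0.0, 2.0], 1.0)
```

For a single-member ensemble the score reduces to the absolute error. The double loop costs O(n²), which is fine for typical ensemble sizes; sorting the members first allows an O(n log n) variant.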