973 results for Gaussian Probability Distribution
Abstract:
This paper proposes and demonstrates an approach, Skilloscopy, to the assessment of decision makers. In an increasingly sophisticated, connected and information-rich world, decision making is becoming both more important and more difficult. At the same time, modelling decision-making on computers is becoming more feasible and of interest, partly because the information-input to those decisions is increasingly on record. The aims of Skilloscopy are to rate and rank decision makers in a domain relative to each other: the aims do not include an analysis of why a decision is wrong or suboptimal, nor the modelling of the underlying cognitive process of making the decisions. In the proposed method a decision-maker is characterised by a probability distribution of their competence in choosing among quantifiable alternatives. This probability distribution is derived by classic Bayesian inference from a combination of prior belief and the evidence of the decisions. Thus, decision-makers’ skills may be better compared, rated and ranked. The proposed method is applied and evaluated in the game domain of Chess. A large set of games by players across a broad range of the World Chess Federation (FIDE) Elo ratings has been used to infer the distribution of players’ rating directly from the moves they play rather than from game outcomes. Demonstration applications address questions frequently asked by the Chess community regarding the stability of the Elo rating scale, the comparison of players of different eras and/or leagues, and controversial incidents possibly involving fraud. The method of Skilloscopy may be applied in any decision domain where the value of the decision-options can be quantified.
Abstract:
The evidence provided by modelled assessments of future climate impact on flooding is fundamental to water resources and flood risk decision making. Impact models usually rely on climate projections from global and regional climate models (GCM/RCMs). However, challenges in representing precipitation events at catchment-scale resolution mean that decisions must be made on how to appropriately pre-process the meteorological variables from GCM/RCMs. Here the impacts on projected high flows of differing ensemble approaches and application of Model Output Statistics to RCM precipitation are evaluated while assessing climate change impact on flood hazard in the Upper Severn catchment in the UK. Various ensemble projections are used together with the HBV hydrological model with direct forcing and also compared to a response surface technique. We consider an ensemble of single-model RCM projections from the current UK Climate Projections (UKCP09); multi-model ensemble RCM projections from the European Union's FP6 ‘ENSEMBLES’ project; and a joint probability distribution of precipitation and temperature from a GCM-based perturbed physics ensemble. The ensemble distribution of results shows that flood hazard in the Upper Severn is likely to increase compared to present conditions, but the study highlights the differences between the results from different ensemble methods and the strong assumptions made in using Model Output Statistics to produce the estimates of future river discharge. The results underline the challenges in using the current generation of RCMs for local climate impact studies on flooding. Copyright © 2012 Royal Meteorological Society
Abstract:
A method is presented to calculate economic optimum fungicide doses accounting for the risk-aversion of growers responding to variability in disease severity between crops. Simple dose-response and disease-yield loss functions are used to estimate net disease-related costs (fungicide cost, plus disease-induced yield loss) as a function of dose and untreated severity. With fairly general assumptions about the shapes of the probability distribution of disease severity and the other functions involved, we show that a choice of fungicide dose which minimises net costs on average across seasons results in occasional large net costs caused by inadequate control in high disease seasons. This may be unacceptable to a grower with limited capital. A risk-averse grower can choose to reduce the size and frequency of such losses by applying a higher dose as insurance. For example, a grower may decide to accept ‘high loss’ years one year in ten or one year in twenty (i.e. specifying a proportion of years in which disease severity and net costs will be above a specified level). Our analysis shows that taking into account disease severity variation and risk-aversion will usually increase the dose applied by an economically rational grower. The analysis is illustrated with data on septoria tritici leaf blotch of wheat caused by Mycosphaerella graminicola. Observations from untreated field plots at sites across England over three years were used to estimate the probability distribution of disease severities at mid-grain filling. In the absence of a fully reliable disease forecasting scheme, reducing the frequency of ‘high loss’ years requires substantially higher doses to be applied to all crops. Disease resistant cultivars reduce both the optimal dose at all levels of risk and the disease-related costs at all doses.
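The contrast between a dose that minimises net cost on average and a higher "insurance" dose can be sketched numerically. The snippet below is a minimal illustration and not the authors' model: the exponential dose-response form, the linear yield-loss term, the parameter values (`dose_price`, `loss_per_severity`, `efficacy`) and the ten untreated severities are all hypothetical, and "risk-averse" is taken here to mean optimising for the severity exceeded in roughly one year in ten.

```python
import math

# Hypothetical untreated severities (%), one per crop-season; not data from the paper.
untreated = [2, 3, 4, 5, 6, 8, 10, 15, 25, 40]

dose_price = 20.0         # assumed cost per unit dose
loss_per_severity = 10.0  # assumed yield-loss cost per % severity remaining
efficacy = 1.5            # assumed rate of the exponential dose-response curve

def net_cost(dose, severity):
    """Fungicide cost plus disease-induced yield loss for one crop."""
    remaining = severity * math.exp(-efficacy * dose)
    return dose_price * dose + loss_per_severity * remaining

def optimal_dose(severity):
    """Dose minimising net_cost for a given untreated severity (closed form)."""
    ratio = loss_per_severity * efficacy * severity / dose_price
    if ratio <= 1.0:
        return 0.0  # treatment never pays for itself at this severity
    return math.log(ratio) / efficacy

# Net cost is linear in severity, so the mean-cost optimum depends only on the mean.
mean_severity = sum(untreated) / len(untreated)
dose_mean = optimal_dose(mean_severity)

# Risk-averse choice: optimise for the severity exceeded in about 1 year in 10.
rank = math.ceil(0.9 * len(untreated))  # nearest-rank 90th percentile
high_severity = sorted(untreated)[rank - 1]
dose_risk_averse = optimal_dose(high_severity)

worst = max(untreated)
print("doses:", dose_mean, dose_risk_averse)
print("cost in worst season:", net_cost(dose_mean, worst), net_cost(dose_risk_averse, worst))
```

Under these assumed forms the risk-averse dose is higher than the mean-cost dose and lowers the cost in the worst season, at the price of a higher cost on average; this mirrors the qualitative conclusion of the abstract, not its numbers.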
Abstract:
Airborne high resolution in situ measurements of a large set of trace gases including ozone (O3) and total water (H2O) in the upper troposphere and the lowermost stratosphere (UT/LMS) have been performed above Europe within the SPURT project. SPURT provides extensive data coverage of the UT/LMS in each season within the time period between November 2001 and July 2003. In the LMS a distinct spring maximum and autumn minimum are observed in O3, whereas its annual cycle in the UT is shifted 2–3 months later, towards the end of the year. The more variable H2O measurements reveal a maximum during summer and a minimum during autumn/winter with no phase shift between the two atmospheric compartments. For a comprehensive insight into trace gas composition and variability in the UT/LMS several statistical methods are applied using chemical, thermal and dynamical vertical coordinates. In particular, 2-dimensional probability distribution functions serve as a tool to transform localised aircraft data to a more comprehensive view of the probed atmospheric region. It appears that both trace gases, O3 and H2O, reveal the most compact arrangement and are best correlated in the view of potential vorticity (PV) and distance to the local tropopause, indicating an advanced mixing state on these surfaces. Thus, strong gradients of PV seem to act as a transport barrier both in the vertical and the horizontal direction. The alignment of trace gas isopleths reflects the existence of a year-round extra-tropical tropopause transition layer. The SPURT measurements reveal that this layer is mainly affected by stratospheric air during winter/spring and by tropospheric air during autumn/summer. Normalised mixing entropy values for O3 and H2O in the LMS appear to be maximal during spring and summer, respectively, indicating highest variability of these trace gases during the respective seasons.
Abstract:
Particle filters are fully non-linear data assimilation techniques that aim to represent the probability distribution of the model state given the observations (the posterior) by a number of particles. In high-dimensional geophysical applications the number of particles required by the sequential importance resampling (SIR) particle filter in order to capture the high probability region of the posterior is too large to make them usable. However, particle filters can be formulated using proposal densities, which gives greater freedom in how particles are sampled and allows for a much smaller number of particles. Here a particle filter is presented which uses the proposal density to ensure that all particles end up in the high probability region of the posterior probability density function. This gives rise to the possibility of non-linear data assimilation in large dimensional systems. The particle filter formulation is compared to the optimal proposal density particle filter and the implicit particle filter, both of which also utilise a proposal density. We show that when observations are available every time step, both schemes will be degenerate when the number of independent observations is large, unlike the new scheme. The sensitivity of the new scheme to its parameter values is explored theoretically and demonstrated using the Lorenz (1963) model.
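For orientation, the baseline SIR (bootstrap) particle filter that the abstract contrasts against can be sketched in a few lines. This is a minimal illustration of the standard propagate, weight and resample cycle, not the proposal-density scheme introduced in the paper. The scalar model `x -> 0.9 * x + noise`, the Gaussian observation error, and the values of `model_std` and `obs_std` are assumptions chosen only to keep the example small.

```python
import math
import random

def sir_step(particles, observation, rng, model_std=0.5, obs_std=1.0):
    """One SIR cycle: propagate, weight by the likelihood, resample."""
    # 1. Propagate every particle with the (assumed) stochastic toy model.
    forecast = [0.9 * x + rng.gauss(0.0, model_std) for x in particles]
    # 2. Importance weights from a Gaussian likelihood p(y | x), in log space.
    log_w = [-0.5 * ((observation - x) / obs_std) ** 2 for x in forecast]
    max_lw = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - max_lw) for lw in log_w]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Effective sample size before resampling; small values signal degeneracy.
    ess = 1.0 / sum(w * w for w in weights)
    # 3. Multinomial resampling with replacement, proportional to the weights.
    resampled = rng.choices(forecast, weights=weights, k=len(forecast))
    return resampled, ess

rng = random.Random(0)
particles = [rng.gauss(0.0, 2.0) for _ in range(500)]
truth = 1.0
for _ in range(20):
    truth = 0.9 * truth + rng.gauss(0.0, 0.5)   # hidden state
    y = truth + rng.gauss(0.0, 1.0)             # noisy observation
    particles, ess = sir_step(particles, y, rng)

posterior_mean = sum(particles) / len(particles)
print("posterior mean:", posterior_mean, "last effective sample size:", ess)
```

In one dimension the effective sample size stays comfortably large. The degeneracy described in the abstract appears when the number of independent observations grows, because the product of many likelihood terms concentrates almost all weight on a single particle.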
Abstract:
In nature, living creatures are affected by several stimuli simultaneously. The response of living creatures to stimuli is called taxis. In order to reveal the principles of taxis behavior in response to complex stimuli, we simultaneously applied photostimulation and electric stimulation perpendicularly to a Volvox algae solution. The probability distribution of the swimming direction showed that a large population of swimming cells moved in a direction that was the result of the composition of phototaxis and electrotaxis. More surprisingly, we uncovered the coupling of signs of taxis, i.e., coupling of phototaxis and electrotaxis induced positive electrotaxis, which did not emerge in the single stimulation experiments. We qualitatively explained the coupling of taxis based on the polarization of the swimming cells induced by the simultaneous photo- and electric stimulation.
Abstract:
We apply a new parameterisation of the Greenland ice sheet (GrIS) feedback between surface mass balance (SMB: the sum of surface accumulation and surface ablation) and surface elevation in the MAR regional climate model (Edwards et al., 2014) to projections of future climate change using five ice sheet models (ISMs). The MAR (Modèle Atmosphérique Régional: Fettweis, 2007) climate projections are for 2000–2199, forced by the ECHAM5 and HadCM3 global climate models (GCMs) under the SRES A1B emissions scenario. The additional sea level contribution due to the SMB–elevation feedback averaged over five ISM projections for ECHAM5 and three for HadCM3 is 4.3% (best estimate; 95% credibility interval 1.8–6.9%) at 2100, and 9.6% (best estimate; 95% credibility interval 3.6–16.0%) at 2200. In all results the elevation feedback is significantly positive, amplifying the GrIS sea level contribution relative to the MAR projections in which the ice sheet topography is fixed: the lower bounds of our 95% credibility intervals (CIs) for sea level contributions are larger than the “no feedback” case for all ISMs and GCMs. Our method is novel in sea level projections because we propagate three types of modelling uncertainty – GCM and ISM structural uncertainties, and elevation feedback parameterisation uncertainty – along the causal chain, from SRES scenario to sea level, within a coherent experimental design and statistical framework. The relative contributions to uncertainty depend on the timescale of interest. At 2100, the GCM uncertainty is largest, but by 2200 both the ISM and parameterisation uncertainties are larger. We also perform a perturbed parameter ensemble with one ISM to estimate the shape of the projected sea level probability distribution; our results indicate that the probability density is slightly skewed towards higher sea level contributions.
Abstract:
Advanced forecasting of space weather requires simulation of the whole Sun-to-Earth system, which necessitates driving magnetospheric models with the outputs from solar wind models. This presents a fundamental difficulty, as the magnetosphere is sensitive to both large-scale solar wind structures, which can be captured by solar wind models, and small-scale solar wind “noise,” which is far below typical solar wind model resolution and results primarily from stochastic processes. Following similar approaches in terrestrial climate modeling, we propose statistical “downscaling” of solar wind model results prior to their use as input to a magnetospheric model. As magnetospheric response can be highly nonlinear, this is preferable to downscaling the results of magnetospheric modeling. To demonstrate the benefit of this approach, we first approximate solar wind model output by smoothing solar wind observations with an 8 h filter, then add small-scale structure back in through the addition of random noise with the observed spectral characteristics. Here we use a very simple parameterization of noise based upon the observed probability distribution functions of solar wind parameters, but more sophisticated methods will be developed in the future. An ensemble of results from the simple downscaling scheme is tested using a model-independent method and shown to add value to the magnetospheric forecast, both improving the best estimate and quantifying the uncertainty. We suggest a number of features desirable in an operational solar wind downscaling scheme.
Abstract:
Large waves pose risks to ships, offshore structures, coastal infrastructure and ecosystems. This paper analyses 10 years of in-situ measurements of significant wave height (Hs) and maximum wave height (Hmax) from the ocean weather ship Polarfront in the Norwegian Sea. During the period 2000 to 2009, surface elevation was recorded every 0.59 s during sampling periods of 30 min. The Hmax observations scale linearly with Hs on average. A widely-used empirical Weibull distribution is found to estimate average values of Hmax/Hs and Hmax better than a Rayleigh distribution, but tends to underestimate both for all but the smallest waves. In this paper we propose a modified Rayleigh distribution which compensates for the heterogeneity of the observed dataset: the distribution is fitted to the whole dataset and improves the estimate of the largest waves. Over the 10-year period, the Weibull distribution approximates the observed Hs and Hmax well, and an exponential function can be used to predict the probability distribution function of the ratio Hmax/Hs. However, the Weibull distribution tends to underestimate the occurrence of extremely large values of Hs and Hmax. The persistence of Hs and Hmax in winter is also examined. Wave fields with Hs>12 m and Hmax>16 m do not last longer than 3 h. Low-to-moderate wave heights that persist for more than 12 h dominate the relationship of the wave field with the winter NAO index over 2000–2009. In contrast, the inter-annual variability of wave fields with Hs>5.5 m or Hmax>8.5 m and wave fields persisting over ~2.5 days is not associated with the winter NAO index.
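As background to the Rayleigh benchmark mentioned above, the textbook Rayleigh result for the ratio Hmax/Hs can be evaluated directly. The sketch below assumes the standard narrow-band form in which individual wave heights exceed `h` with probability `exp(-2 * (h / Hs)**2)`; it does not reproduce the modified Rayleigh or Weibull fits of the paper. The 10 s mean wave period used to count waves in a 30 min record is a hypothetical value.

```python
import math

def rayleigh_exceedance(h_over_hs):
    """P(H > h) for individual wave heights under the Rayleigh assumption."""
    return math.exp(-2.0 * h_over_hs ** 2)

def most_probable_hmax_ratio(n_waves):
    """Most probable Hmax/Hs among n_waves waves (large-n approximation)."""
    return math.sqrt(0.5 * math.log(n_waves))

record_seconds = 30 * 60   # sampling period stated in the abstract
mean_period = 10.0         # assumed mean wave period in seconds (hypothetical)
n_waves = round(record_seconds / mean_period)

ratio = most_probable_hmax_ratio(n_waves)
print("waves in record:", n_waves)
print("most probable Hmax/Hs:", ratio)
# By construction, a wave of this height is exceeded once in n_waves on average.
print("exceedance probability at that ratio:", rayleigh_exceedance(ratio))
```

Because the ratio grows only with the square root of the logarithm of the number of waves, it changes slowly with record length or wave period.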
Abstract:
Wind generation's contribution to supporting peak electricity demand is one of the key questions in wind integration studies. Unlike conventional units, the available outputs of different wind farms cannot be approximated as being statistically independent, and hence near-zero wind output is possible across an entire power system. This paper will review the risk model structures currently used to assess wind's capacity value, along with discussion of the resulting data requirements. A central theme is the benefits from performing statistical estimation of the joint distribution for demand and available wind capacity, focusing attention on uncertainties due to limited histories of wind and demand data; examination of Great Britain data from the last 25 years shows that the data requirements are greater than generally thought. A discussion is therefore presented of how analysis of the types of weather system which have historically driven extreme electricity demands can help to deliver robust insights into wind's contribution to supporting demand, even in the face of such data limitations. The role of the form of the probability distribution for available conventional capacity in driving wind capacity credit results is also discussed.
Abstract:
This paper describes the methodology of providing multiprobability predictions for proteomic mass spectrometry data. The methodology is based on a newly developed machine learning framework called Venn machines. It allows a valid probability interval to be output. The methodology is designed for mass spectrometry data. For demonstrative purposes, we applied this methodology to MALDI-TOF data sets in order to predict the diagnosis of heart disease and early diagnoses of ovarian cancer and breast cancer. The experiments showed that the probability intervals are narrow, that is, the output of the multiprobability predictor is similar to a single probability distribution. In addition, probability intervals produced for heart disease and ovarian cancer data were more accurate than the output of the corresponding probability predictor. When Venn machines were forced to make point predictions, the accuracy of such predictions was, for most data, better than the accuracy of the underlying algorithm that outputs a single probability distribution of a label. Application of this methodology to MALDI-TOF data sets empirically demonstrates its validity. The accuracy of the proposed method on ovarian cancer data rises from 66.7 % 11 months in advance of the moment of diagnosis to up to 90.2 % at the moment of diagnosis. The same approach has been applied to heart disease data without time dependency, although the achieved accuracy was not as high (up to 69.9 %). The methodology allowed us to confirm mass spectrometry peaks previously identified as carrying statistically significant information for discrimination between controls and cases.
Abstract:
Data from 58 strong-lensing events surveyed by the Sloan Lens ACS Survey are used to estimate the projected galaxy mass inside their Einstein radii by two independent methods: stellar dynamics and strong gravitational lensing. We perform a joint analysis of these two estimates inside models with up to three degrees of freedom with respect to the lens density profile, stellar velocity anisotropy, and line-of-sight (LOS) external convergence, which incorporates the effect of the large-scale structure on strong lensing. A Bayesian analysis is employed to estimate the model parameters, evaluate their significance, and compare models. We find that the data favor Jaffe's light profile over Hernquist's, but that any particular choice between these two does not change the qualitative conclusions with respect to the features of the system that we investigate. The density profile is compatible with an isothermal profile, being slightly steeper and having an uncertainty in the logarithmic slope of the order of 5% in models that take into account a prior ignorance on anisotropy and external convergence. We identify a considerable degeneracy between the density profile slope and the anisotropy parameter, which largely increases the uncertainties in the estimates of these parameters, but we find no evidence in favor of an anisotropic velocity distribution on average for the whole sample. An LOS external convergence following a prior probability distribution given by cosmology has a small effect on the estimation of the lens density profile, but can increase the dispersion of its value by nearly 40%.
Abstract:
In this paper, we present a study on a deterministic partially self-avoiding walk (tourist walk), which provides a novel method for texture feature extraction. The method is able to explore an image on all scales simultaneously. Experiments were conducted using different dynamics concerning the tourist walk. A new strategy, based on histograms, to extract information from its joint probability distribution is presented. The promising results are discussed and compared to the best-known methods for texture description reported in the literature. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
We consider bipartitions of one-dimensional extended systems whose probability distribution functions describe stationary states of stochastic models. We define estimators of the information shared between the two subsystems. If the correlation length is finite, the estimators stay finite for large system sizes. If the correlation length diverges, so do the estimators. The definition of the estimators is inspired by information theory. We look at several models and compare the behaviors of the estimators in the finite-size scaling limit. Analytical and numerical methods as well as Monte Carlo simulations are used. We show how the finite-size scaling functions change for various phase transitions, including the case where one has conformal invariance.
Abstract:
Canalizing genes possess such broad regulatory power, and their action sweeps across such a wide swath of processes, that the full set of affected genes is not highly correlated under normal conditions. When not active, the controlling gene will not be predictable to any significant degree by its subject genes, either alone or in groups, since their behavior will be highly varied relative to the inactive controlling gene. When the controlling gene is active, its behavior is not well predicted by any one of its targets, but can be very well predicted by groups of genes under its control. To investigate this question, we introduce in this paper the concept of intrinsically multivariate predictive (IMP) genes, and present a mathematical study of IMP in the context of binary genes with respect to the coefficient of determination (CoD), which measures the predictive power of a set of genes with respect to a target gene. A set of predictor genes is said to be IMP for a target gene if all properly contained subsets of the predictor set are bad predictors of the target but the full predictor set predicts the target with great accuracy. We show that logic of prediction, predictive power, covariance between predictors, and the entropy of the joint probability distribution of the predictors jointly affect the appearance of IMP genes. In particular, we show that high-predictive power, small covariance among predictors, a large entropy of the joint probability distribution of predictors, and certain logics, such as XOR in the 2-predictor case, are factors that favor the appearance of IMP. The IMP concept is applied to characterize the behavior of the gene DUSP1, which exhibits control over a central, process-integrating signaling pathway, thereby providing preliminary evidence that IMP can be used as a criterion for discovery of canalizing genes.
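The XOR case named in the abstract makes a compact worked example of the IMP idea. The sketch below assumes the usual error-based definition of the coefficient of determination, CoD = (e0 - e_opt) / e0, where e0 is the error of the best constant guess of the target and e_opt is the error of the best predictor built from the chosen genes. The function name `cod`, the dictionary layout of the joint distribution, and the choice to return NaN when e0 is zero are assumptions of this illustration, not details taken from the paper.

```python
from itertools import product

def cod(joint, predictor_idx):
    """CoD of a binary target given the predictors selected by predictor_idx.

    joint maps (x_tuple, y) -> probability for binary predictors x and target y.
    """
    # Error of the best constant guess for the target (no predictors used).
    p_y1 = sum(p for (_, y), p in joint.items() if y == 1)
    eps0 = min(p_y1, 1.0 - p_y1)
    if eps0 == 0.0:
        return float("nan")  # target is constant; undefined in this sketch
    # Error of the optimal predictor: per predictor pattern, guess the likelier y.
    mass = {}
    for (x, y), p in joint.items():
        key = tuple(x[i] for i in predictor_idx)
        mass.setdefault(key, [0.0, 0.0])[y] += p
    eps_opt = sum(min(m) for m in mass.values())
    return (eps0 - eps_opt) / eps0

# Two independent, unbiased binary predictors; the target is their XOR.
joint = {((x1, x2), x1 ^ x2): 0.25 for x1, x2 in product((0, 1), repeat=2)}

cod_x1 = cod(joint, (0,))      # first predictor alone
cod_x2 = cod(joint, (1,))      # second predictor alone
cod_both = cod(joint, (0, 1))  # both predictors together
print(cod_x1, cod_x2, cod_both)
```

Each predictor alone gives a CoD of 0 while the pair gives a CoD of 1, which is the signature the abstract describes: every proper subset predicts poorly, yet the full set predicts the target exactly.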