80 results for Probability Distribution Function
in CentAUR: Central Archive University of Reading - UK
Abstract:
Using the classical Parzen window estimate as the target function, kernel density estimation is formulated as a regression problem and the orthogonal forward regression technique is adopted to construct sparse kernel density estimates. The proposed algorithm incrementally minimises a leave-one-out test error score to select a sparse kernel model, and a local regularisation method is incorporated into the density construction process to further enforce sparsity. The kernel weights are finally updated using the multiplicative nonnegative quadratic programming algorithm, which has the ability to reduce the model size further. Apart from the kernel width, the proposed algorithm has no parameters that need tuning, and the user is not required to specify any additional criterion to terminate the density construction procedure. Two examples are used to demonstrate the ability of this regression-based approach to effectively construct a sparse kernel density estimate with accuracy comparable to that of the full-sample optimised Parzen window density estimate.
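As a rough illustration of the regression view (a minimal sketch under assumed Gaussian kernels, not the authors' orthogonal forward regression with leave-one-out scoring and local regularisation), the weight-update step can be written down directly: minimise an integrated-squared-error criterion with the Parzen estimate as target using the multiplicative nonnegative quadratic programming (MNQP) update, which tends to drive redundant kernel weights toward zero.

```python
import numpy as np

def gauss(x, c, h):
    """Value of a 1-D Gaussian kernel with width h, centred at c."""
    return np.exp(-0.5 * ((x - c) / h) ** 2) / (h * np.sqrt(2 * np.pi))

def sparse_kde_weights(x, h, n_iter=500):
    """Fit nonnegative, summing-to-unity kernel weights against an
    integrated-squared-error criterion with the Parzen estimate as
    target, using the MNQP update: minimise 0.5 w'Bw - c'w, w >= 0."""
    # B_ij = integral K_i(t) K_j(t) dt: Gaussian convolution identity.
    B = gauss(x[:, None], x[None, :], h * np.sqrt(2))
    # c_i = average response of kernel i over the data.
    c = gauss(x[:, None], x[None, :], h).mean(axis=1)
    w = np.full(len(x), 1.0 / len(x))
    for _ in range(n_iter):
        w *= c / (B @ w + 1e-12)   # MNQP multiplicative update
        w /= w.sum()               # summing-to-unity constraint
    return w

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 200)
w = sparse_kde_weights(x, h=0.4)
print("kernels with non-negligible weight:", np.sum(w > 1e-4), "of", len(x))
```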
Abstract:
A new sparse kernel probability density function (pdf) estimator based on a zero-norm constraint is constructed using the classical Parzen window (PW) estimate as the target function. The so-called zero-norm of the parameters is used in order to achieve enhanced model sparsity, and it is suggested to minimize an approximate function of the zero-norm. It is shown that, under certain conditions, the kernel weights of the proposed pdf estimator based on the zero-norm approximation can be updated using the multiplicative nonnegative quadratic programming algorithm. Numerical examples are employed to demonstrate the efficacy of the proposed approach.
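The abstract leaves the surrogate unspecified; one common smooth approximation from the sparse-modelling literature (an assumption here, not necessarily the paper's exact choice) replaces the zero-norm with a negative-exponential sum:

```latex
\|w\|_0 \;=\; \#\{\, j : w_j \neq 0 \,\}
\;\approx\; \sum_{j=1}^{N} \bigl(1 - e^{-\beta w_j}\bigr),
\qquad w_j \ge 0,\; \beta > 0,
```

which is differentiable, approaches the true zero-norm as β grows, and penalises small nonzero weights toward exactly zero; the nonnegativity constraint is then handled by the MNQP update mentioned in the abstract.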
Abstract:
We develop a new sparse kernel density estimator using a forward constrained regression framework, within which the nonnegative and summing-to-unity constraints of the mixing weights can easily be satisfied. Our main contribution is to derive a recursive algorithm to select significant kernels one at a time based on the minimum integrated square error (MISE) criterion for both the selection of kernels and the estimation of mixing weights. The proposed approach is simple to implement and the associated computational cost is very low. Specifically, the complexity of our algorithm is of order N, the number of training data points, which is much lower than the O(N²) complexity of the best existing sparse kernel density estimators. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with comparable accuracy to those of the classical Parzen window estimate and other existing sparse kernel density estimators.
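A greedy sketch of the one-at-a-time selection may help fix ideas (uniform mixing weights for simplicity, and full N x N matrices, so this sketch reproduces neither the paper's O(N) recursion nor its weight estimation): at each step the kernel that most reduces the integrated squared error against the data is added.

```python
import numpy as np

def gauss(x, c, h):
    return np.exp(-0.5 * ((x - c) / h) ** 2) / (h * np.sqrt(2 * np.pi))

def forward_select(x, h, max_kernels=8):
    """Greedily add the kernel (with uniform mixing weights) that most
    reduces the integrated squared error to the data."""
    B = gauss(x[:, None], x[None, :], h * np.sqrt(2))  # kernel cross-integrals
    c = gauss(x[:, None], x[None, :], h).mean(axis=1)  # data cross-term
    chosen, row_sum, bb, cc = [], np.zeros(len(x)), 0.0, 0.0
    for _ in range(max_kernels):
        k = len(chosen) + 1
        # ISE (up to a constant) if each candidate joined the chosen set.
        scores = (bb + 2 * row_sum + np.diag(B)) / (2 * k**2) - (cc + c) / k
        scores[chosen] = np.inf                        # no re-selection
        j = int(np.argmin(scores))
        bb += 2 * row_sum[j] + B[j, j]                 # update pair sum
        row_sum += B[:, j]
        cc += c[j]
        chosen.append(j)
    return chosen

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 300)
print("selected kernel centres:", np.sort(x[forward_select(x, h=0.5)]))
```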
Abstract:
A new sparse kernel density estimator is introduced. Our main contribution is to develop a recursive algorithm for the selection of significant kernels one at a time, using the minimum integrated square error (MISE) criterion for both kernel selection and the estimation of mixing weights. The proposed approach is simple to implement and the associated computational cost is very low. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with accuracy competitive with that of existing kernel density estimators.
Abstract:
A stochastic parameterization scheme for deep convection is described, suitable for use in both climate and NWP models. Theoretical arguments and the results of cloud-resolving models are discussed in order to motivate the form of the scheme. In the deterministic limit, it tends to a spectrum of entraining/detraining plumes and is similar to other current parameterizations. The stochastic variability describes the local fluctuations about a large-scale equilibrium state. Plumes are drawn at random from a probability distribution function (pdf) that defines the chance of finding a plume of given cloud-base mass flux within each model grid box. The normalization of the pdf is given by the ensemble-mean mass flux, which is computed with a CAPE closure method. The characteristics of each plume produced are determined using an adaptation of the plume model from the Kain-Fritsch parameterization. Initial tests in the single-column version of the Unified Model verify that the scheme is effective in producing the desired distributions of convective variability without adversely affecting the mean state.
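To make the sampling step concrete, here is a minimal sketch (not the operational scheme): plume number is taken as Poisson, individual cloud-base mass fluxes as exponentially distributed, and the ensemble-mean total flux is treated as given, standing in for the CAPE closure; all numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

def draw_plumes(mean_total_flux, mean_plume_flux, rng):
    """Draw a random plume ensemble for one grid box and time step.

    Assumptions for this sketch: plume number is Poisson with mean
    <M>/<m>, and individual cloud-base mass fluxes are exponential
    with mean <m>, so the ensemble-mean total flux equals the closure
    value <M> (imagined here as coming from a CAPE closure)."""
    n = rng.poisson(mean_total_flux / mean_plume_flux)
    return rng.exponential(mean_plume_flux, size=n)

# Hypothetical numbers: <M> = 0.05 kg m^-2 s^-1 per grid box,
# <m> = 0.01 kg m^-2 s^-1 per plume.
fluxes = [draw_plumes(0.05, 0.01, rng).sum() for _ in range(10000)]
print("mean total flux:", np.mean(fluxes))  # ~0.05, but each draw fluctuates
```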
Abstract:
A direct method is presented for determining the uncertainty in reservoir pressure, flow, and net present value (NPV) using the time-dependent, one phase, two- or three-dimensional equations of flow through a porous medium. The uncertainty in the solution is modelled as a probability distribution function and is computed from given statistical data for input parameters such as permeability. The method generates an expansion for the mean of the pressure about a deterministic solution to the system equations using a perturbation to the mean of the input parameters. Hierarchical equations that define approximations to the mean solution at each point and to the field covariance of the pressure are developed and solved numerically. The procedure is then used to find the statistics of the flow and the risked value of the field, defined by the NPV, for a given development scenario. This method involves only one (albeit complicated) solution of the equations and contrasts with the more usual Monte-Carlo approach, where many such solutions are required. The procedure can be applied easily to other physical systems modelled by linear or nonlinear partial differential equations with uncertain data.
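The contrast with Monte-Carlo can be seen in one dimension (a toy stand-in, not the porous-medium equations): a second-order expansion about the mean input yields the mean response from a single deterministic solve plus a correction term.

```python
import numpy as np

# Toy stand-in for the pressure solve: a scalar nonlinear map p = f(k).
f = lambda k: 1.0 / k       # e.g. pressure drop ~ 1/permeability
d2f = lambda k: 2.0 / k**3  # second derivative of f

mu, sigma = 2.0, 0.2        # mean and std of the uncertain input

# Perturbation estimate: one deterministic solve plus a correction,
# E[p] ~ f(mu) + 0.5 * f''(mu) * sigma^2.
mean_perturbation = f(mu) + 0.5 * d2f(mu) * sigma**2

# Monte-Carlo reference: many solves.
rng = np.random.default_rng(1)
mean_mc = f(rng.normal(mu, sigma, 200000)).mean()

print(mean_perturbation, mean_mc)  # close, at a fraction of the cost
```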
Abstract:
Large waves pose risks to ships, offshore structures, coastal infrastructure and ecosystems. This paper analyses 10 years of in-situ measurements of significant wave height (Hs) and maximum wave height (Hmax) from the ocean weather ship Polarfront in the Norwegian Sea. During the period 2000 to 2009, surface elevation was recorded every 0.59 s during sampling periods of 30 min. The Hmax observations scale linearly with Hs on average. A widely-used empirical Weibull distribution is found to estimate average values of Hmax/Hs and Hmax better than a Rayleigh distribution, but tends to underestimate both for all but the smallest waves. In this paper we propose a modified Rayleigh distribution which compensates for the heterogeneity of the observed dataset: the distribution is fitted to the whole dataset and improves the estimate of the largest waves. Over the 10-year period, the Weibull distribution approximates the observed Hs and Hmax well, and an exponential function can be used to predict the probability distribution function of the ratio Hmax/Hs. However, the Weibull distribution tends to underestimate the occurrence of extremely large values of Hs and Hmax. The persistence of Hs and Hmax in winter is also examined. Wave fields with Hs>12 m and Hmax>16 m do not last longer than 3 h. Low-to-moderate wave heights that persist for more than 12 h dominate the relationship of the wave field with the winter NAO index over 2000–2009. In contrast, the inter-annual variability of wave fields with Hs>5.5 m or Hmax>8.5 m and wave fields persisting over ~2.5 days is not associated with the winter NAO index.
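For context, the classical narrow-band Rayleigh benchmark for Hmax/Hs can be reproduced by simulation (a sketch: individual wave heights Rayleigh-distributed, Hs taken as the mean of the highest third, and the number of waves per 30-min record assumed here):

```python
import numpy as np

rng = np.random.default_rng(7)
n_waves, n_records = 180, 20000  # ~180 waves per 30-min record is an assumption

ratios = np.empty(n_records)
for i in range(n_records):
    # Narrow-band linear theory: individual wave heights are Rayleigh.
    h = rng.rayleigh(scale=1.0, size=n_waves)
    hs = np.mean(np.sort(h)[-n_waves // 3:])  # Hs: mean of the highest third
    ratios[i] = h.max() / hs

print("Rayleigh-theory mean Hmax/Hs:", ratios.mean())  # ~1.6-1.7 at this record length
```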
Abstract:
The co-polar correlation coefficient (ρhv) has many applications, including hydrometeor classification, ground clutter and melting layer identification, interpretation of ice microphysics and the retrieval of rain drop size distributions (DSDs). However, we currently lack the quantitative error estimates that are necessary if these applications are to be fully exploited. Previous error estimates of ρhv rely on knowledge of the unknown "true" ρhv and implicitly assume a Gaussian probability distribution function of ρhv samples. We show that frequency distributions of ρhv estimates are in fact highly negatively skewed. A new variable, L = -log10(1 - ρhv), is defined, which does have Gaussian error statistics and a standard deviation that depends only on the number of independent radar pulses. This is verified using observations of spherical drizzle drops, allowing, for the first time, the construction of rigorous confidence intervals in estimates of ρhv. In addition, we demonstrate how the imperfect co-location of the horizontal and vertical polarisation sample volumes may be accounted for. The possibility of using L to estimate the dispersion parameter (µ) in the gamma drop size distribution is investigated. We find that including drop oscillations is essential for this application; otherwise there could be biases in retrieved µ of up to ~8. Preliminary results in rainfall are presented. In a convective rain case study, our estimates show µ to be substantially larger than 0 (an exponential DSD). In this particular rain event, the rain rate would be overestimated by up to 50% if a simple exponential DSD were assumed.
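Both the skewness of ρhv estimates and the near-Gaussian behaviour of L are easy to check by simulation (a sketch assuming independent pulses and a single true ρhv, which real radar data only approximate):

```python
import numpy as np

rng = np.random.default_rng(3)
rho, n_pulses, n_trials = 0.99, 64, 20000

def skew(a):
    d = a - a.mean()
    return (d**3).mean() / (d**2).mean() ** 1.5

def cplx(n):
    return (rng.normal(size=n) + 1j * rng.normal(size=n)) / np.sqrt(2)

est = np.empty(n_trials)
for i in range(n_trials):
    h = cplx(n_pulses)
    v = rho * h + np.sqrt(1 - rho**2) * cplx(n_pulses)  # correlated V channel
    est[i] = abs(np.vdot(h, v)) / np.sqrt((abs(h)**2).sum() * (abs(v)**2).sum())

L = -np.log10(1 - est)
print("skewness of rho_hv samples:", skew(est))  # strongly negative
print("skewness of L samples:     ", skew(L))    # much closer to zero
```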
Abstract:
This paper explores a new technique to calculate and plot the distribution of instantaneous transmit envelope power of OFDMA and SC-FDMA signals from the equation of the Probability Density Function (PDF), solved numerically. The Complementary Cumulative Distribution Function (CCDF) of the Instantaneous Power to Average Power Ratio (IPAPR) is computed from the structure of the transmit system matrix. This gives an intuitive understanding of the distribution of output signal power when the structure of the transmit system matrix and the constellation used are known. The distribution obtained for the OFDMA signal matches a complex normal distribution. The results indicate why the CCDF of the IPAPR in the case of SC-FDMA is better than that of OFDMA for a given constellation. Finally, with this method it is shown again that the cyclic-prefixed DS-CDMA system is one case with optimum IPAPR. The insight that this technique provides may be useful in designing area-optimised digital and power-efficient analogue modules.
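The paper works from the transmit system matrix and a numerically solved PDF; a simpler Monte-Carlo sketch of the same quantity for plain OFDMA with a QPSK constellation (our simplification) reproduces the complex-normal behaviour:

```python
import numpy as np

rng = np.random.default_rng(5)
n_sub, n_symbols = 256, 20000

# QPSK symbols on each subcarrier; OFDMA transmit samples via IFFT.
bits = rng.integers(0, 4, size=(n_symbols, n_sub))
qpsk = np.exp(1j * (np.pi / 4 + np.pi / 2 * bits))
tx = np.fft.ifft(qpsk, axis=1) * np.sqrt(n_sub)  # unit average power

# Instantaneous-to-average power ratio and its CCDF.
ipapr = np.abs(tx) ** 2 / np.mean(np.abs(tx) ** 2)
for level_db in [4, 6, 8, 10]:
    p = (ipapr > 10 ** (level_db / 10)).mean()
    # Complex-normal samples would give P(IPAPR > x) = exp(-x) per sample.
    print(f"P(IPAPR > {level_db} dB) = {p:.2e}  vs exp(-x) = "
          f"{np.exp(-10 ** (level_db / 10)):.2e}")
```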
Abstract:
The continuous ranked probability score (CRPS) is a frequently used scoring rule. In contrast with many other scoring rules, the CRPS evaluates cumulative distribution functions. An ensemble of forecasts can easily be converted into a piecewise constant cumulative distribution function with steps at the ensemble members. This renders the CRPS a convenient scoring rule for the evaluation of ‘raw’ ensembles, obviating the need for sophisticated ensemble model output statistics or dressing methods prior to evaluation. In this article, a relation between the CRPS and the quantile score is established. The evaluation of ‘raw’ ensembles using the CRPS is discussed in this light. It is shown that latent in this evaluation is an interpretation of the ensemble as quantiles, but with non-uniform levels. This needs to be taken into account if the ensemble is evaluated further, for example with rank histograms.
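For a raw ensemble, the CRPS of the piecewise constant cumulative distribution function reduces to the well-known energy form, which makes the evaluation a few lines of code:

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS of the piecewise-constant CDF defined by an ensemble, via
    the energy form: CRPS = E|X - y| - 0.5 E|X - X'|, with X and X'
    drawn independently from the empirical distribution."""
    x = np.asarray(members, dtype=float)
    term1 = np.mean(np.abs(x - obs))
    term2 = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))
    return term1 - term2

print(crps_ensemble([0.8, 1.1, 1.4, 2.0, 2.3], obs=1.7))
```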
Abstract:
A generalized or tunable-kernel model is proposed for probability density function estimation based on an orthogonal forward regression procedure. Each stage of the density estimation process determines a tunable kernel, namely, its center vector and diagonal covariance matrix, by minimizing a leave-one-out test criterion. The kernel mixing weights of the constructed sparse density estimate are finally updated using the multiplicative nonnegative quadratic programming algorithm to ensure the nonnegativity and unity constraints, and this weight-updating process additionally has the desired ability to further reduce the model size. The proposed tunable-kernel model has advantages, in terms of model generalization capability and model sparsity, over the standard fixed-kernel model that restricts kernel centers to the training data points and employs a single common kernel variance for every kernel. On the other hand, it does not optimize all the model parameters together and thus avoids the problems of high-dimensional ill-conditioned nonlinear optimization associated with the conventional finite mixture model. Several examples are included to demonstrate the ability of the proposed tunable-kernel model to construct very compact and accurate density estimates.
Abstract:
A sampling oscilloscope is one of the main units in an automatic pulse measurement system (APMS). The time jitter in waveform samplers is an important error source that affects the precision of data acquisition. In this paper, this kind of error is greatly reduced by using a deconvolution method. First, the probability density function (PDF) of the time jitter distribution is determined by a statistical approach. This PDF is then used as the convolution kernel to deconvolve the acquired waveform data, with additional averaging; the result is waveform data in which the effect of time jitter has been removed, greatly improving the measurement precision of the APMS. In addition, computer simulations are given which demonstrate the success of the method.
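A minimal sketch of the idea (a regularised frequency-domain stand-in for the paper's deconvolution step): since averaging jittered acquisitions convolves the true waveform with the jitter PDF, dividing the spectra approximately undoes the blur.

```python
import numpy as np

def remove_jitter(avg_waveform, jitter_pdf, eps=1e-3):
    """Regularised frequency-domain deconvolution of the jitter PDF.

    Averaging many jittered acquisitions convolves the true waveform
    with the jitter PDF; dividing the spectra (with damping eps to
    limit noise amplification) approximately undoes that convolution."""
    W = np.fft.fft(avg_waveform)
    J = np.fft.fft(jitter_pdf, n=len(avg_waveform))
    return np.real(np.fft.ifft(W * np.conj(J) / (np.abs(J) ** 2 + eps)))

# Demonstration on a synthetic pulse with Gaussian time jitter.
t = np.linspace(-1, 1, 512, endpoint=False)
true = np.exp(-0.5 * ((t - 0.2) / 0.08) ** 2)           # clean test pulse
pdf = np.exp(-0.5 * (t / 0.05) ** 2); pdf /= pdf.sum()  # jitter PDF
pdf = np.fft.ifftshift(pdf)                             # centre at sample 0
blurred = np.real(np.fft.ifft(np.fft.fft(true) * np.fft.fft(pdf)))
restored = remove_jitter(blurred, pdf)
print("max error after deconvolution:", np.abs(restored - true).max())
```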
Abstract:
The translation of an ensemble of model runs into a probability distribution is a common task in model-based prediction. Common methods for such ensemble interpretations proceed as if the verification and the ensemble were draws from the same underlying distribution, an assumption not viable for most, if any, real-world ensembles. An alternative is to consider an ensemble as merely a source of information rather than as the possible scenarios of reality. This approach, which looks for maps between ensembles and probability distributions, is investigated and extended. Common methods are revisited, and an improvement to standard kernel dressing, called ‘affine kernel dressing’ (AKD), is introduced. AKD assumes an affine mapping between ensemble and verification, typically not acting on individual ensemble members but on the entire ensemble as a whole; the parameters of this mapping are determined in parallel with the other dressing parameters, including a weight assigned to the unconditioned (climatological) distribution. These amendments to standard kernel dressing, albeit simple, can improve performance significantly and are shown to be appropriate for both overdispersive and underdispersive ensembles, unlike standard kernel dressing, which exacerbates overdispersion. Studies are presented using operational numerical weather predictions for two locations and data from the Lorenz63 system, demonstrating both effectiveness given operational constraints and statistical significance given a large sample.
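From the description in the abstract alone, the dressed density can be sketched as follows; the parameter values are hypothetical, and in practice a, b, the kernel width and the climatological weight are fitted jointly:

```python
import numpy as np

def npdf(y, mu, sd):
    return np.exp(-0.5 * ((y - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def akd_density(y, ensemble, a, b, h, w_clim, clim_mean, clim_std):
    """Affine kernel dressing as read off the abstract: affinely map the
    whole ensemble, dress each mapped member with a Gaussian kernel of
    width h, and mix in the climatological distribution with weight w_clim."""
    z = a + b * np.asarray(ensemble, dtype=float)  # affine map of the ensemble
    kernel_part = npdf(y, z, h).mean()             # standard dressing on z
    return (1 - w_clim) * kernel_part + w_clim * npdf(y, clim_mean, clim_std)

# Hypothetical parameter values; in practice fitted jointly from past
# forecast-verification pairs.
ens = [14.2, 15.1, 15.8, 16.4]
print(akd_density(15.0, ens, a=0.3, b=0.95, h=0.8,
                  w_clim=0.1, clim_mean=15.0, clim_std=3.0))
```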
Abstract:
A method is presented to calculate economic optimum fungicide doses accounting for the risk-aversion of growers responding to variability in disease severity between crops. Simple dose-response and disease-yield loss functions are used to estimate net disease-related costs (fungicide cost plus disease-induced yield loss) as a function of dose and untreated severity. With fairly general assumptions about the shapes of the probability distribution of disease severity and the other functions involved, we show that a choice of fungicide dose which minimises net costs on average across seasons results in occasional large net costs caused by inadequate control in high-disease seasons. This may be unacceptable to a grower with limited capital. A risk-averse grower can choose to reduce the size and frequency of such losses by applying a higher dose as insurance. For example, a grower may decide to accept ‘high loss’ years one year in ten or one year in twenty (i.e. specifying a proportion of years in which disease severity and net costs will be above a specified level). Our analysis shows that taking disease severity variation and risk-aversion into account will usually increase the dose applied by an economically rational grower. The analysis is illustrated with data on septoria tritici leaf blotch of wheat caused by Mycosphaerella graminicola. Observations from untreated field plots at sites across England over three years were used to estimate the probability distribution of disease severities at mid-grain filling. In the absence of a fully reliable disease forecasting scheme, reducing the frequency of ‘high loss’ years requires substantially higher doses to be applied to all crops. Disease-resistant cultivars reduce both the optimal dose at all levels of risk and the disease-related costs at all doses.
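The economic argument can be illustrated numerically (all functional forms and numbers below are hypothetical stand-ins, not the fitted models from the paper): a grower who limits the net cost exceeded one year in ten chooses a higher dose than one who minimises the mean.

```python
import numpy as np

rng = np.random.default_rng(11)

def net_cost(dose, severity):
    """Fungicide cost plus disease-induced yield loss (hypothetical units):
    linear fungicide cost, exponential dose-response on severity."""
    return 20.0 * dose + 15.0 * severity * np.exp(-2.0 * dose)

# Between-season variability of untreated severity: a log-normal
# stand-in for the distribution fitted from field plots in the paper.
severity = rng.lognormal(mean=1.0, sigma=0.8, size=50000)

doses = np.linspace(0.0, 2.0, 201)
mean_cost = [net_cost(d, severity).mean() for d in doses]
p90_cost = [np.percentile(net_cost(d, severity), 90) for d in doses]

print("dose minimising mean net cost:    ", doses[np.argmin(mean_cost)])
print("dose minimising 1-in-10-year cost:", doses[np.argmin(p90_cost)])
```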