Digital Library

886 results for "Discrete Gaussian Sampling"

PDFOS: PDF estimation based over-sampling for imbalanced two-class problems

Abstract:

This contribution proposes a novel probability density function (PDF) estimation based over-sampling (PDFOS) approach for two-class imbalanced classification problems. The classical Parzen-window kernel function is adopted to estimate the PDF of the positive class. Then, according to the estimated PDF, synthetic instances are generated as additional training data. The essential concept is to re-balance the class distribution of the original imbalanced data set under the principle that the synthetic data samples follow the same statistical properties as the original data. Based on the over-sampled training data, the radial basis function (RBF) classifier is constructed by applying the orthogonal forward selection procedure, in which the classifier’s structure and the parameters of the RBF kernels are determined using a particle swarm optimisation algorithm based on the criterion of minimising the leave-one-out misclassification rate. The effectiveness of the proposed PDFOS approach is demonstrated by an empirical study on several imbalanced data sets.
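
The over-sampling step can be illustrated with a minimal sketch. It assumes one-dimensional features, a Gaussian Parzen window, and a normal-reference rule-of-thumb bandwidth; the function name `parzen_oversample` and the toy data are hypothetical, and none of these choices is taken from the paper. Drawing from a Gaussian kernel density estimate amounts to picking a stored positive sample at random and adding Gaussian noise with the kernel bandwidth.

```python
import random
import statistics


def parzen_oversample(minority, n_new, bandwidth=None, seed=0):
    """Draw n_new synthetic 1-D samples from a Gaussian Parzen-window estimate."""
    rng = random.Random(seed)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not the paper's choice).
        sigma = statistics.stdev(minority)
        bandwidth = 1.06 * sigma * len(minority) ** (-1 / 5)
    # Sampling from the KDE: choose a kernel centre, then perturb it.
    return [rng.choice(minority) + rng.gauss(0.0, bandwidth) for _ in range(n_new)]


positive = [2.1, 2.4, 1.9, 2.8, 2.2]        # toy minority (positive) class
negative = [0.1 * i for i in range(50)]     # toy majority class
synthetic = parzen_oversample(positive, len(negative) - len(positive))
balanced_positive = positive + synthetic    # now as large as the majority class
```

The classifier construction described in the abstract (orthogonal forward selection with particle swarm optimisation) is not covered by this sketch.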

Characterizing sampling bias in the trace gas climatologies of the SPARC Data Initiative

Abstract:

Monthly zonal mean climatologies of atmospheric measurements from satellite instruments can have biases due to the nonuniform sampling of the atmosphere by the instruments. We characterize potential sampling biases in stratospheric trace gas climatologies of the Stratospheric Processes and Their Role in Climate (SPARC) Data Initiative using chemical fields from a chemistry climate model simulation and sampling patterns from 16 satellite-borne instruments. The exercise is performed for the long-lived stratospheric trace gases O3 and H2O. Monthly sampling biases for O3 exceed 10% for many instruments in the high-latitude stratosphere and in the upper troposphere/lower stratosphere, while annual mean sampling biases reach values of up to 20% in the same regions for some instruments. Sampling biases for H2O are generally smaller than for O3, although still notable in the upper troposphere/lower stratosphere and Southern Hemisphere high latitudes. The most important mechanism leading to monthly sampling bias is nonuniform temporal sampling, i.e., the fact that for many instruments, monthly means are produced from measurements which span less than the full month in question. Similarly, annual mean sampling biases are well explained by nonuniformity in the month-to-month sampling by different instruments. Nonuniform sampling in latitude and longitude is also shown to lead to nonnegligible sampling biases, which are most relevant for climatologies that are otherwise free of biases due to nonuniform temporal sampling.

Environmental sampling of a bell barrow on Horsell Common, Woking

Observation impact in data assimilation: the effect of non-Gaussian observation error.

Abstract:

Data assimilation methods which avoid the assumption of Gaussian error statistics are being developed for geoscience applications. We investigate how the relaxation of the Gaussian assumption affects the impact observations have within the assimilation process. The effect of non-Gaussian observation error (described by the likelihood) is compared to previously published work studying the effect of a non-Gaussian prior. The observation impact is measured in three ways: the sensitivity of the analysis to the observations, the mutual information, and the relative entropy. These three measures have all been studied in the case of Gaussian data assimilation and, in this case, have a known analytical form. It is shown that the analysis sensitivity can also be derived analytically when at least one of the prior or likelihood is Gaussian. This derivation shows an interesting asymmetry in the relationship between analysis sensitivity and analysis error covariance when the two different sources of non-Gaussian structure are considered (likelihood vs. prior). This is illustrated for a simple scalar case and used to infer the effect of the non-Gaussian structure on mutual information and relative entropy, which are more natural choices of metric in non-Gaussian data assimilation. It is concluded that approximating non-Gaussian error distributions as Gaussian can give significantly erroneous estimates of observation impact. The degree of the error depends not only on the nature of the non-Gaussian structure, but also on the metric used to measure the observation impact and the source of the non-Gaussian structure.
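
For reference, the fully Gaussian baseline mentioned above can be written out for the simplest setting. The sketch below assumes a scalar state that is observed directly (observation operator equal to 1); the function and variable names are hypothetical. It is only the Gaussian reference case, not the non-Gaussian analysis of the paper.

```python
import math


def scalar_gaussian_impact(x_b, var_b, y, var_o):
    """Observation impact for a scalar Gaussian prior and Gaussian likelihood."""
    sensitivity = var_b / (var_b + var_o)      # d(x_a)/d(y), i.e. the Kalman gain
    x_a = x_b + sensitivity * (y - x_b)        # analysis (posterior mean)
    var_a = var_b * var_o / (var_b + var_o)    # analysis error variance
    mutual_information = 0.5 * math.log(var_b / var_a)
    # Relative entropy KL(posterior || prior) between two Gaussians.
    relative_entropy = 0.5 * (
        math.log(var_b / var_a) + var_a / var_b - 1.0 + (x_a - x_b) ** 2 / var_b
    )
    return sensitivity, var_a, mutual_information, relative_entropy


s, var_a, mi, rel_ent = scalar_gaussian_impact(x_b=0.0, var_b=1.0, y=1.0, var_o=1.0)
```

In this Gaussian case the sensitivity equals `var_a / var_o`, so it is tied directly to the analysis error variance. Unlike the other two measures, the relative entropy also depends on the observed value `y`.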

Gaussian anamorphosis in the analysis step of the EnKF: a joint state-variable/observation approach

Abstract:

The analysis step of the (ensemble) Kalman filter is optimal when (1) the distribution of the background is Gaussian, (2) state variables and observations are related via a linear operator, and (3) the observational error is of additive nature and has Gaussian distribution. When these conditions are largely violated, a pre-processing step known as Gaussian anamorphosis (GA) can be applied. The objective of this procedure is to obtain state variables and observations that better fulfil the Gaussianity conditions in some sense. In this work we analyse GA from a joint perspective, paying attention to the effects of transformations in the joint state variable/observation space. First, we study transformations for state variables and observations that are independent of each other. Then, we introduce a targeted joint transformation with the objective to obtain joint Gaussianity in the transformed space. We focus primarily on the univariate case, and briefly comment on the multivariate one. A key point of this paper is that, when (1)-(3) are violated, using the analysis step of the EnKF will not recover the exact posterior density in spite of any transformations one may perform. These transformations, however, provide approximations of different quality to the Bayesian solution of the problem. Using an example in which the Bayesian posterior can be analytically computed, we assess the quality of the analysis distributions generated after applying the EnKF analysis step in conjunction with different GA options. The value of the targeted joint transformation is particularly clear for the case when the prior is Gaussian, the marginal density for the observations is close to Gaussian, and the likelihood is a Gaussian mixture.
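
One common way to build a univariate anamorphosis is to map each ensemble member through its empirical rank and then through the inverse standard-normal CDF. The sketch below assumes that rank-based construction; it is not necessarily the transformation used in the paper, and the function name `gaussian_anamorphosis` and the lognormal toy ensemble are hypothetical.

```python
import random
from statistics import NormalDist


def gaussian_anamorphosis(values):
    """Map values to standard-normal scores through their empirical ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    transformed = [0.0] * n
    for rank, idx in enumerate(order):
        p = (rank + 0.5) / n                  # plotting position inside (0, 1)
        transformed[idx] = std_normal.inv_cdf(p)
    return transformed


rng = random.Random(1)
ensemble = [rng.lognormvariate(0.0, 1.0) for _ in range(200)]   # skewed toy ensemble
scores = gaussian_anamorphosis(ensemble)
```

The transform is monotone, so the ordering of ensemble members is preserved. It acts on one variable at a time and therefore does not by itself produce the joint Gaussianity that the targeted joint transformation aims for.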

Multiple, discrete arcs on sunward convecting field lines in the 14-15 MLT region

Abstract:

Ionospheric plasma flow measurements and simultaneous observations of thin (∼0.2° invariant latitude (ILAT)), multiple, longitudinally extended auroral arcs of transient nature within 74°-76° ILAT and 1030-1130 UT (∼14-15 MLT) on January 12, 1989, are reported. The auroral structures appeared within the luminous belt of strong 630.0-nm emissions located predominantly on sunward convecting field lines equatorward of the convection reversal boundary as identified by the European Incoherent Scatter UHF radar. The events occurred during a period of several hours of quasi-steady solar wind speed (∼700 km s⁻¹) and a radially orientated interplanetary magnetic field (IMF) with a weak northward tilt (IMF Bz>0). These typical dayside auroral features are related to previous studies of auroral activity related to the upward region 1 current in the postnoon sector. The discrete auroral events presented here may result from magnetosheath plasma injections into the low-latitude boundary layer (LLBL) and an associated dynamo mechanism. An alternative explanation invokes kinetic Alfvén waves, triggered either by Kelvin-Helmholtz instability at the inner (or outer) edge of the LLBL or by pressure-pulse-induced magnetopause surface waves.

Estimation of Gaussian process regression model using probability distance measures

Abstract:

A new class of parameter estimation algorithms is introduced for Gaussian process regression (GPR) models. It is shown that the integration of the GPR model with the probability distance measures of (i) the integrated square error and (ii) the Kullback–Leibler (K–L) divergence is analytically tractable. An efficient coordinate descent algorithm is proposed to iteratively estimate the kernel width using golden section search, which includes a fast gradient descent algorithm as an inner loop to estimate the noise variance. Numerical examples are included to demonstrate the effectiveness of the new identification approaches.
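
The outer kernel-width search can be illustrated with a generic golden section search. The sketch below assumes a unimodal one-dimensional objective; `toy_criterion` is a hypothetical stand-in for the probability distance criterion, and the inner gradient descent loop for the noise variance is not reproduced.

```python
import math


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimise a unimodal function f on [lo, hi] by golden section search."""
    inv_phi = (math.sqrt(5) - 1) / 2          # 1 / golden ratio, about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


def toy_criterion(width):
    """Hypothetical objective with its minimum at kernel width 2.0."""
    return (math.log(width) - math.log(2.0)) ** 2


best_width = golden_section_minimize(toy_criterion, 0.1, 10.0)
```

Each iteration reuses one of the two previous function values, so only one new evaluation of the criterion is needed per step. This matters when every evaluation hides an inner optimisation.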

Gaussian processes autoencoder for dimensionality reduction

Abstract:

Learning a low-dimensional manifold from highly nonlinear, high-dimensional data has become increasingly important for discovering intrinsic representations that can be utilized for data visualization and preprocessing. The autoencoder is a powerful dimensionality reduction technique based on minimizing reconstruction error, and it has regained popularity because it has been efficiently used for greedy pretraining of deep neural networks. Compared to neural networks (NNs), Gaussian processes (GPs) have been shown to be superior in model inference, optimization and performance. GPs have been successfully applied in nonlinear dimensionality reduction (DR) algorithms, such as the Gaussian Process Latent Variable Model (GPLVM). In this paper we propose the Gaussian Processes Autoencoder Model (GPAM) for dimensionality reduction by extending the classic NN-based autoencoder to a GP-based autoencoder. More interestingly, the novel model can also be viewed as a back-constrained GPLVM (BC-GPLVM) in which the back-constraint smooth function is represented by a GP. Experiments verify the performance of the newly proposed model.

On-line Gaussian mixture density estimator for adaptive minimum bit-error-rate beamforming receivers

Abstract:

We develop an on-line Gaussian mixture density estimator (OGMDE) in the complex-valued domain to facilitate an adaptive minimum bit-error-rate (MBER) beamforming receiver for multiple-antenna-based space-division multiple access systems. Specifically, the novel OGMDE is proposed to adaptively model the probability density function of the beamformer’s output by tracking the incoming data sample by sample. With the aid of the proposed OGMDE, our adaptive beamformer is capable of updating the beamformer’s weights sample by sample to directly minimize the achievable bit error rate (BER). We show that this OGMDE-based MBER beamformer outperforms the existing on-line MBER beamformer, known as the least BER beamformer, in terms of both the convergence speed and the achievable BER.

Two fast radiative transfer methods to improve the temporal sampling of clouds in numerical weather prediction and climate models

Abstract:

The high computational cost of calculating the radiative heating rates in numerical weather prediction (NWP) and climate models requires that calculations are made infrequently, leading to poor sampling of the fast-changing cloud field and a poor representation of the feedback that would occur. This paper presents two related schemes for improving the temporal sampling of the cloud field. Firstly, the ‘split time-stepping’ scheme takes advantage of the independent nature of the monochromatic calculations of the ‘correlated-k’ method to split the calculation into gaseous absorption terms that are highly dependent on changes in cloud (the optically thin terms) and those that are not (optically thick). The small number of optically thin terms can then be calculated more often to capture changes in the grey absorption and scattering associated with cloud droplets and ice crystals. Secondly, the ‘incremental time-stepping’ scheme uses a simple radiative transfer calculation using only one or two monochromatic calculations representing the optically thin part of the atmospheric spectrum. These are found to be sufficient to represent the heating rate increments caused by changes in the cloud field, which can then be added to the last full calculation of the radiation code. We test these schemes in an operational forecast model configuration and find a significant improvement is achieved, for a small computational cost, over the current scheme employed at the Met Office. The ‘incremental time-stepping’ scheme is recommended for operational use, along with a new scheme to correct the surface fluxes for the change in solar zenith angle between radiation calculations.

Using high-frequency water quality data to assess sampling strategies for the EU Water Framework Directive

Abstract:

The EU Water Framework Directive (WFD) requires that the ecological and chemical status of water bodies in Europe should be assessed, and action taken where possible to ensure that at least "good" quality is attained in each case by 2015. This paper is concerned with the accuracy and precision with which chemical status in rivers can be measured given certain sampling strategies, and how this can be improved. High-frequency (hourly) chemical data from four rivers in southern England were subsampled to simulate different sampling strategies for four parameters used for WFD classification: dissolved phosphorus, dissolved oxygen, pH and water temperature. These data sub-sets were then used to calculate the WFD classification for each site. Monthly sampling was less precise than weekly sampling, but the effect on WFD classification depended on the closeness of the range of concentrations to the class boundaries. In some cases, monthly sampling for a year could result in the same water body being assigned to three or four of the WFD classes with 95% confidence, due to random sampling effects, whereas with weekly sampling this was one or two classes for the same cases. In the most extreme case, the same water body could have been assigned to any of the five WFD quality classes. Weekly sampling considerably reduces the uncertainties compared to monthly sampling. The width of the weekly sampled confidence intervals was about 33% that of the monthly for P species and pH, about 50% for dissolved oxygen, and about 67% for water temperature. For water temperature, which is assessed as the 98th percentile in the UK, monthly sampling biases the mean downwards by about 1 °C compared to the true value, due to problems of assessing high percentiles with limited data. Low-frequency measurements will generally be unsuitable for assessing standards expressed as high percentiles. 
Confining sampling to the working week compared to all 7 days made little difference, but a modest improvement in precision could be obtained by sampling at the same time of day within a 3 h time window, and this is recommended. For parameters with a strong diel variation, such as dissolved oxygen, the value obtained, and thus possibly the WFD classification, can depend markedly on when in the cycle the sample was taken. Specifying this in the sampling regime would be a straightforward way to improve precision, but there needs to be agreement about how best to characterise risk in different types of river. These results suggest that in some cases it will be difficult to assign accurate WFD chemical classes or to detect likely trends using current sampling regimes, even for these largely groundwater-fed rivers. A more critical approach to sampling is needed to ensure that management actions are appropriate and supported by data.
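
The subsampling experiment can be mimicked on synthetic data. The sketch below assumes an artificial hourly series with a seasonal cycle, a diel cycle and Gaussian noise, and it compares the spread of annual-mean estimates under one random sample per month-like block versus one per week. All names and numbers are hypothetical, and the result says nothing about the actual rivers or WFD class boundaries.

```python
import math
import random
import statistics

HOURS = 364 * 24  # 52 full weeks of hourly data


def make_hourly_series(rng):
    """Synthetic hourly record: seasonal cycle + diel cycle + noise."""
    series = []
    for h in range(HOURS):
        seasonal = 3.0 * math.sin(2 * math.pi * h / HOURS)
        diel = 1.0 * math.sin(2 * math.pi * (h % 24) / 24)
        series.append(10.0 + seasonal + diel + rng.gauss(0.0, 1.0))
    return series


def stratified_mean(series, n_samples, rng):
    """Mean of one randomly timed sample from each of n_samples equal blocks."""
    block = len(series) // n_samples
    picks = [series[i * block + rng.randrange(block)] for i in range(n_samples)]
    return statistics.mean(picks)


def spread_of_estimates(series, n_samples, rng, trials=500):
    """Standard deviation of the estimated annual mean over repeated sampling."""
    estimates = [stratified_mean(series, n_samples, rng) for _ in range(trials)]
    return statistics.stdev(estimates)


rng = random.Random(42)
hourly = make_hourly_series(rng)
monthly_spread = spread_of_estimates(hourly, 12, rng)   # 12 samples per year
weekly_spread = spread_of_estimates(hourly, 52, rng)    # 52 samples per year
```

Comparing `weekly_spread` with `monthly_spread` gives a rough feel for how much precision the denser regime buys on this toy series. Restricting `rng.randrange(block)` to a fixed time-of-day window would be one way to explore the effect of the diel cycle.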

Discontinuous Galerkin methods for the p-biharmonic equation from a discrete variational perspective

Abstract:

We study discontinuous Galerkin approximations of the p-biharmonic equation for p∈(1,∞) from a variational perspective. We propose a discrete variational formulation of the problem based on an appropriate definition of a finite element Hessian and study convergence of the method (without rates) using a semicontinuity argument. We also present numerical experiments aimed at testing the robustness of the method.
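
For orientation, the standard continuous variational setting behind the p-biharmonic equation can be stated as follows. The symbols are assumptions introduced here: a domain Ω, a source term f, and boundary conditions left unspecified. The discrete formulation and finite element Hessian of the paper are not reproduced.

```latex
J(u) = \int_\Omega \frac{1}{p}\,|\Delta u|^{p} \,\mathrm{d}x - \int_\Omega f\,u \,\mathrm{d}x,
\qquad
\Delta\!\left(|\Delta u|^{p-2}\,\Delta u\right) = f \quad \text{in } \Omega .
```

The equation on the right is the Euler-Lagrange equation of the energy J on the left, and for p = 2 it reduces to the ordinary biharmonic equation.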

A moment problem for random discrete measures

Abstract:

Let X be a locally compact Polish space. A random measure on X is a probability measure on the space of all (nonnegative) Radon measures on X. Denote by K(X) the cone of all Radon measures η on X which are of the form η =

Effect of production system, geographic location and sampling date on milk quality parameters

Molecular design of a discrete chain-folding polyimide for controlled inkjet deposition of supramolecular polymers

Abstract:

A supramolecular polymer based upon two complementary polymer components is formed by sequential deposition from solution in THF, using a piezoelectric drop-on-demand inkjet printer. Highly efficient cycloaddition or ‘click’ chemistry afforded a well-defined poly(ethylene glycol) featuring chain-folding diimide end groups, which possesses greatly enhanced solubility in THF relative to earlier materials featuring random diimide sequences. Blending the new polyimide with a complementary poly(ethylene glycol) system bearing pyrene end groups (which bind to the chain-folding diimide units) overcomes the limited solubility encountered previously with chain-folding polyimides in inkjet printing applications. The solution state properties of the resulting polymer blend were assessed via viscometry to confirm the presence of a supramolecular polymer before depositing the two electronically complementary polymers by inkjet printing techniques. The novel materials so produced offer an insight into ways of controlling the properties of printed materials through tuning the structure of the polymer at the (supra)molecular level.
