102 resultados para Bayesian mixture model


Relevância:

80.00% 80.00%

Publicador:

Resumo:

We consider whether survey respondents’ probability distributions, reported as histograms, provide reliable and coherent point predictions, when viewed through the lens of a Bayesian learning model. We argue that a role remains for eliciting directly-reported point predictions in surveys of professional forecasters.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We propose a new class of neurofuzzy construction algorithms with the aim of maximizing generalization capability specifically for imbalanced data classification problems based on leave-one-out (LOO) cross validation. The algorithms are in two stages, first an initial rule base is constructed based on estimating the Gaussian mixture model with analysis of variance decomposition from input data; the second stage carries out the joint weighted least squares parameter estimation and rule selection using orthogonal forward subspace selection (OFSS)procedure. We show how different LOO based rule selection criteria can be incorporated with OFSS, and advocate either maximizing the leave-one-out area under curve of the receiver operating characteristics, or maximizing the leave-one-out Fmeasure if the data sets exhibit imbalanced class distribution. Extensive comparative simulations illustrate the effectiveness of the proposed algorithms.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A new sparse kernel density estimator is introduced based on the minimum integrated square error criterion for the finite mixture model. Since the constraint on the mixing coefficients of the finite mixture model is on the multinomial manifold, we use the well-known Riemannian trust-region (RTR) algorithm for solving this problem. The first- and second-order Riemannian geometry of the multinomial manifold are derived and utilized in the RTR algorithm. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with an accuracy competitive with those of existing kernel density estimators.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Clustering methods are increasingly being applied to residential smart meter data, providing a number of important opportunities for distribution network operators (DNOs) to manage and plan the low voltage networks. Clustering has a number of potential advantages for DNOs including, identifying suitable candidates for demand response and improving energy profile modelling. However, due to the high stochasticity and irregularity of household level demand, detailed analytics are required to define appropriate attributes to cluster. In this paper we present in-depth analysis of customer smart meter data to better understand peak demand and major sources of variability in their behaviour. We find four key time periods in which the data should be analysed and use this to form relevant attributes for our clustering. We present a finite mixture model based clustering where we discover 10 distinct behaviour groups describing customers based on their demand and their variability. Finally, using an existing bootstrapping technique we show that the clustering is reliable. To the authors knowledge this is the first time in the power systems literature that the sample robustness of the clustering has been tested.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Background Underweight and severe and morbid obesity are associated with highly elevated risks of adverse health outcomes. We estimated trends in mean body-mass index (BMI), which characterises its population distribution, and in the prevalences of a complete set of BMI categories for adults in all countries. Methods We analysed, with use of a consistent protocol, population-based studies that had measured height and weight in adults aged 18 years and older. We applied a Bayesian hierarchical model to these data to estimate trends from 1975 to 2014 in mean BMI and in the prevalences of BMI categories (<18·5 kg/m2 [underweight], 18·5 kg/m2 to <20 kg/m2, 20 kg/m2 to <25 kg/m2, 25 kg/m2 to <30 kg/m2, 30 kg/m2 to <35 kg/m2, 35 kg/m2 to <40 kg/m2, ≥40 kg/m2 [morbid obesity]), by sex in 200 countries and territories, organised in 21 regions. We calculated the posterior probability of meeting the target of halting by 2025 the rise in obesity at its 2010 levels, if post-2000 trends continue. Findings We used 1698 population-based data sources, with more than 19·2 million adult participants (9·9 million men and 9·3 million women) in 186 of 200 countries for which estimates were made. Global age-standardised mean BMI increased from 21·7 kg/m2 (95% credible interval 21·3–22·1) in 1975 to 24·2 kg/m2 (24·0–24·4) in 2014 in men, and from 22·1 kg/m2 (21·7–22·5) in 1975 to 24·4 kg/m2 (24·2–24·6) in 2014 in women. Regional mean BMIs in 2014 for men ranged from 21·4 kg/m2 in central Africa and south Asia to 29·2 kg/m2 (28·6–29·8) in Polynesia and Micronesia; for women the range was from 21·8 kg/m2 (21·4–22·3) in south Asia to 32·2 kg/m2 (31·5–32·8) in Polynesia and Micronesia. Over these four decades, age-standardised global prevalence of underweight decreased from 13·8% (10·5–17·4) to 8·8% (7·4–10·3) in men and from 14·6% (11·6–17·9) to 9·7% (8·3–11·1) in women. South Asia had the highest prevalence of underweight in 2014, 23·4% (17·8–29·2) in men and 24·0% (18·9–29·3) in women. Age-standardised prevalence of obesity increased from 3·2% (2·4–4·1) in 1975 to 10·8% (9·7–12·0) in 2014 in men, and from 6·4% (5·1–7·8) to 14·9% (13·6–16·1) in women. 2·3% (2·0–2·7) of the world's men and 5·0% (4·4–5·6) of women were severely obese (ie, have BMI ≥35 kg/m2). Globally, prevalence of morbid obesity was 0·64% (0·46–0·86) in men and 1·6% (1·3–1·9) in women. Interpretation If post-2000 trends continue, the probability of meeting the global obesity target is virtually zero. Rather, if these trends continue, by 2025, global obesity prevalence will reach 18% in men and surpass 21% in women; severe obesity will surpass 6% in men and 9% in women. Nonetheless, underweight remains prevalent in the world's poorest regions, especially in south Asia.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A new sparse kernel density estimator is introduced based on the minimum integrated square error criterion combining local component analysis for the finite mixture model. We start with a Parzen window estimator which has the Gaussian kernels with a common covariance matrix, the local component analysis is initially applied to find the covariance matrix using expectation maximization algorithm. Since the constraint on the mixing coefficients of a finite mixture model is on the multinomial manifold, we then use the well-known Riemannian trust-region algorithm to find the set of sparse mixing coefficients. The first and second order Riemannian geometry of the multinomial manifold are utilized in the Riemannian trust-region algorithm. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with competitive accuracy to existing kernel density estimators.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A single habit parameterization for the shortwave optical properties of cirrus is presented. The parameterization utilizes a hollow particle geometry, with stepped internal cavities as identified in laboratory and field studies. This particular habit was chosen as both experimental and theoretical results show that the particle exhibits lower asymmetry parameters when compared to solid crystals of the same aspect ratio. The aspect ratio of the particle was varied as a function of maximum dimension, D, in order to adhere to the same physical relationships assumed in the microphysical scheme in a configuration of the Met Office atmosphere-only global model, concerning particle mass, size and effective density. Single scattering properties were then computed using T-Matrix, Ray Tracing with Diffraction on Facets (RTDF) and Ray Tracing (RT) for small, medium, and large size parameters respectively. The scattering properties were integrated over 28 particle size distributions as used in the microphysical scheme. The fits were then parameterized as simple functions of Ice Water Content (IWC) for 6 shortwave bands. The parameterization was implemented into the GA6 configuration of the Met Office Unified Model along with the current operational long-wave parameterization. The GA6 configuration is used to simulate the annual twenty-year short-wave (SW) fluxes at top-of-atmosphere (TOA) and also the temperature and humidity structure of the atmosphere. The parameterization presented here is compared against the current operational model and a more recent habit mixture model.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

In survival analysis frailty is often used to model heterogeneity between individuals or correlation within clusters. Typically frailty is taken to be a continuous random effect, yielding a continuous mixture distribution for survival times. A Bayesian analysis of a correlated frailty model is discussed in the context of inverse Gaussian frailty. An MCMC approach is adopted and the deviance information criterion is used to compare models. As an illustration of the approach a bivariate data set of corneal graft survival times is analysed. (C) 2006 Elsevier B.V. All rights reserved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This study addresses three issues: spatial downscaling, calibration, and combination of seasonal predictions produced by different coupled ocean-atmosphere climate models. It examines the feasibility Of using a Bayesian procedure for producing combined, well-calibrated downscaled seasonal rainfall forecasts for two regions in South America and river flow forecasts for the Parana river in the south of Brazil and the Tocantins river in the north of Brazil. These forecasts are important for national electricity generation management and planning. A Bayesian procedure, referred to here as forecast assimilation, is used to combine and calibrate the rainfall predictions produced by three climate models. Forecast assimilation is able to improve the skill of 3-month lead November-December-January multi-model rainfall predictions over the two South American regions. Improvements are noted in forecast seasonal mean values and uncertainty estimates. River flow forecasts are less skilful than rainfall forecasts. This is partially because natural river flow is a derived quantity that is sensitive to hydrological as well as meteorological processes, and to human intervention in the form of reservoir management.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper, the mixed logit (ML) using Bayesian methods was employed to examine willingness-to-pay (WTP) to consume bread produced with reduced levels of pesticides so as to ameliorate environmental quality, from data generated by a choice experiment. Model comparison used the marginal likelihood, which is preferable for Bayesian model comparison and testing. Models containing constant and random parameters for a number of distributions were considered, along with models in ‘preference space’ and ‘WTP space’ as well as those allowing for misreporting. We found: strong support for the ML estimated in WTP space; little support for fixing the price coefficient a common practice advocated and adopted in the environmental economics literature; and, weak evidence for misreporting.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A Bayesian method of classifying observations that are assumed to come from a number of distinct subpopulations is outlined. The method is illustrated with simulated data and applied to the classification of farms according to their level and variability of income. The resultant classification shows a greater diversity of technical charactersitics within farm types than is conventionally the case. The range of mean farm income between groups in the new classification is wider than that of the conventional method and the variability of income within groups is narrower. Results show that the highest income group in 2000 included large specialist dairy farmers and pig and poultry producers, whilst in 2001 it included large and small specialist dairy farms and large mixed dairy and arable farms. In both years the lowest income group is dominated by non-milk producing livestock farms.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Genetic polymorphisms in deoxyribonucleic acid coding regions may have a phenotypic effect on the carrier, e.g. by influencing susceptibility to disease. Detection of deleterious mutations via association studies is hampered by the large number of candidate sites; therefore methods are needed to narrow down the search to the most promising sites. For this, a possible approach is to use structural and sequence-based information of the encoded protein to predict whether a mutation at a particular site is likely to disrupt the functionality of the protein itself. We propose a hierarchical Bayesian multivariate adaptive regression spline (BMARS) model for supervised learning in this context and assess its predictive performance by using data from mutagenesis experiments on lac repressor and lysozyme proteins. In these experiments, about 12 amino-acid substitutions were performed at each native amino-acid position and the effect on protein functionality was assessed. The training data thus consist of repeated observations at each position, which the hierarchical framework is needed to account for. The model is trained on the lac repressor data and tested on the lysozyme mutations and vice versa. In particular, we show that the hierarchical BMARS model, by allowing for the clustered nature of the data, yields lower out-of-sample misclassification rates compared with both a BMARS and a frequen-tist MARS model, a support vector machine classifier and an optimally pruned classification tree.