910 resultados para Bayesian maximum entropy
Resumo:
The main purpose of this study is to present an alternative benchmarking approach that can be used by national regulators of utilities. It is widely known that the lack of sizeable data sets limits the choice of the benchmarking method and the specification of the model to set price controls within incentive-based regulation. Ill-posed frontier models are the problem that some national regulators have been facing. Maximum entropy estimators are useful in the estimation of such ill-posed models, in particular in models exhibiting small sample sizes, collinearity and non-normal errors, as well as in models where the number of parameters to be estimated exceeds the number of observations available. The empirical study involves a sample data used by the Portuguese regulator of the electricity sector to set the parameters for the electricity distribution companies in the regulatory period of 2012-2014. DEA and maximum entropy methods are applied and the efficiency results are compared.
Resumo:
The current climate crisis requires a comprehensive understanding of biodiversity to acknowledge how ecosystems’ responses to anthropogenic disturbances may result in feedback that can either mitigate or exacerbate global warming. Although ecosystems are dynamic and macroecological patterns change drastically in response to disturbance, dynamic macroecology has received insufficient attention and theoretical formalisation. In this context, the maximum entropy principle (MaxEnt) could provide an effective inference procedure to study ecosystems. Since the improper usage of entropy outside its scope often leads to misconceptions, the opening chapter will clarify its meaning by following its evolution from classical thermodynamics to information theory. The second chapter introduces the study of ecosystems from a physicist’s viewpoint. In particular, the MaxEnt Theory of Ecology (METE) will be the cornerstone of the discussion. METE predicts the shapes of macroecological metrics in relatively static ecosystems using constraints imposed by static state variables. However, in disturbed ecosystems with macroscale state variables that change rapidly over time, its predictions tend to fail. In the final chapter, DynaMETE is therefore presented as an extension of METE from static to dynamic. By predicting how macroecological patterns are likely to change in response to perturbations, DynaMETE can contribute to a better understanding of disturbed ecosystems’ fate and the improvement of conservation and management of carbon sinks, like forests. Targeted strategies in ecosystem management are now indispensable to enhance the interdependence of human well-being and the health of ecosystems, thus avoiding climate change tipping points.
Resumo:
The vast territories that have been radioactively contaminated during the 1986 Chernobyl accident provide a substantial data set of radioactive monitoring data, which can be used for the verification and testing of the different spatial estimation (prediction) methods involved in risk assessment studies. Using the Chernobyl data set for such a purpose is motivated by its heterogeneous spatial structure (the data are characterized by large-scale correlations, short-scale variability, spotty features, etc.). The present work is concerned with the application of the Bayesian Maximum Entropy (BME) method to estimate the extent and the magnitude of the radioactive soil contamination by 137Cs due to the Chernobyl fallout. The powerful BME method allows rigorous incorporation of a wide variety of knowledge bases into the spatial estimation procedure leading to informative contamination maps. Exact measurements (?hard? data) are combined with secondary information on local uncertainties (treated as ?soft? data) to generate science-based uncertainty assessment of soil contamination estimates at unsampled locations. BME describes uncertainty in terms of the posterior probability distributions generated across space, whereas no assumption about the underlying distribution is made and non-linear estimators are automatically incorporated. Traditional estimation variances based on the assumption of an underlying Gaussian distribution (analogous, e.g., to the kriging variance) can be derived as a special case of the BME uncertainty analysis. The BME estimates obtained using hard and soft data are compared with the BME estimates obtained using only hard data. The comparison involves both the accuracy of the estimation maps using the exact data and the assessment of the associated uncertainty using repeated measurements. Furthermore, a comparison of the spatial estimation accuracy obtained by the two methods was carried out using a validation data set of hard data. Finally, a separate uncertainty analysis was conducted that evaluated the ability of the posterior probabilities to reproduce the distribution of the raw repeated measurements available in certain populated sites. The analysis provides an illustration of the improvement in mapping accuracy obtained by adding soft data to the existing hard data and, in general, demonstrates that the BME method performs well both in terms of estimation accuracy as well as in terms estimation error assessment, which are both useful features for the Chernobyl fallout study.
Resumo:
The quality of environmental data analysis and propagation of errors are heavily affected by the representativity of the initial sampling design [CRE 93, DEU 97, KAN 04a, LEN 06, MUL07]. Geostatistical methods such as kriging are related to field samples, whose spatial distribution is crucial for the correct detection of the phenomena. Literature about the design of environmental monitoring networks (MN) is widespread and several interesting books have recently been published [GRU 06, LEN 06, MUL 07] in order to clarify the basic principles of spatial sampling design (monitoring networks optimization) based on Support Vector Machines was proposed. Nonetheless, modelers often receive real data coming from environmental monitoring networks that suffer from problems of non-homogenity (clustering). Clustering can be related to the preferential sampling or to the impossibility of reaching certain regions.
Resumo:
Pós-graduação em Geociências e Meio Ambiente - IGCE
Resumo:
2010 Mathematics Subject Classification: 94A17.
Resumo:
We present a novel approach to the inference of spectral functions from Euclidean time correlator data that makes close contact with modern Bayesian concepts. Our method differs significantly from the maximum entropy method (MEM). A new set of axioms is postulated for the prior probability, leading to an improved expression, which is devoid of the asymptotically flat directions present in the Shanon-Jaynes entropy. Hyperparameters are integrated out explicitly, liberating us from the Gaussian approximations underlying the evidence approach of the maximum entropy method. We present a realistic test of our method in the context of the nonperturbative extraction of the heavy quark potential. Based on hard-thermal-loop correlator mock data, we establish firm requirements in the number of data points and their accuracy for a successful extraction of the potential from lattice QCD. Finally we reinvestigate quenched lattice QCD correlators from a previous study and provide an improved potential estimation at T2.33TC.
Resumo:
We present a novel approach for the reconstruction of spectra from Euclidean correlator data that makes close contact to modern Bayesian concepts. It is based upon an axiomatically justified dimensionless prior distribution, which in the case of constant prior function m(ω) only imprints smoothness on the reconstructed spectrum. In addition we are able to analytically integrate out the only relevant overall hyper-parameter α in the prior, removing the necessity for Gaussian approximations found e.g. in the Maximum Entropy Method. Using a quasi-Newton minimizer and high-precision arithmetic, we are then able to find the unique global extremum of P[ρ|D] in the full Nω » Nτ dimensional search space. The method actually yields gradually improving reconstruction results if the quality of the supplied input data increases, without introducing artificial peak structures, often encountered in the MEM. To support these statements we present mock data analyses for the case of zero width delta peaks and more realistic scenarios, based on the perturbative Euclidean Wilson Loop as well as the Wilson Line correlator in Coulomb gauge.
Resumo:
Two probabilistic interpretations of the n-tuple recognition method are put forward in order to allow this technique to be analysed with the same Bayesian methods used in connection with other neural network models. Elementary demonstrations are then given of the use of maximum likelihood and maximum entropy methods for tuning the model parameters and assisting their interpretation. One of the models can be used to illustrate the significance of overlapping n-tuple samples with respect to correlations in the patterns.
Resumo:
Sentiment analysis has long focused on binary classification of text as either positive or negative. There has been few work on mapping sentiments or emotions into multiple dimensions. This paper studies a Bayesian modeling approach to multi-class sentiment classification and multidimensional sentiment distributions prediction. It proposes effective mechanisms to incorporate supervised information such as labeled feature constraints and document-level sentiment distributions derived from the training data into model learning. We have evaluated our approach on the datasets collected from the confession section of the Experience Project website where people share their life experiences and personal stories. Our results show that using the latent representation of the training documents derived from our approach as features to build a maximum entropy classifier outperforms other approaches on multi-class sentiment classification. In the more difficult task of multi-dimensional sentiment distributions prediction, our approach gives superior performance compared to a few competitive baselines. © 2012 ACM.
Resumo:
An active learning method is proposed for the semi-automatic selection of training sets in remote sensing image classification. The method adds iteratively to the current training set the unlabeled pixels for which the prediction of an ensemble of classifiers based on bagged training sets show maximum entropy. This way, the algorithm selects the pixels that are the most uncertain and that will improve the model if added in the training set. The user is asked to label such pixels at each iteration. Experiments using support vector machines (SVM) on an 8 classes QuickBird image show the excellent performances of the methods, that equals accuracies of both a model trained with ten times more pixels and a model whose training set has been built using a state-of-the-art SVM specific active learning method
Resumo:
Background: The imatinib trough plasma concentration (C(min)) correlates with clinical response in cancer patients. Therapeutic drug monitoring (TDM) of plasma C(min) is therefore suggested. In practice, however, blood sampling for TDM is often not performed at trough. The corresponding measurement is thus only remotely informative about C(min) exposure. Objectives: The objectives of this study were to improve the interpretation of randomly measured concentrations by using a Bayesian approach for the prediction of C(min), incorporating correlation between pharmacokinetic parameters, and to compare the predictive performance of this method with alternative approaches, by comparing predictions with actual measured trough levels, and with predictions obtained by a reference method, respectively. Methods: A Bayesian maximum a posteriori (MAP) estimation method accounting for correlation (MAP-ρ) between pharmacokinetic parameters was developed on the basis of a population pharmacokinetic model, which was validated on external data. Thirty-one paired random and trough levels, observed in gastrointestinal stromal tumour patients, were then used for the evaluation of the Bayesian MAP-ρ method: individual C(min) predictions, derived from single random observations, were compared with actual measured trough levels for assessment of predictive performance (accuracy and precision). The method was also compared with alternative approaches: classical Bayesian MAP estimation assuming uncorrelated pharmacokinetic parameters, linear extrapolation along the typical elimination constant of imatinib, and non-linear mixed-effects modelling (NONMEM) first-order conditional estimation (FOCE) with interaction. Predictions of all methods were finally compared with 'best-possible' predictions obtained by a reference method (NONMEM FOCE, using both random and trough observations for individual C(min) prediction). Results: The developed Bayesian MAP-ρ method accounting for correlation between pharmacokinetic parameters allowed non-biased prediction of imatinib C(min) with a precision of ±30.7%. This predictive performance was similar for the alternative methods that were applied. The range of relative prediction errors was, however, smallest for the Bayesian MAP-ρ method and largest for the linear extrapolation method. When compared with the reference method, predictive performance was comparable for all methods. The time interval between random and trough sampling did not influence the precision of Bayesian MAP-ρ predictions. Conclusion: Clinical interpretation of randomly measured imatinib plasma concentrations can be assisted by Bayesian TDM. Classical Bayesian MAP estimation can be applied even without consideration of the correlation between pharmacokinetic parameters. Individual C(min) predictions are expected to vary less through Bayesian TDM than linear extrapolation. Bayesian TDM could be developed in the future for other targeted anticancer drugs and for the prediction of other pharmacokinetic parameters that have been correlated with clinical outcomes.