992 results for sequential Gaussian simulation


Relevance: 30.00%

Abstract:

We discuss the application of TAP mean field methods known from Statistical Mechanics of disordered systems to Bayesian classification with Gaussian processes. In contrast to previous applications, no knowledge about the distribution of inputs is needed. Simulation results for the Sonar data set are given.

Relevance: 30.00%

Abstract:

We derive a mean field algorithm for binary classification with Gaussian processes which is based on the TAP approach originally proposed in Statistical Physics of disordered systems. The theory also yields an approximate leave-one-out estimator for the generalization error which is computed with no extra computational cost. We show that from the TAP approach, it is possible to derive both a simpler 'naive' mean field theory and support vector machines (SVM) as limiting cases. For both mean field algorithms and support vector machines, simulation results for three small benchmark data sets are presented. They show (1) that one may get state-of-the-art performance by using the leave-one-out estimator for model selection and (2) that the built-in leave-one-out estimators are extremely precise when compared to the exact leave-one-out estimate. The latter result is taken as strong support for the internal consistency of the mean field approach.

Relevance: 30.00%

Abstract:

In this chapter, we elaborate on the well-known relationship between Gaussian processes (GP) and Support Vector Machines (SVM). We then present approximate solutions for two computational problems arising in GP and SVM. The first one is the calculation of the posterior mean for GP classifiers using a 'naive' mean field approach. The second one is a leave-one-out estimator for the generalization error of SVM based on a linear response method. Simulation results on a benchmark dataset show similar performances for the GP mean field algorithm and the SVM algorithm. The approximate leave-one-out estimator is found to be in very good agreement with the exact leave-one-out error.

Relevance: 30.00%

Abstract:

We develop an approach for sparse representations of Gaussian Process (GP) models (which are Bayesian types of kernel machines) in order to overcome their limitations for large data sets. The method is based on a combination of a Bayesian online algorithm together with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the GP model. By using an appealing parametrisation and projection techniques that use the RKHS norm, recursions for the effective parameters and a sparse Gaussian approximation of the posterior process are obtained. This allows both for a propagation of predictions as well as of Bayesian error measures. The significance and robustness of our approach are demonstrated on a variety of experiments.

Relevance: 30.00%

Abstract:

A CSSL-type modular FORTRAN package, called ACES, has been developed to assist in the simulation of the dynamic behaviour of chemical plant. ACES can be harnessed, for instance, to simulate the transients in startups or after a throughput change. ACES has benefited from two existing simulators. The structure was adapted from ICL SLAM and most plant models originate in DYFLO. The latter employs sequential modularisation which is not always applicable to chemical engineering problems. A novel device of twice-round execution enables ACES to achieve general simultaneous modularisation. During the FIRST ROUND, STATE-VARIABLES are retrieved from the integrator and local calculations performed. During the SECOND ROUND, fresh derivatives are estimated and stored for simultaneous integration. ACES further includes a version of DIFSUB, a variable-step integrator capable of handling stiff differential systems. ACES is highly formalised. It does not use pseudo steady-state approximations and excludes inconsistent and arbitrary features of DYFLO. Built-in debug traps make ACES robust. ACES shows generality, flexibility, versatility and portability, and is very convenient to use. It undertakes substantial housekeeping behind the scenes and thus minimises the detailed involvement of the user. ACES provides a working set of defaults for simulation to proceed as far as possible. Built-in interfaces allow for reactions and user-supplied algorithms to be incorporated. New plant models can be easily appended. Boundary-value problems and optimisation may be tackled using the RERUN feature. ACES is file oriented; a STATE can be saved in a readable form and reactivated later. Thus piecewise simulation is possible. ACES has been illustrated and verified to a large extent using some literature-based examples. Actual plant tests are desirable, however, to complete the verification of the library. Interaction and graphics are recommended for future work.
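
The twice-round idea described in the abstract above can be illustrated with a small sketch. This is not ACES code and does not reproduce its FORTRAN interface: the `Tank` class, the `step` function, and the two-tank recycle loop are hypothetical, and a fixed-step explicit Euler update stands in for the DIFSUB integrator. It only shows why splitting each step into two passes lets modules in a recycle loop be integrated simultaneously.

```python
# Hypothetical illustration of twice-round execution; names are not from ACES.

class Tank:
    """Toy plant module: a tank that drains in proportion to its level."""

    def __init__(self, name, k):
        self.name = name
        self.k = k            # drain coefficient (assumed model)
        self.level = 0.0      # local copy of the state variable
        self.inflow = 0.0
        self.outflow = 0.0

    def first_round(self, state):
        # Retrieve the state variable from the integrator, then do local calculations.
        self.level = state[self.name]
        self.outflow = self.k * self.level

    def second_round(self, derivs):
        # Estimate a fresh derivative and store it for simultaneous integration.
        derivs[self.name] = self.inflow - self.outflow


def step(modules, connections, state, feed, dt):
    """Advance every module by one explicit Euler step of size dt."""
    for m in modules:                 # FIRST ROUND over all modules
        m.first_round(state)
    for m in modules:                 # external feeds
        m.inflow = feed.get(m.name, 0.0)
    for up, down in connections:      # route outflows once all are up to date
        down.inflow += up.outflow
    derivs = {}
    for m in modules:                 # SECOND ROUND over all modules
        m.second_round(derivs)
    # All states are updated together from derivatives of the same time level.
    return {name: state[name] + dt * derivs[name] for name in state}


a = Tank("a", 1.0)
b = Tank("b", 0.5)
# Tank a feeds tank b, and tank b recycles back into tank a.
new_state = step([a, b], [(a, b), (b, a)], {"a": 1.0, "b": 0.0}, {}, 0.1)
```

Because no module consumes a neighbour's output until every module has finished its first round, the order in which the modules are listed does not affect the result, which is the property a sequential scheme lacks when streams are recycled.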

Relevance: 30.00%

Abstract:

Computer models, or simulators, are widely used in a range of scientific fields to aid understanding of the processes involved and make predictions. Such simulators are often computationally demanding and are thus not amenable to statistical analysis. Emulators provide a statistical approximation, or surrogate, for the simulators accounting for the additional approximation uncertainty. This thesis develops a novel sequential screening method to reduce the set of simulator variables considered during emulation. This screening method is shown to require fewer simulator evaluations than existing approaches. Utilising the lower dimensional active variable set simplifies subsequent emulation analysis. For random output, or stochastic, simulators the output dispersion, and thus variance, is typically a function of the inputs. This work extends the emulator framework to account for such heteroscedasticity by constructing two new heteroscedastic Gaussian process representations and proposes an experimental design technique to optimally learn the model parameters. The design criterion is an extension of Fisher information to heteroscedastic variance models. Replicated observations are efficiently handled in both the design and model inference stages. Through a series of simulation experiments on both synthetic and real world simulators, the emulators inferred on optimal designs with replicated observations are shown to outperform equivalent models inferred on space-filling replicate-free designs in terms of both model parameter uncertainty and predictive variance.

Relevance: 30.00%

Abstract:

Boyd's SBS model which includes distributed thermal acoustic noise (DTAN) has been enhanced to enable the Stokes-spontaneous density depletion noise (SSDDN) component of the transmitted optical field to be simulated, probably for the first time, as well as the full transmitted field. SSDDN would not be generated from previous SBS models in which a Stokes seed replaces DTAN. SSDDN becomes the dominant form of transmitted SBS noise as model fibre length (MFL) is increased but its optical power spectrum remains independent of MFL. Simulations of the full transmitted field and SSDDN for different MFLs allow prediction of the optical power spectrum, or system performance parameters which depend on this, for typical communication link lengths which are too long for direct simulation. The SBS model has also been innovatively improved by allowing the Brillouin Shift Frequency (BSF) to vary over the model fibre length, for the nonuniform fibre model (NFM) mode, or to remain constant, for the uniform fibre model (UFM) mode. The assumption of a Gaussian probability density function (pdf) for the BSF in the NFM has been confirmed by means of an analysis of reported Brillouin amplified power spectral measurements for the simple case of a nominally step-index single-mode pure silica core fibre. The BSF pdf could be modified to match the Brillouin gain spectra of other fibre types if required. For both models, simulated backscattered and output powers as functions of input power agree well with those from a reported experiment for fitting Brillouin gain coefficients close to theoretical. The NFM and UFM Brillouin gain spectra are then very similar from half to full maximum but diverge at lower values. Consequently, NFM and UFM transmitted SBS noise powers inferred for long MFLs differ by 1-2 dB over the input power range of 0.15 dBm. This difference could be significant for AM-VSB CATV links at some channel frequencies. 
The modelled characteristic of Carrier-to-Noise Ratio (CNR) as a function of input power for a single intensity modulated subcarrier is in good agreement with the characteristic reported for an experiment when either the UFM or NFM is used. The difference between the two modelled characteristics would have been more noticeable for a higher fibre length or a lower subcarrier frequency.

Relevance: 30.00%

Abstract:

The principled statistical application of Gaussian random field models used in geostatistics has historically been limited to data sets of a small size. This limitation is imposed by the requirement to store and invert the covariance matrix of all the samples to obtain a predictive distribution at unsampled locations, or to use likelihood-based covariance estimation. Various ad hoc approaches to solve this problem have been adopted, such as selecting a neighborhood region and/or a small number of observations to use in the kriging process, but these have no sound theoretical basis and it is unclear what information is being lost. In this article, we present a Bayesian method for estimating the posterior mean and covariance structures of a Gaussian random field using a sequential estimation algorithm. By imposing sparsity in a well-defined framework, the algorithm retains a subset of “basis vectors” that best represent the “true” posterior Gaussian random field model in the relative entropy sense. This allows a principled treatment of Gaussian random field models on very large data sets. The method is particularly appropriate when the Gaussian random field model is regarded as a latent variable model, which may be nonlinearly related to the observations. We show the application of the sequential, sparse Bayesian estimation in Gaussian random field models and discuss its merits and drawbacks.

Relevance: 30.00%

Abstract:

SPOT simulation imagery was acquired for a test site in the Forest of Dean in Gloucestershire, U.K. This data was qualitatively and quantitatively evaluated for its potential application in forest resource mapping and management. A variety of techniques are described for enhancing the image with the aim of providing species level discrimination within the forest. Visual interpretation of the imagery was more successful than automated classification. The heterogeneity within the forest classes, and in particular between the forest and urban class, resulted in poor discrimination using traditional 'per-pixel' automated methods of classification. Different means of assessing classification accuracy are proposed. Two techniques for measuring textural variation were investigated in an attempt to improve classification accuracy. The first of these, a sequential segmentation method, was found to be beneficial. The second, a parallel segmentation method, resulted in little improvement though this may be related to a combination of resolution in size of the texture extraction area. The effect on classification accuracy of combining the SPOT simulation imagery with other data types is investigated. A grid cell encoding technique was selected as most appropriate for storing digitised topographic (elevation, slope) and ground truth data. Topographic data were shown to improve species-level classification, though with sixteen classes overall accuracies were consistently below 50%. Neither sub-division into age groups nor the incorporation of principal components and a band ratio significantly improved classification accuracy. It is concluded that SPOT imagery will not permit species level classification within forested areas as diverse as the Forest of Dean. The imagery will be most useful as part of a multi-stage sampling scheme. The use of texture analysis is highly recommended for extracting maximum information content from the data. 
Incorporation of the imagery into a GIS will both aid discrimination and provide a useful management tool.

Relevance: 30.00%

Abstract:

Since wind has an intrinsically complex and stochastic nature, accurate wind power forecasts are necessary for the safety and economics of wind energy utilization. In this paper, we investigate a combination of numeric and probabilistic models: one-day-ahead wind power forecasts were made with Gaussian Processes (GPs) applied to the outputs of a Numerical Weather Prediction (NWP) model. Firstly, the wind speed data from NWP was corrected by a GP. Then, as there is always a defined limit on power generated in a wind turbine due to the turbine control strategy, a Censored GP was used to model the relationship between the corrected wind speed and power output. To validate the proposed approach, two real world datasets were used for model construction and testing. The simulation results were compared with the persistence method and Artificial Neural Networks (ANNs); the proposed model achieves about 11% improvement in forecasting accuracy (Mean Absolute Error) compared to the ANN model on one dataset, and nearly 5% improvement on another.
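
The abstract above does not state the likelihood used by its Censored GP. As a rough illustration of what censoring at a power limit means, the sketch below assumes a Tobit-style Gaussian likelihood: observations below the limit contribute an ordinary Gaussian density, while observations at the limit contribute the probability that the noisy latent value would exceed it. The function names, the `upper` parameter (a hypothetical rated power), and the noise model are all assumptions made for this example.

```python
import math


def normal_logpdf(y, mean, sigma):
    """Log-density of a Gaussian with the given mean and standard deviation."""
    z = (y - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def censored_loglik(y, f, sigma, upper):
    """Log-likelihood of one observation y given the latent value f.

    Observations at or above `upper` are treated as censored: all that is
    known is that latent value plus noise reached the limit.
    """
    if y >= upper:
        tail = normal_cdf((f - upper) / sigma)   # P(f + noise >= upper)
        return math.log(max(tail, 1e-300))       # clamp to avoid log(0)
    return normal_logpdf(y, f, sigma)


# A reading at the limit is better explained by a latent value above the limit.
ll_high = censored_loglik(2.0, 3.0, 0.5, 2.0)
ll_low = censored_loglik(2.0, 1.0, 0.5, 2.0)
```

In a full model this per-observation term would replace the Gaussian likelihood of a standard GP, which makes the posterior non-Gaussian and requires an approximate inference scheme; that part is outside this sketch.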

Relevance: 30.00%

Abstract:

The advances in three related areas of state-space modeling, sequential Bayesian learning, and decision analysis are addressed, with the statistical challenges of scalability and associated dynamic sparsity. The key theme that ties the three areas is Bayesian model emulation: solving challenging analysis/computational problems using creative model emulators. This idea defines theoretical and applied advances in non-linear, non-Gaussian state-space modeling, dynamic sparsity, decision analysis and statistical computation, across linked contexts of multivariate time series and dynamic networks studies. Examples and applications in financial time series and portfolio analysis, macroeconomics and internet studies from computational advertising demonstrate the utility of the core methodological innovations.

Chapter 1 summarizes the three areas/problems and the key idea of emulating in those areas. Chapter 2 discusses the sequential analysis of latent threshold models with use of emulating models that allows for analytical filtering to enhance the efficiency of posterior sampling. Chapter 3 examines the emulator model in decision analysis, or the synthetic model, that is equivalent to the loss function in the original minimization problem, and shows its performance in the context of sequential portfolio optimization. Chapter 4 describes the method for modeling the streaming data of counts observed on a large network that relies on emulating the whole, dependent network model by independent, conjugate sub-models customized to each set of flows. Chapter 5 reviews those advances and makes the concluding remarks.

Relevance: 30.00%

Abstract:

The aim of this thesis is to review and augment the theory and methods of optimal experimental design. In Chapter 1 the scene is set by considering the possible aims of an experimenter prior to an experiment, the statistical methods one might use to achieve those aims and how experimental design might aid this procedure. It is indicated that, given a criterion for design, a priori optimal design will only be possible in certain instances and, otherwise, some form of sequential procedure would seem to be indicated. In Chapter 2 an exact experimental design problem is formulated mathematically and is compared with its continuous analogue. Motivation is provided for the solution of this continuous problem, and the remainder of the chapter concerns this problem. A necessary and sufficient condition for optimality of a design measure is given. Problems which might arise in testing this condition are discussed, in particular with respect to possible non-differentiability of the criterion function at the design being tested. Several examples are given of optimal designs which may be found analytically and which illustrate the points discussed earlier in the chapter. In Chapter 3 numerical methods of solution of the continuous optimal design problem are reviewed. A new algorithm is presented with illustrations of how it should be used in practice. It is shown that, for reasonably large sample size, continuously optimal designs may be approximated well by an exact design. In situations where this is not satisfactory, algorithms for improvement of this design are reviewed. Chapter 4 consists of a discussion of sequentially designed experiments, with regard to both the philosophies underlying, and the application of the methods of, statistical inference. In Chapter 5 we criticise constructively previous suggestions for fully sequential design procedures. Alternative suggestions are made along with conjectures as to how these might improve performance. 
Chapter 6 presents a simulation study, the aim of which is to investigate the conjectures of Chapter 5. The results of this study provide empirical support for these conjectures. In Chapter 7 examples are analysed. These suggest aids to sequential experimentation by means of reduction of the dimension of the design space and the possibility of experimenting semi-sequentially. Further examples are considered which stress the importance of the use of prior information in situations of this type. Finally we consider the design of experiments when semi-sequential experimentation is mandatory because of the necessity of taking batches of observations at the same time. In Chapter 8 we look at some of the assumptions which have been made and indicate what may go wrong where these assumptions no longer hold.

Relevance: 20.00%

Abstract:

In this work, the energy response functions of a CdTe detector were obtained by Monte Carlo (MC) simulation in the energy range from 5 to 160 keV, using the PENELOPE code. In the response calculations the carrier transport features and the detector resolution were included. The computed energy response function was validated through comparison with experimental results obtained with (241)Am and (152)Eu sources. In order to investigate the influence of the correction by the detector response in the diagnostic energy range, x-ray spectra were measured using a CdTe detector (model XR-100T, Amptek), and then corrected by the energy response of the detector using the stripping procedure. Results showed that the CdTe exhibits good energy response at low energies (below 40 keV), showing only small distortions on the measured spectra. For energies below about 80 keV, the contribution of the escape of Cd- and Te-K x-rays produces significant distortions on the measured x-ray spectra. For higher energies, the most important correction is the detector efficiency and the carrier trapping effects. The results showed that, after correction by the energy response, the measured spectra are in good agreement with those provided by a theoretical model from the literature. Finally, our results showed that the detailed knowledge of the response function and a proper correction procedure are fundamental for achieving more accurate spectra from which quality parameters (i.e., half-value layer and homogeneity coefficient) can be determined.

Relevance: 20.00%

Abstract:

We perform variational studies of the interaction-localization problem to describe the interaction-induced renormalizations of the effective (screened) random potential seen by quasiparticles. Here we present results of careful finite-size scaling studies for the conductance of disordered Hubbard chains at half-filling and zero temperature. While our results indicate that quasiparticle wave functions remain exponentially localized even in the presence of moderate to strong repulsive interactions, we show that interactions produce a strong decrease of the characteristic conductance scale g^{*} signaling the crossover to strong localization. This effect, which cannot be captured by a simple renormalization of the disorder strength, instead reflects a peculiar non-Gaussian form of the spatial correlations of the screened disordered potential, a hitherto neglected mechanism to dramatically reduce the impact of Anderson localization (interference) effects.