7 resultados para Fisher information

em Aston University Research Archive


Relevância:

70.00% 70.00%

Publicador:

Resumo:

The focus of this thesis is the extension of topographic visualisation mappings to allow for the incorporation of uncertainty. Few visualisation algorithms in the literature are capable of mapping uncertain data with fewer able to represent observation uncertainties in visualisations. As such, modifications are made to NeuroScale, Locally Linear Embedding, Isomap and Laplacian Eigenmaps to incorporate uncertainty in the observation and visualisation spaces. The proposed mappings are then called Normally-distributed NeuroScale (N-NS), T-distributed NeuroScale (T-NS), Probabilistic LLE (PLLE), Probabilistic Isomap (PIso) and Probabilistic Weighted Neighbourhood Mapping (PWNM). These algorithms generate a probabilistic visualisation space with each latent visualised point transformed to a multivariate Gaussian or T-distribution, using a feed-forward RBF network. Two types of uncertainty are then characterised dependent on the data and mapping procedure. Data dependent uncertainty is the inherent observation uncertainty. Whereas, mapping uncertainty is defined by the Fisher Information of a visualised distribution. This indicates how well the data has been interpolated, offering a level of ‘surprise’ for each observation. These new probabilistic mappings are tested on three datasets of vectorial observations and three datasets of real world time series observations for anomaly detection. In order to visualise the time series data, a method for analysing observed signals and noise distributions, Residual Modelling, is introduced. The performance of the new algorithms on the tested datasets is compared qualitatively with the latent space generated by the Gaussian Process Latent Variable Model (GPLVM). A quantitative comparison using existing evaluation measures from the literature allows performance of each mapping function to be compared. Finally, the mapping uncertainty measure is combined with NeuroScale to build a deep learning classifier, the Cascading RBF. This new structure is tested on the MNist dataset achieving world record performance whilst avoiding the flaws seen in other Deep Learning Machines.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Natural gradient learning is an efficient and principled method for improving on-line learning. In practical applications there will be an increased cost required in estimating and inverting the Fisher information matrix. We propose to use the matrix momentum algorithm in order to carry out efficient inversion and study the efficacy of a single step estimation of the Fisher information matrix. We analyse the proposed algorithm in a two-layer network, using a statistical mechanics framework which allows us to describe analytically the learning dynamics, and compare performance with true natural gradient learning and standard gradient descent.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Computer models, or simulators, are widely used in a range of scientific fields to aid understanding of the processes involved and make predictions. Such simulators are often computationally demanding and are thus not amenable to statistical analysis. Emulators provide a statistical approximation, or surrogate, for the simulators accounting for the additional approximation uncertainty. This thesis develops a novel sequential screening method to reduce the set of simulator variables considered during emulation. This screening method is shown to require fewer simulator evaluations than existing approaches. Utilising the lower dimensional active variable set simplifies subsequent emulation analysis. For random output, or stochastic, simulators the output dispersion, and thus variance, is typically a function of the inputs. This work extends the emulator framework to account for such heteroscedasticity by constructing two new heteroscedastic Gaussian process representations and proposes an experimental design technique to optimally learn the model parameters. The design criterion is an extension of Fisher information to heteroscedastic variance models. Replicated observations are efficiently handled in both the design and model inference stages. Through a series of simulation experiments on both synthetic and real world simulators, the emulators inferred on optimal designs with replicated observations are shown to outperform equivalent models inferred on space-filling replicate-free designs in terms of both model parameter uncertainty and predictive variance.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents a greedy Bayesian experimental design criterion for heteroscedastic Gaussian process models. The criterion is based on the Fisher information and is optimal in the sense of minimizing parameter uncertainty for likelihood based estimators. We demonstrate the validity of the criterion under different noise regimes and present experimental results from a rabies simulator to demonstrate the effectiveness of the resulting approximately optimal designs.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Optimal design for parameter estimation in Gaussian process regression models with input-dependent noise is examined. The motivation stems from the area of computer experiments, where computationally demanding simulators are approximated using Gaussian process emulators to act as statistical surrogates. In the case of stochastic simulators, which produce a random output for a given set of model inputs, repeated evaluations are useful, supporting the use of replicate observations in the experimental design. The findings are also applicable to the wider context of experimental design for Gaussian process regression and kriging. Designs are proposed with the aim of minimising the variance of the Gaussian process parameter estimates. A heteroscedastic Gaussian process model is presented which allows for an experimental design technique based on an extension of Fisher information to heteroscedastic models. It is empirically shown that the error of the approximation of the parameter variance by the inverse of the Fisher information is reduced as the number of replicated points is increased. Through a series of simulation experiments on both synthetic data and a systems biology stochastic simulator, optimal designs with replicate observations are shown to outperform space-filling designs both with and without replicate observations. Guidance is provided on best practice for optimal experimental design for stochastic response models. © 2013 Elsevier Inc. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Background: Currently, no review has been completed regarding the information-gathering process for the provision of medicines for self-medication in community pharmacies in developing countries. Objective: To review the rate of information gathering and the types of information gathered when patients present for self-medication requests. Methods: Six databases were searched for studies that described the rate of information gathering and/or the types of information gathered in the provision of medicines for self-medication in community pharmacies in developing countries. The types of information reported were classified as: signs and symptoms, patient identity, action taken, medications, medical history, and others. Results: Twenty-two studies met the inclusion criteria. Variations in the study populations, types of scenarios, research methods, and data reporting were observed. The reported rate of information gathering varied from 18% to 97%, depending on the research methods used. Information on signs and symptoms and patient identity was more frequently reported to be gathered compared with information on action taken, medications, and medical history. Conclusion: Evidence showed that the information-gathering process for the provision of medicines for self-medication via community pharmacies in developing countries is inconsistent. There is a need to determine the barriers to appropriate information-gathering practice as well as to develop strategies to implement effective information-gathering processes. It is also recommended that international and national pharmacy organizations, including pharmacy academics and pharmacy researchers, develop a consensus on the types of information that should be reported in the original studies. This will facilitate comparison across studies so that areas that need improvement can be identified. © 2013 Elsevier Inc.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Descriptions of vegetation communities are often based on vague semantic terms describing species presence and dominance. For this reason, some researchers advocate the use of fuzzy sets in the statistical classification of plant species data into communities. In this study, spatially referenced vegetation abundance values collected from Greek phrygana were analysed by ordination (DECORANA), and classified on the resulting axes using fuzzy c-means to yield a point data-set representing local memberships in characteristic plant communities. The fuzzy clusters matched vegetation communities noted in the field, which tended to grade into one another, rather than occupying discrete patches. The fuzzy set representation of the community exploited the strengths of detrended correspondence analysis while retaining richer information than a TWINSPAN classification of the same data. Thus, in the absence of phytosociological benchmarks, meaningful and manageable habitat information could be derived from complex, multivariate species data. We also analysed the influence of the reliability of different surveyors' field observations by multiple sampling at a selected sample location. We show that the impact of surveyor error was more severe in the Boolean than the fuzzy classification. © 2007 Springer.