70 results for Bayesian priors


Relevance:

20.00% 20.00%

Publisher:

Abstract:

The assessment of the reliability of systems which learn from data is a key issue to investigate thoroughly before the actual application of information processing techniques to real-world problems. Over the recent years Gaussian processes and Bayesian neural networks have come to the fore and in this thesis their generalisation capabilities are analysed from theoretical and empirical perspectives. Upper and lower bounds on the learning curve of Gaussian processes are investigated in order to estimate the amount of data required to guarantee a certain level of generalisation performance. In this thesis we analyse the effects on the bounds and the learning curve induced by the smoothness of stochastic processes described by four different covariance functions. We also explain the early, linearly-decreasing behaviour of the curves and we investigate the asymptotic behaviour of the upper bounds. The effect of the noise and the characteristic lengthscale of the stochastic process on the tightness of the bounds are also discussed. The analysis is supported by several numerical simulations. The generalisation error of a Gaussian process is affected by the dimension of the input vector and may be decreased by input-variable reduction techniques. In conventional approaches to Gaussian process regression, the positive definite matrix estimating the distance between input points is often taken diagonal. In this thesis we show that a general distance matrix is able to estimate the effective dimensionality of the regression problem as well as to discover the linear transformation from the manifest variables to the hidden-feature space, with a significant reduction of the input dimension. 
Numerical simulations confirm the significant superiority of the general distance matrix with respect to the diagonal one. In the thesis we also present an empirical investigation of the generalisation errors of neural networks trained by two Bayesian algorithms, the Markov Chain Monte Carlo method and the evidence framework; the neural networks have been trained on the task of labelling segmented outdoor images.
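
The contrast between a diagonal and a general distance matrix can be illustrated with a minimal sketch. It assumes a squared-exponential covariance and a distance matrix parameterised as M = W^T W. Neither choice is stated in the abstract, and the function name and example values are hypothetical. With a low-rank W, the covariance depends only on the projection of the inputs onto the hidden-feature space.

```python
import math


def general_sq_exp_kernel(x, y, W):
    """Squared-exponential covariance k(x, y) = exp(-0.5 * ||W (x - y)||^2).

    This corresponds to a distance matrix M = W^T W (an assumed
    parameterisation, chosen only for illustration).
    """
    diff = [a - b for a, b in zip(x, y)]
    # Project the input difference into the (possibly lower-dimensional) feature space.
    projected = [sum(w * d for w, d in zip(row, diff)) for row in W]
    return math.exp(-0.5 * sum(p * p for p in projected))


# Diagonal case: each input dimension is scaled independently.
W_diag = [[1.0, 0.0], [0.0, 1.0]]
# General low-rank case: one hidden feature, the sum of the two inputs.
W_low_rank = [[1.0, 1.0]]

x = [0.0, 0.0]
y = [1.0, -1.0]  # differs from x only along a direction that W_low_rank ignores

k_diag = general_sq_exp_kernel(x, y, W_diag)
k_low_rank = general_sq_exp_kernel(x, y, W_low_rank)
```

In this toy setting the diagonal form treats x and y as distant, whereas the low-rank form treats them as identical because they share the same hidden feature value. This is the sense in which a general matrix can reduce the effective input dimension.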

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The ERS-1 Satellite was launched in July 1991 by the European Space Agency into a polar orbit at about 800 km, carrying a C-band scatterometer. A scatterometer measures the amount of backscatter microwave radiation reflected by small ripples on the ocean surface induced by sea-surface winds, and so provides instantaneous snap-shots of wind flow over large areas of the ocean surface, known as wind fields. Inherent in the physics of the observation process is an ambiguity in wind direction; the scatterometer cannot distinguish if the wind is blowing toward or away from the sensor device. This ambiguity implies that there is a one-to-many mapping between scatterometer data and wind direction. Current operational methods for wind field retrieval are based on the retrieval of wind vectors from satellite scatterometer data, followed by a disambiguation and filtering process that is reliant on numerical weather prediction models. The wind vectors are retrieved by the local inversion of a forward model, mapping scatterometer observations to wind vectors, and minimising a cost function in scatterometer measurement space. This thesis applies a pragmatic Bayesian solution to the problem. The likelihood is a combination of conditional probability distributions for the local wind vectors given the scatterometer data. The prior distribution is a vector Gaussian process that provides the geophysical consistency for the wind field. The wind vectors are retrieved directly from the scatterometer data by using mixture density networks, a principled method to model multi-modal conditional probability density functions. The complexity of the mapping and the structure of the conditional probability density function are investigated. A hybrid mixture density network, that incorporates the knowledge that the conditional probability distribution of the observation process is predominantly bi-modal, is developed. 
The optimal model, which generalises across a swathe of scatterometer readings, is better on key performance measures than the current operational model. Wind field retrieval is approached from three perspectives. The first is a non-autonomous method that confirms the validity of the model by retrieving the correct wind field 99% of the time from a test set of 575 wind fields. The second technique takes the maximum a posteriori probability wind field retrieved from the posterior distribution as the prediction. For the third technique, Markov Chain Monte Carlo (MCMC) techniques were employed to estimate the mass associated with significant modes of the posterior distribution, and make predictions based on the mode with the greatest mass associated with it. General methods for sampling from multi-modal distributions were benchmarked against a specific MCMC transition kernel designed for this problem. It was shown that the general methods were unsuitable for this application due to computational expense. On a test set of 100 wind fields the MAP estimate correctly retrieved 72 wind fields, whilst the sampling method correctly retrieved 73 wind fields.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Diagnosing faults in wastewater treatment, like diagnosis of most problems, requires bi-directional plausible reasoning. This means that both predictive (from causes to symptoms) and diagnostic (from symptoms to causes) inferences have to be made, depending on the evidence available, in reasoning for the final diagnosis. The use of computer technology for the purpose of diagnosing faults in the wastewater process has been explored, and a rule-based expert system was initiated. It was found that such an approach has serious limitations in its ability to reason bi-directionally, which makes it unsuitable for diagnostic tasks under conditions of uncertainty. The probabilistic approach known as Bayesian Belief Networks (BBNs) was then critically reviewed, and was found to be well-suited for diagnosis under uncertainty. The theory and application of BBNs are outlined. A full-scale BBN for the diagnosis of faults in a wastewater treatment plant based on the activated sludge system has been developed in this research. Results from the BBN show good agreement with the predictions of wastewater experts. It can be concluded that the BBNs are far superior to rule-based systems based on certainty factors in their ability to diagnose faults and predict systems in complex operating systems having inherently uncertain behaviour.
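
The two directions of inference can be illustrated on the smallest possible network, a single fault node with a single symptom node. The sketch below is only an assumption-laden toy: the function names and all probabilities are hypothetical and are not taken from the thesis, whose full-scale network is not described in the abstract.

```python
def predictive(p_fault, p_symptom_given_fault, p_symptom_given_ok):
    """Cause-to-symptom inference: marginal probability of observing the symptom."""
    return p_symptom_given_fault * p_fault + p_symptom_given_ok * (1.0 - p_fault)


def diagnostic(p_fault, p_symptom_given_fault, p_symptom_given_ok):
    """Symptom-to-cause inference: P(fault | symptom) by Bayes' rule."""
    evidence = predictive(p_fault, p_symptom_given_fault, p_symptom_given_ok)
    return p_symptom_given_fault * p_fault / evidence


# Hypothetical numbers, chosen only for illustration.
P_FAULT = 0.1              # prior probability of the fault
P_SYMPTOM_IF_FAULT = 0.8   # symptom likelihood when the fault is present
P_SYMPTOM_IF_OK = 0.2      # symptom likelihood when the fault is absent

p_symptom = predictive(P_FAULT, P_SYMPTOM_IF_FAULT, P_SYMPTOM_IF_OK)
p_fault_given_symptom = diagnostic(P_FAULT, P_SYMPTOM_IF_FAULT, P_SYMPTOM_IF_OK)
```

Both queries are answered from the same prior and conditional probabilities, so the direction of reasoning is chosen by the evidence available and is not fixed in advance.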

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis addresses data assimilation, which typically refers to the estimation of the state of a physical system given a model and observations, and its application to short-term precipitation forecasting. A general introduction to data assimilation is given, both from a deterministic and stochastic point of view. Data assimilation algorithms are reviewed, in the static case (when no dynamics are involved), then in the dynamic case. A double experiment on two non-linear models, the Lorenz 63 and the Lorenz 96 models, is run and the comparative performance of the methods is discussed in terms of quality of the assimilation, robustness in the non-linear regime and computational time. Following the general review and analysis, data assimilation is discussed in the particular context of very short-term rainfall forecasting (nowcasting) using radar images. An extended Bayesian precipitation nowcasting model is introduced. The model is stochastic in nature and relies on the spatial decomposition of the rainfall field into rain "cells". Radar observations are assimilated using a Variational Bayesian method in which the true posterior distribution of the parameters is approximated by a more tractable distribution. The motion of the cells is captured by a 2D Gaussian process. The model is tested on two precipitation events, the first dominated by convective showers, the second by precipitation fronts. Several deterministic and probabilistic validation methods are applied and the model is shown to retain reasonable prediction skill at up to 3 hours lead time. Extensions to the model are discussed.
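
For readers unfamiliar with the first test model, a minimal sketch of the Lorenz 63 system follows. It assumes the conventional parameter values (sigma = 10, rho = 28, beta = 8/3) and a fourth-order Runge-Kutta integrator. The abstract specifies neither, so both are illustrative choices, as are the function names.

```python
def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Time derivative of the Lorenz 63 system at the given (x, y, z) state."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2.0))
    k3 = f(shifted(state, k2, dt / 2.0))
    k4 = f(shifted(state, k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, dt=0.01, steps=100):
    """Return the list of states visited, including the initial one."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(lorenz63, state, dt)
        trajectory.append(state)
    return trajectory


trajectory = integrate((1.0, 1.0, 1.0))
```

A data assimilation experiment would typically generate a reference trajectory like this one, perturb it with noise to produce synthetic observations, and then score each method on how well it recovers the reference.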

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Control design for stochastic uncertain nonlinear systems is traditionally based on minimizing the expected value of a suitably chosen loss function. Moreover, most control methods usually assume the certainty equivalence principle to simplify the problem and make it computationally tractable. We offer an improved probabilistic framework which is not constrained by these previous assumptions, and provides a more natural framework for incorporating and dealing with uncertainty. The focus of this paper is on developing this framework to obtain an optimal control law strategy using a fully probabilistic approach for information extraction from process data, which does not require detailed knowledge of system dynamics. Moreover, the proposed control method framework allows handling the problem of input-dependent noise. A basic paradigm is proposed and the resulting algorithm is discussed. The proposed probabilistic control method is for the general nonlinear class of discrete-time systems. It is demonstrated theoretically on the affine class. A nonlinear simulation example is also provided to validate theoretical development.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Sentiment analysis has long focused on binary classification of text as either positive or negative. There has been little work on mapping sentiments or emotions into multiple dimensions. This paper studies a Bayesian modeling approach to multi-class sentiment classification and multi-dimensional sentiment distributions prediction. It proposes effective mechanisms to incorporate supervised information such as labeled feature constraints and document-level sentiment distributions derived from the training data into model learning. We have evaluated our approach on the datasets collected from the confession section of the Experience Project website where people share their life experiences and personal stories. Our results show that using the latent representation of the training documents derived from our approach as features to build a maximum entropy classifier outperforms other approaches on multi-class sentiment classification. In the more difficult task of multi-dimensional sentiment distributions prediction, our approach gives superior performance compared to a few competitive baselines. © 2012 ACM.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Text classification is essential for narrowing down the number of documents relevant to a particular topic for further pursuit, especially when searching through large biomedical databases. Protein-protein interactions are an example of such a topic with databases being devoted specifically to them. This paper proposes a semi-supervised learning algorithm via local learning with class priors (LL-CP) for biomedical text classification where unlabeled data points are classified in a vector space based on their proximity to labeled nodes. The algorithm has been evaluated on a corpus of biomedical documents to identify abstracts containing information about protein-protein interactions with promising results. Experimental results show that LL-CP outperforms the traditional semi-supervised learning algorithms such as SVM, and it also performs better than local learning without incorporating class priors.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents a comparative study of three closely related Bayesian models for unsupervised document-level sentiment classification, namely, the latent sentiment model (LSM), the joint sentiment-topic (JST) model, and the Reverse-JST model. Extensive experiments have been conducted on two corpora, the movie review dataset and the multi-domain sentiment dataset. It has been found that while all three models achieve either better or comparable performance on these two corpora when compared to the existing unsupervised sentiment classification approaches, both JST and Reverse-JST are able to extract sentiment-oriented topics. In addition, Reverse-JST always performs worse than JST, suggesting that the JST model is more appropriate for joint sentiment topic detection.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT

Relevance:

20.00% 20.00%

Publisher:

Abstract:

DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT

Relevance:

20.00% 20.00%

Publisher:

Abstract:

With the proliferation of social media sites, social streams have proven to contain the most up-to-date information on current events. Therefore, it is crucial to extract events from the social streams such as tweets. However, it is not straightforward to adapt the existing event extraction systems since texts in social media are fragmented and noisy. In this paper we propose a simple and yet effective Bayesian model, called Latent Event Model (LEM), to extract structured representations of events from social media. LEM is fully unsupervised and does not require annotated data for training. We evaluate LEM on a Twitter corpus. Experimental results show that the proposed model achieves 83% in F-measure, and outperforms the state-of-the-art baseline by over 7%. © 2014 Association for Computational Linguistics.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The twin arginine translocation (TAT) system ferries folded proteins across the bacterial membrane. Proteins are directed into this system by the TAT signal peptide present at the amino terminus of the precursor protein, which contains the twin arginine residues that give the system its name. There are currently only two computational methods for the prediction of TAT translocated proteins from sequence. Both methods have limitations that make the creation of a new algorithm for TAT-translocated protein prediction desirable. We have developed TATPred, a new sequence-model method, based on a Naïve Bayesian network, for the prediction of TAT signal peptides. In this approach, a comprehensive range of models was tested to identify the most reliable and robust predictor. The best model comprised 12 residues: three residues prior to the twin arginines and the seven residues that follow them. We found a prediction sensitivity of 0.979 and a specificity of 0.942.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Membrane proteins, which constitute approximately 20% of most genomes, are poorly tractable targets for experimental structure determination, thus analysis by prediction and modelling makes an important contribution to their on-going study. Membrane proteins form two main classes: alpha helical and beta barrel trans-membrane proteins. By using a method based on Bayesian Networks, which provides a flexible and powerful framework for statistical inference, we addressed alpha-helical topology prediction. This method has accuracies of 77.4% for prokaryotic proteins and 61.4% for eukaryotic proteins. The method described here represents an important advance in the computational determination of membrane protein topology and offers a useful, and complementary, tool for the analysis of membrane proteins for a range of applications.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Membrane proteins, which constitute approximately 20% of most genomes, form two main classes: alpha helical and beta barrel transmembrane proteins. Using methods based on Bayesian Networks, a powerful approach for statistical inference, we have sought to address beta-barrel topology prediction. The beta-barrel topology predictor reports individual strand accuracies of 88.6%. The method outlined here represents a potentially important advance in the computational determination of membrane protein topology.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Calibration of stochastic traffic microsimulation models is a challenging task. This paper proposes a fast iterative probabilistic precalibration framework and demonstrates how it can be successfully applied to a real-world traffic simulation model of a section of the M40 motorway and its surrounding area in the U.K. The efficiency of the method stems from the use of emulators of the stochastic microsimulator, which provides fast surrogates of the traffic model. The use of emulators minimizes the number of microsimulator runs required, and the emulators' probabilistic construction allows for the consideration of the extra uncertainty introduced by the approximation. It is shown that automatic precalibration of this real-world microsimulator, using turn-count observational data, is possible, considering all parameters at once, and that this precalibrated microsimulator improves on the fit to observations compared with the traditional expertly tuned microsimulation. © 2000-2011 IEEE.