42 resultados para bayesian networks
Resumo:
On-line learning is one of the most powerful and commonly used techniques for training large layered networks and has been used successfully in many real-world applications. Traditional analytical methods have been recently complemented by ones from statistical physics and Bayesian statistics. This powerful combination of analytical methods provides more insight and deeper understanding of existing algorithms and leads to novel and principled proposals for their improvement. This book presents a coherent picture of the state-of-the-art in the theoretical analysis of on-line learning. An introduction relates the subject to other developments in neural networks and explains the overall picture. Surveys by leading experts in the field combine new and established material and enable non-experts to learn more about the techniques and methods used. This book, the first in the area, provides a comprehensive view of the subject and will be welcomed by mathematicians, scientists and engineers, whether in industry or academia.
Resumo:
A practical Bayesian approach for inference in neural network models has been available for ten years, and yet it is not used frequently in medical applications. In this chapter we show how both regularisation and feature selection can bring significant benefits in diagnostic tasks through two case studies: heart arrhythmia classification based on ECG data and the prognosis of lupus. In the first of these, the number of variables was reduced by two thirds without significantly affecting performance, while in the second, only the Bayesian models had an acceptable accuracy. In both tasks, neural networks outperformed other pattern recognition approaches.
Resumo:
In many problems in spatial statistics it is necessary to infer a global problem solution by combining local models. A principled approach to this problem is to develop a global probabilistic model for the relationships between local variables and to use this as the prior in a Bayesian inference procedure. We show how a Gaussian process with hyper-parameters estimated from Numerical Weather Prediction Models yields meteorologically convincing wind fields. We use neural networks to make local estimates of wind vector probabilities. The resulting inference problem cannot be solved analytically, but Markov Chain Monte Carlo methods allow us to retrieve accurate wind fields.
Resumo:
Online learning is discussed from the viewpoint of Bayesian statistical inference. By replacing the true posterior distribution with a simpler parametric distribution, one can define an online algorithm by a repetition of two steps: An update of the approximate posterior, when a new example arrives, and an optimal projection into the parametric family. Choosing this family to be Gaussian, we show that the algorithm achieves asymptotic efficiency. An application to learning in single layer neural networks is given.
Resumo:
Large monitoring networks are becoming increasingly common and can generate large datasets from thousands to millions of observations in size, often with high temporal resolution. Processing large datasets using traditional geostatistical methods is prohibitively slow and in real world applications different types of sensor can be found across a monitoring network. Heterogeneities in the error characteristics of different sensors, both in terms of distribution and magnitude, presents problems for generating coherent maps. An assumption in traditional geostatistics is that observations are made directly of the underlying process being studied and that the observations are contaminated with Gaussian errors. Under this assumption, sub–optimal predictions will be obtained if the error characteristics of the sensor are effectively non–Gaussian. One method, model based geostatistics, assumes that a Gaussian process prior is imposed over the (latent) process being studied and that the sensor model forms part of the likelihood term. One problem with this type of approach is that the corresponding posterior distribution will be non–Gaussian and computationally demanding as Monte Carlo methods have to be used. An extension of a sequential, approximate Bayesian inference method enables observations with arbitrary likelihoods to be treated, in a projected process kriging framework which is less computationally intensive. The approach is illustrated using a simulated dataset with a range of sensor models and error characteristics.
Resumo:
The ERS-1 Satellite was launched in July 1991 by the European Space Agency into a polar orbit at about 800 km, carrying a C-band scatterometer. A scatterometer measures the amount of backscatter microwave radiation reflected by small ripples on the ocean surface induced by sea-surface winds, and so provides instantaneous snap-shots of wind flow over large areas of the ocean surface, known as wind fields. Inherent in the physics of the observation process is an ambiguity in wind direction; the scatterometer cannot distinguish if the wind is blowing toward or away from the sensor device. This ambiguity implies that there is a one-to-many mapping between scatterometer data and wind direction. Current operational methods for wind field retrieval are based on the retrieval of wind vectors from satellite scatterometer data, followed by a disambiguation and filtering process that is reliant on numerical weather prediction models. The wind vectors are retrieved by the local inversion of a forward model, mapping scatterometer observations to wind vectors, and minimising a cost function in scatterometer measurement space. This thesis applies a pragmatic Bayesian solution to the problem. The likelihood is a combination of conditional probability distributions for the local wind vectors given the scatterometer data. The prior distribution is a vector Gaussian process that provides the geophysical consistency for the wind field. The wind vectors are retrieved directly from the scatterometer data by using mixture density networks, a principled method to model multi-modal conditional probability density functions. The complexity of the mapping and the structure of the conditional probability density function are investigated. A hybrid mixture density network, that incorporates the knowledge that the conditional probability distribution of the observation process is predominantly bi-modal, is developed. The optimal model, which generalises across a swathe of scatterometer readings, is better on key performance measures than the current operational model. Wind field retrieval is approached from three perspectives. The first is a non-autonomous method that confirms the validity of the model by retrieving the correct wind field 99% of the time from a test set of 575 wind fields. The second technique takes the maximum a posteriori probability wind field retrieved from the posterior distribution as the prediction. For the third technique, Markov Chain Monte Carlo (MCMC) techniques were employed to estimate the mass associated with significant modes of the posterior distribution, and make predictions based on the mode with the greatest mass associated with it. General methods for sampling from multi-modal distributions were benchmarked against a specific MCMC transition kernel designed for this problem. It was shown that the general methods were unsuitable for this application due to computational expense. On a test set of 100 wind fields the MAP estimate correctly retrieved 72 wind fields, whilst the sampling method correctly retrieved 73 wind fields.
Resumo:
Control design for stochastic uncertain nonlinear systems is traditionally based on minimizing the expected value of a suitably chosen loss function. Moreover, most control methods usually assume the certainty equivalence principle to simplify the problem and make it computationally tractable. We offer an improved probabilistic framework which is not constrained by these previous assumptions, and provides a more natural framework for incorporating and dealing with uncertainty. The focus of this paper is on developing this framework to obtain an optimal control law strategy using a fully probabilistic approach for information extraction from process data, which does not require detailed knowledge of system dynamics. Moreover, the proposed control method framework allows handling the problem of input-dependent noise. A basic paradigm is proposed and the resulting algorithm is discussed. The proposed probabilistic control method is for the general nonlinear class of discrete-time systems. It is demonstrated theoretically on the affine class. A nonlinear simulation example is also provided to validate theoretical development.
Resumo:
In this letter, we derive continuum equations for the generalization error of the Bayesian online algorithm (BOnA) for the one-layer perceptron with a spherical covariance matrix using the Rosenblatt potential and show, by numerical calculations, that the asymptotic performance of the algorithm is the same as the one for the optimal algorithm found by means of variational methods with the added advantage that the BOnA does not use any inaccessible information during learning. © 2007 IEEE.
Resumo:
In the specific area of software engineering (SE) for self-adaptive systems (SASs) there is a growing research awareness about the synergy between SE and artificial intelligence (AI). However, just few significant results have been published so far. In this paper, we propose a novel and formal Bayesian definition of surprise as the basis for quantitative analysis to measure degrees of uncertainty and deviations of self-adaptive systems from normal behavior. A surprise measures how observed data affects the models or assumptions of the world during runtime. The key idea is that a "surprising" event can be defined as one that causes a large divergence between the belief distributions prior to and posterior to the event occurring. In such a case the system may decide either to adapt accordingly or to flag that an abnormal situation is happening. In this paper, we discuss possible applications of Bayesian theory of surprise for the case of self-adaptive systems using Bayesian dynamic decision networks. Copyright © 2014 ACM.
Resumo:
This article proposes a Bayesian neural network approach to determine the risk of re-intervention after endovascular aortic aneurysm repair surgery. The target of proposed technique is to determine which patients have high chance to re-intervention (high-risk patients) and which are not (low-risk patients) after 5 years of the surgery. Two censored datasets relating to the clinical conditions of aortic aneurysms have been collected from two different vascular centers in the United Kingdom. A Bayesian network was first employed to solve the censoring issue in the datasets. Then, a back propagation neural network model was built using the uncensored data of the first center to predict re-intervention on the second center and classify the patients into high-risk and low-risk groups. Kaplan-Meier curves were plotted for each group of patients separately to show whether there is a significant difference between the two risk groups. Finally, the logrank test was applied to determine whether the neural network model was capable of predicting and distinguishing between the two risk groups. The results show that the Bayesian network used for uncensoring the data has improved the performance of the neural networks that were built for the two centers separately. More importantly, the neural network that was trained with uncensored data of the first center was able to predict and discriminate between groups of low risk and high risk of re-intervention after 5 years of endovascular aortic aneurysm surgery at center 2 (p = 0.0037 in the logrank test).
Resumo:
The inverse controller is traditionally assumed to be a deterministic function. This paper presents a pedagogical methodology for estimating the stochastic model of the inverse controller. The proposed method is based on Bayes' theorem. Using Bayes' rule to obtain the stochastic model of the inverse controller allows the use of knowledge of uncertainty from both the inverse and the forward model in estimating the optimal control signal. The paper presents the methodology for general nonlinear systems and is demonstrated on nonlinear single-input-single-output (SISO) and multiple-input-multiple-output (MIMO) examples. © 2006 IEEE.
Resumo:
Lifelong surveillance is not cost-effective after endovascular aneurysm repair (EVAR), but is required to detect aortic complications which are fatal if untreated (type 1/3 endoleak, sac expansion, device migration). Aneurysm morphology determines the probability of aortic complications and therefore the need for surveillance, but existing analyses have proven incapable of identifying patients at sufficiently low risk to justify abandoning surveillance. This study aimed to improve the prediction of aortic complications, through the application of machine-learning techniques. Patients undergoing EVAR at 2 centres were studied from 2004–2010. Aneurysm morphology had previously been studied to derive the SGVI Score for predicting aortic complications. Bayesian Neural Networks were designed using the same data, to dichotomise patients into groups at low- or high-risk of aortic complications. Network training was performed only on patients treated at centre 1. External validation was performed by assessing network performance independently of network training, on patients treated at centre 2. Discrimination was assessed by Kaplan-Meier analysis to compare aortic complications in predicted low-risk versus predicted high-risk patients. 761 patients aged 75 +/− 7 years underwent EVAR in 2 centres. Mean follow-up was 36+/− 20 months. Neural networks were created incorporating neck angu- lation/length/diameter/volume; AAA diameter/area/volume/length/tortuosity; and common iliac tortuosity/diameter. A 19-feature network predicted aor- tic complications with excellent discrimination and external validation (5-year freedom from aortic complications in predicted low-risk vs predicted high-risk patients: 97.9% vs. 63%; p < 0.0001). A Bayesian Neural-Network algorithm can identify patients in whom it may be safe to abandon surveillance after EVAR. This proposal requires prospective study.