948 resultados para BAYESIAN NETWORK
Resumo:
Current methods for retrieving near surface winds from scatterometer observations over the ocean surface require a foward sensor model which maps the wind vector to the measured backscatter. This paper develops a hybrid neural network forward model, which retains the physical understanding embodied in ¸mod, but incorporates greater flexibility, allowing a better fit to the observations. By introducing a separate model for the mid-beam and using a common model for the fore- and aft-beams, we show a significant improvement in local wind vector retrieval. The hybrid model also fits the scatterometer observations more closely. The model is trained in a Bayesian framework, accounting for the noise on the wind vector inputs. We show that adding more high wind speed observations in the training set improves wind vector retrieval at high wind speeds without compromising performance at medium or low wind speeds.
Resumo:
A practical Bayesian approach for inference in neural network models has been available for ten years, and yet it is not used frequently in medical applications. In this chapter we show how both regularisation and feature selection can bring significant benefits in diagnostic tasks through two case studies: heart arrhythmia classification based on ECG data and the prognosis of lupus. In the first of these, the number of variables was reduced by two thirds without significantly affecting performance, while in the second, only the Bayesian models had an acceptable accuracy. In both tasks, neural networks outperformed other pattern recognition approaches.
Resumo:
The ERS-1 satellite carries a scatterometer which measures the amount of radiation scattered back toward the satellite by the ocean's surface. These measurements can be used to infer wind vectors. The implementation of a neural network based forward model which maps wind vectors to radar backscatter is addressed. Input noise cannot be neglected. To account for this noise, a Bayesian framework is adopted. However, Markov Chain Monte Carlo sampling is too computationally expensive. Instead, gradient information is used with a non-linear optimisation algorithm to find the maximum em a posteriori probability values of the unknown variables. The resulting models are shown to compare well with the current operational model when visualised in the target space.
Resumo:
Current methods for retrieving near-surface winds from scatterometer observations over the ocean surface require a forward sensor model which maps the wind vector to the measured backscatter. This paper develops a hybrid neural network forward model, which retains the physical understanding embodied in CMOD4, but incorporates greater flexibility, allowing a better fit to the observations. By introducing a separate model for the midbeam and using a common model for the fore and aft beams, we show a significant improvement in local wind vector retrieval. The hybrid model also fits the scatterometer observations more closely. The model is trained in a Bayesian framework, accounting for the noise on the wind vector inputs. We show that adding more high wind speed observations in the training set improves wind vector retrieval at high wind speeds without compromising performance at medium or low wind speeds. Copyright 2001 by the American Geophysical Union.
Resumo:
It is generally assumed when using Bayesian inference methods for neural networks that the input data contains no noise. For real-world (errors in variable) problems this is clearly an unsafe assumption. This paper presents a Bayesian neural network framework which accounts for input noise provided that a model of the noise process exists. In the limit where the noise process is small and symmetric it is shown, using the Laplace approximation, that this method adds an extra term to the usual Bayesian error bar which depends on the variance of the input noise process. Further, by treating the true (noiseless) input as a hidden variable, and sampling this jointly with the network’s weights, using a Markov chain Monte Carlo method, it is demonstrated that it is possible to infer the regression over the noiseless input. This leads to the possibility of training an accurate model of a system using less accurate, or more uncertain, data. This is demonstrated on both the, synthetic, noisy sine wave problem and a real problem of inferring the forward model for a satellite radar backscatter system used to predict sea surface wind vectors.
Resumo:
The subject of this thesis is the n-tuple net.work (RAMnet). The major advantage of RAMnets is their speed and the simplicity with which they can be implemented in parallel hardware. On the other hand, this method is not a universal approximator and the training procedure does not involve the minimisation of a cost function. Hence RAMnets are potentially sub-optimal. It is important to understand the source of this sub-optimality and to develop the analytical tools that allow us to quantify the generalisation cost of using this model for any given data. We view RAMnets as classifiers and function approximators and try to determine how critical their lack of' universality and optimality is. In order to understand better the inherent. restrictions of the model, we review RAMnets showing their relationship to a number of well established general models such as: Associative Memories, Kamerva's Sparse Distributed Memory, Radial Basis Functions, General Regression Networks and Bayesian Classifiers. We then benchmark binary RAMnet. model against 23 other algorithms using real-world data from the StatLog Project. This large scale experimental study indicates that RAMnets are often capable of delivering results which are competitive with those obtained by more sophisticated, computationally expensive rnodels. The Frequency Weighted version is also benchmarked and shown to perform worse than the binary RAMnet for large values of the tuple size n. We demonstrate that the main issues in the Frequency Weighted RAMnets is adequate probability estimation and propose Good-Turing estimates in place of the more commonly used :Maximum Likelihood estimates. Having established the viability of the method numerically, we focus on providillg an analytical framework that allows us to quantify the generalisation cost of RAMnets for a given datasetL. For the classification network we provide a semi-quantitative argument which is based on the notion of Tuple distance. It gives a good indication of whether the network will fail for the given data. A rigorous Bayesian framework with Gaussian process prior assumptions is given for the regression n-tuple net. We show how to calculate the generalisation cost of this net and verify the results numerically for one dimensional noisy interpolation problems. We conclude that the n-tuple method of classification based on memorisation of random features can be a powerful alternative to slower cost driven models. The speed of the method is at the expense of its optimality. RAMnets will fail for certain datasets but the cases when they do so are relatively easy to determine with the analytical tools we provide.
Resumo:
The ERS-1 Satellite was launched in July 1991 by the European Space Agency into a polar orbit at about 800 km, carrying a C-band scatterometer. A scatterometer measures the amount of backscatter microwave radiation reflected by small ripples on the ocean surface induced by sea-surface winds, and so provides instantaneous snap-shots of wind flow over large areas of the ocean surface, known as wind fields. Inherent in the physics of the observation process is an ambiguity in wind direction; the scatterometer cannot distinguish if the wind is blowing toward or away from the sensor device. This ambiguity implies that there is a one-to-many mapping between scatterometer data and wind direction. Current operational methods for wind field retrieval are based on the retrieval of wind vectors from satellite scatterometer data, followed by a disambiguation and filtering process that is reliant on numerical weather prediction models. The wind vectors are retrieved by the local inversion of a forward model, mapping scatterometer observations to wind vectors, and minimising a cost function in scatterometer measurement space. This thesis applies a pragmatic Bayesian solution to the problem. The likelihood is a combination of conditional probability distributions for the local wind vectors given the scatterometer data. The prior distribution is a vector Gaussian process that provides the geophysical consistency for the wind field. The wind vectors are retrieved directly from the scatterometer data by using mixture density networks, a principled method to model multi-modal conditional probability density functions. The complexity of the mapping and the structure of the conditional probability density function are investigated. A hybrid mixture density network, that incorporates the knowledge that the conditional probability distribution of the observation process is predominantly bi-modal, is developed. The optimal model, which generalises across a swathe of scatterometer readings, is better on key performance measures than the current operational model. Wind field retrieval is approached from three perspectives. The first is a non-autonomous method that confirms the validity of the model by retrieving the correct wind field 99% of the time from a test set of 575 wind fields. The second technique takes the maximum a posteriori probability wind field retrieved from the posterior distribution as the prediction. For the third technique, Markov Chain Monte Carlo (MCMC) techniques were employed to estimate the mass associated with significant modes of the posterior distribution, and make predictions based on the mode with the greatest mass associated with it. General methods for sampling from multi-modal distributions were benchmarked against a specific MCMC transition kernel designed for this problem. It was shown that the general methods were unsuitable for this application due to computational expense. On a test set of 100 wind fields the MAP estimate correctly retrieved 72 wind fields, whilst the sampling method correctly retrieved 73 wind fields.
Resumo:
This thesis proposes a novel graphical model for inference called the Affinity Network,which displays the closeness between pairs of variables and is an alternative to Bayesian Networks and Dependency Networks. The Affinity Network shares some similarities with Bayesian Networks and Dependency Networks but avoids their heuristic and stochastic graph construction algorithms by using a message passing scheme. A comparison with the above two instances of graphical models is given for sparse discrete and continuous medical data and data taken from the UCI machine learning repository. The experimental study reveals that the Affinity Network graphs tend to be more accurate on the basis of an exhaustive search with the small datasets. Moreover, the graph construction algorithm is faster than the other two methods with huge datasets. The Affinity Network is also applied to data produced by a synchronised system. A detailed analysis and numerical investigation into this dynamical system is provided and it is shown that the Affinity Network can be used to characterise its emergent behaviour even in the presence of noise.
Resumo:
Studies assume that socioeconomic status determines individuals’ states of health, but how does health determine socioeconomic status? And how does this association vary depending on contextual differences? To answer this question, our study uses an additive Bayesian Networks model to explain the interrelationships between health and socioeconomic determinants using complex and messy data. This model has been used to find the most probable structure in a network to describe the interdependence of these factors in five European welfare state regimes. The advantage of this study is that it offers a specific picture to describe the complex interrelationship between socioeconomic determinants and health, producing a network that is controlled by socio demographic factors such as gender and age. The present work provides a general framework to describe and understand the complex association between socioeconomic determinants and health.
Resumo:
The advances in three related areas of state-space modeling, sequential Bayesian learning, and decision analysis are addressed, with the statistical challenges of scalability and associated dynamic sparsity. The key theme that ties the three areas is Bayesian model emulation: solving challenging analysis/computational problems using creative model emulators. This idea defines theoretical and applied advances in non-linear, non-Gaussian state-space modeling, dynamic sparsity, decision analysis and statistical computation, across linked contexts of multivariate time series and dynamic networks studies. Examples and applications in financial time series and portfolio analysis, macroeconomics and internet studies from computational advertising demonstrate the utility of the core methodological innovations.
Chapter 1 summarizes the three areas/problems and the key idea of emulating in those areas. Chapter 2 discusses the sequential analysis of latent threshold models with use of emulating models that allows for analytical filtering to enhance the efficiency of posterior sampling. Chapter 3 examines the emulator model in decision analysis, or the synthetic model, that is equivalent to the loss function in the original minimization problem, and shows its performance in the context of sequential portfolio optimization. Chapter 4 describes the method for modeling the steaming data of counts observed on a large network that relies on emulating the whole, dependent network model by independent, conjugate sub-models customized to each set of flow. Chapter 5 reviews those advances and makes the concluding remarks.
Resumo:
Network security monitoring remains a challenge. As global networks scale up, in terms of traffic, volume and speed, effective attribution of cyber attacks is increasingly difficult. The problem is compounded by a combination of other factors, including the architecture of the Internet, multi-stage attacks and increasing volumes of nonproductive traffic. This paper proposes to shift the focus of security monitoring from the source to the target. Simply put, resources devoted to detection and attribution should be redeployed to efficiently monitor for targeting and prevention of attacks. The effort of detection should aim to determine whether a node is under attack, and if so, effectively prevent the attack. This paper contributes by systematically reviewing the structural, operational and legal reasons underlying this argument, and presents empirical evidence to support a shift away from attribution to favour of a target-centric monitoring approach. A carefully deployed set of experiments are presented and a detailed analysis of the results is achieved.
Resumo:
L’un des problèmes importants en apprentissage automatique est de déterminer la complexité du modèle à apprendre. Une trop grande complexité mène au surapprentissage, ce qui correspond à trouver des structures qui n’existent pas réellement dans les données, tandis qu’une trop faible complexité mène au sous-apprentissage, c’est-à-dire que l’expressivité du modèle est insuffisante pour capturer l’ensemble des structures présentes dans les données. Pour certains modèles probabilistes, la complexité du modèle se traduit par l’introduction d’une ou plusieurs variables cachées dont le rôle est d’expliquer le processus génératif des données. Il existe diverses approches permettant d’identifier le nombre approprié de variables cachées d’un modèle. Cette thèse s’intéresse aux méthodes Bayésiennes nonparamétriques permettant de déterminer le nombre de variables cachées à utiliser ainsi que leur dimensionnalité. La popularisation des statistiques Bayésiennes nonparamétriques au sein de la communauté de l’apprentissage automatique est assez récente. Leur principal attrait vient du fait qu’elles offrent des modèles hautement flexibles et dont la complexité s’ajuste proportionnellement à la quantité de données disponibles. Au cours des dernières années, la recherche sur les méthodes d’apprentissage Bayésiennes nonparamétriques a porté sur trois aspects principaux : la construction de nouveaux modèles, le développement d’algorithmes d’inférence et les applications. Cette thèse présente nos contributions à ces trois sujets de recherches dans le contexte d’apprentissage de modèles à variables cachées. Dans un premier temps, nous introduisons le Pitman-Yor process mixture of Gaussians, un modèle permettant l’apprentissage de mélanges infinis de Gaussiennes. Nous présentons aussi un algorithme d’inférence permettant de découvrir les composantes cachées du modèle que nous évaluons sur deux applications concrètes de robotique. Nos résultats démontrent que l’approche proposée surpasse en performance et en flexibilité les approches classiques d’apprentissage. Dans un deuxième temps, nous proposons l’extended cascading Indian buffet process, un modèle servant de distribution de probabilité a priori sur l’espace des graphes dirigés acycliques. Dans le contexte de réseaux Bayésien, ce prior permet d’identifier à la fois la présence de variables cachées et la structure du réseau parmi celles-ci. Un algorithme d’inférence Monte Carlo par chaîne de Markov est utilisé pour l’évaluation sur des problèmes d’identification de structures et d’estimation de densités. Dans un dernier temps, nous proposons le Indian chefs process, un modèle plus général que l’extended cascading Indian buffet process servant à l’apprentissage de graphes et d’ordres. L’avantage du nouveau modèle est qu’il admet les connections entres les variables observables et qu’il prend en compte l’ordre des variables. Nous présentons un algorithme d’inférence Monte Carlo par chaîne de Markov avec saut réversible permettant l’apprentissage conjoint de graphes et d’ordres. L’évaluation est faite sur des problèmes d’estimations de densité et de test d’indépendance. Ce modèle est le premier modèle Bayésien nonparamétrique permettant d’apprendre des réseaux Bayésiens disposant d’une structure complètement arbitraire.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-08
Resumo:
Requirement engineering is a key issue in the development of a software project. Like any other development activity it is not without risks. This work is about the empirical study of risks of requirements by applying machine learning techniques, specifically Bayesian networks classifiers. We have defined several models to predict the risk level for a given requirement using three dataset that collect metrics taken from the requirement specifications of different projects. The classification accuracy of the Bayesian models obtained is evaluated and compared using several classification performance measures. The results of the experiments show that the Bayesians networks allow obtaining valid predictors. Specifically, a tree augmented network structure shows a competitive experimental performance in all datasets. Besides, the relations established between the variables collected to determine the level of risk in a requirement, match with those set by requirement engineers. We show that Bayesian networks are valid tools for the automation of risks assessment in requirement engineering.
Resumo:
International audience