891 resultados para Bayesian risk prediction models
Resumo:
The prognostic value of exercise (EXE) and dobutamine echocardiograms (DbE) has been well defined in large studies. However, while risk is determined by both clinical and echo features, no simple means of combining these data has been defined. We sought to combine these data into risk scores. Methods. At 3 expert centers, 7650 pts underwent standard EXE (n=5211) and DbE (w2439) for evaluation of known or suspected CAD and were followed for up to 10 years (mean 5-2) for major events (death or myocardial infarction). A subgroup of 2953 EXE and 1025 DbE pts was randomly selected to develop separate multivariate models for prediction of events. After simplication of each model for clinical use, models were validated in the remaining EXE and DbE pts. ResuI1s. The total number of events was 200 in the EXE and 225 in the DbE pts, of which 58 and 99 events occurred in the respective testing groups. The following regression equations gave equivalent results I” the testing and validation groups for both EXE and DbE; DbE = (Age’O.02) + (DM’l .O) + (Low RPP’0.6) + ([CHF+lschemia+Scar]‘O.7) EXE = ([DM+CHF]‘O.S) + O.S(lschemla #) + l.B(Scar#) - (METS0.19) (where each categorical variable scored 1 when present and 0 when absent, Ischemia# = 1 for l-2 VD. 6 for 3 VD; Scar# = 1 for 1-2 VD, 1.7 for 3 VD). The table summarizes the scores and equivalent outcomes for EXE and DbE. Conclusions. Risk scores based on clinical and EXE or DbE results may be used to quantify the risk of events during follow-up.
Resumo:
The retrieval of wind fields from scatterometer observations has traditionally been separated into two phases; local wind vector retrieval and ambiguity removal. Operationally, a forward model relating wind vector to backscatter is inverted, typically using look up tables, to retrieve up to four local wind vector solutions. A heuristic procedure, using numerical weather prediction forecast wind vectors and, often, some neighbourhood comparison is then used to select the correct solution. In this paper we develop a Bayesian method for wind field retrieval, and show how a direct local inverse model, relating backscatter to wind vector, improves the wind vector retrieval accuracy. We compare these results with the operational U.K. Meteorological Office retrievals, our own CMOD4 retrievals and a neural network based local forward model retrieval. We suggest that the neural network based inverse model, which is extremely fast to use, improves upon current forward models when used in a variational data assimilation scheme.
Resumo:
The problem of regression under Gaussian assumptions is treated generally. The relationship between Bayesian prediction, regularization and smoothing is elucidated. The ideal regression is the posterior mean and its computation scales as O(n3), where n is the sample size. We show that the optimal m-dimensional linear model under a given prior is spanned by the first m eigenfunctions of a covariance operator, which is a trace-class operator. This is an infinite dimensional analogue of principal component analysis. The importance of Hilbert space methods to practical statistics is also discussed.
Resumo:
The main aim of this paper is to provide a tutorial on regression with Gaussian processes. We start from Bayesian linear regression, and show how by a change of viewpoint one can see this method as a Gaussian process predictor based on priors over functions, rather than on priors over parameters. This leads in to a more general discussion of Gaussian processes in section 4. Section 5 deals with further issues, including hierarchical modelling and the setting of the parameters that control the Gaussian process, the covariance functions for neural network models and the use of Gaussian processes in classification problems.
Resumo:
We develop an approach for a sparse representation for Gaussian Process (GP) models in order to overcome the limitations of GPs caused by large data sets. The method is based on a combination of a Bayesian online algorithm together with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the model. Experimental results on toy examples and large real-world datasets indicate the efficiency of the approach.
Resumo:
In the analysis and prediction of many real-world time series, the assumption of stationarity is not valid. A special form of non-stationarity, where the underlying generator switches between (approximately) stationary regimes, seems particularly appropriate for financial markets. We introduce a new model which combines a dynamic switching (controlled by a hidden Markov model) and a non-linear dynamical system. We show how to train this hybrid model in a maximum likelihood approach and evaluate its performance on both synthetic and financial data.
Resumo:
The thesis presents a two-dimensional Risk Assessment Method (RAM) where the assessment of risk to the groundwater resources incorporates both the quantification of the probability of the occurrence of contaminant source terms, as well as the assessment of the resultant impacts. The approach emphasizes the need for a greater dependency on the potential pollution sources, rather than the traditional approach where assessment is based mainly on the intrinsic geo-hydrologic parameters. The risk is calculated using Monte Carlo simulation methods whereby random pollution events were generated to the same distribution as historically occurring events or a priori potential probability distribution. Integrated mathematical models then simulate contaminant concentrations at the predefined monitoring points within the aquifer. The spatial and temporal distributions of the concentrations were calculated from repeated realisations, and the number of times when a user defined concentration magnitude was exceeded is quantified as a risk. The method was setup by integrating MODFLOW-2000, MT3DMS and a FORTRAN coded risk model, and automated, using a DOS batch processing file. GIS software was employed in producing the input files and for the presentation of the results. The functionalities of the method, as well as its sensitivities to the model grid sizes, contaminant loading rates, length of stress periods, and the historical frequencies of occurrence of pollution events were evaluated using hypothetical scenarios and a case study. Chloride-related pollution sources were compiled and used as indicative potential contaminant sources for the case study. At any active model cell, if a random generated number is less than the probability of pollution occurrence, then the risk model will generate synthetic contaminant source term as an input into the transport model. The results of the applications of the method are presented in the form of tables, graphs and spatial maps. Varying the model grid sizes indicates no significant effects on the simulated groundwater head. The simulated frequency of daily occurrence of pollution incidents is also independent of the model dimensions. However, the simulated total contaminant mass generated within the aquifer, and the associated volumetric numerical error appear to increase with the increasing grid sizes. Also, the migration of contaminant plume advances faster with the coarse grid sizes as compared to the finer grid sizes. The number of daily contaminant source terms generated and consequently the total mass of contaminant within the aquifer increases in a non linear proportion to the increasing frequency of occurrence of pollution events. The risk of pollution from a number of sources all occurring by chance together was evaluated, and quantitatively presented as risk maps. This capability to combine the risk to a groundwater feature from numerous potential sources of pollution proved to be a great asset to the method, and a large benefit over the contemporary risk and vulnerability methods.
Resumo:
Jackson (2005) developed a hybrid model of personality and learning, known as the learning styles profiler (LSP) which was designed to span biological, socio-cognitive, and experiential research foci of personality and learning research. The hybrid model argues that functional and dysfunctional learning outcomes can be best understood in terms of how cognitions and experiences control, discipline, and re-express the biologically based scale of sensation-seeking. In two studies with part-time workers undertaking tertiary education (N=137 and 58), established models of approach and avoidance from each of the three different research foci were compared with Jackson's hybrid model in their predictiveness of leadership, work, and university outcomes using self-report and supervisor ratings. Results showed that the hybrid model was generally optimal and, as hypothesized, that goal orientation was a mediator of sensation-seeking on outcomes (work performance, university performance, leader behaviours, and counterproductive work behaviour). Our studies suggest that the hybrid model has considerable promise as a predictor of work and educational outcomes as well as dysfunctional outcomes.
Resumo:
We propose and analyze two different Bayesian online algorithms for learning in discrete Hidden Markov Models and compare their performance with the already known Baldi-Chauvin Algorithm. Using the Kullback-Leibler divergence as a measure of generalization we draw learning curves in simplified situations for these algorithms and compare their performances.
Resumo:
In this paper we discuss a fast Bayesian extension to kriging algorithms which has been used successfully for fast, automatic mapping in emergency conditions in the Spatial Interpolation Comparison 2004 (SIC2004) exercise. The application of kriging to automatic mapping raises several issues such as robustness, scalability, speed and parameter estimation. Various ad-hoc solutions have been proposed and used extensively but they lack a sound theoretical basis. In this paper we show how observations can be projected onto a representative subset of the data, without losing significant information. This allows the complexity of the algorithm to grow as O(n m 2), where n is the total number of observations and m is the size of the subset of the observations retained for prediction. The main contribution of this paper is to further extend this projective method through the application of space-limited covariance functions, which can be used as an alternative to the commonly used covariance models. In many real world applications the correlation between observations essentially vanishes beyond a certain separation distance. Thus it makes sense to use a covariance model that encompasses this belief since this leads to sparse covariance matrices for which optimised sparse matrix techniques can be used. In the presence of extreme values we show that space-limited covariance functions offer an additional benefit, they maintain the smoothness locally but at the same time lead to a more robust, and compact, global model. We show the performance of this technique coupled with the sparse extension to the kriging algorithm on synthetic data and outline a number of computational benefits such an approach brings. To test the relevance to automatic mapping we apply the method to the data used in a recent comparison of interpolation techniques (SIC2004) to map the levels of background ambient gamma radiation. © Springer-Verlag 2007.
Resumo:
This thesis addresses data assimilation, which typically refers to the estimation of the state of a physical system given a model and observations, and its application to short-term precipitation forecasting. A general introduction to data assimilation is given, both from a deterministic and' stochastic point of view. Data assimilation algorithms are reviewed, in the static case (when no dynamics are involved), then in the dynamic case. A double experiment on two non-linear models, the Lorenz 63 and the Lorenz 96 models, is run and the comparative performance of the methods is discussed in terms of quality of the assimilation, robustness "in the non-linear regime and computational time. Following the general review and analysis, data assimilation is discussed in the particular context of very short-term rainfall forecasting (nowcasting) using radar images. An extended Bayesian precipitation nowcasting model is introduced. The model is stochastic in nature and relies on the spatial decomposition of the rainfall field into rain "cells". Radar observations are assimilated using a Variational Bayesian method in which the true posterior distribution of the parameters is approximated by a more tractable distribution. The motion of the cells is captured by a 20 Gaussian process. The model is tested on two precipitation events, the first dominated by convective showers, the second by precipitation fronts. Several deterministic and probabilistic validation methods are applied and the model is shown to retain reasonable prediction skill at up to 3 hours lead time. Extensions to the model are discussed.
Resumo:
Sentiment analysis has long focused on binary classification of text as either positive or negative. There has been few work on mapping sentiments or emotions into multiple dimensions. This paper studies a Bayesian modeling approach to multi-class sentiment classification and multidimensional sentiment distributions prediction. It proposes effective mechanisms to incorporate supervised information such as labeled feature constraints and document-level sentiment distributions derived from the training data into model learning. We have evaluated our approach on the datasets collected from the confession section of the Experience Project website where people share their life experiences and personal stories. Our results show that using the latent representation of the training documents derived from our approach as features to build a maximum entropy classifier outperforms other approaches on multi-class sentiment classification. In the more difficult task of multi-dimensional sentiment distributions prediction, our approach gives superior performance compared to a few competitive baselines. © 2012 ACM.