59 resultados para Bayesian phylogeny
Resumo:
With the proliferation of social media sites, social streams have proven to contain the most up-to-date information on current events. Therefore, it is crucial to extract events from the social streams such as tweets. However, it is not straightforward to adapt the existing event extraction systems since texts in social media are fragmented and noisy. In this paper we propose a simple and yet effective Bayesian model, called Latent Event Model (LEM), to extract structured representation of events from social media. LEM is fully unsupervised and does not require annotated data for training. We evaluate LEM on a Twitter corpus. Experimental results show that the proposed model achieves 83% in F-measure, and outperforms the state-of-the-art baseline by over 7%.© 2014 Association for Computational Linguistics.
Resumo:
The twin arginine translocation (TAT) system ferries folded proteins across the bacterial membrane. Proteins are directed into this system by the TAT signal peptide present at the amino terminus of the precursor protein, which contains the twin arginine residues that give the system its name. There are currently only two computational methods for the prediction of TAT translocated proteins from sequence. Both methods have limitations that make the creation of a new algorithm for TAT-translocated protein prediction desirable. We have developed TATPred, a new sequence-model method, based on a Nave-Bayesian network, for the prediction of TAT signal peptides. In this approach, a comprehensive range of models was tested to identify the most reliable and robust predictor. The best model comprised 12 residues: three residues prior to the twin arginines and the seven residues that follow them. We found a prediction sensitivity of 0.979 and a specificity of 0.942.
Resumo:
Membrane proteins, which constitute approximately 20% of most genomes, are poorly tractable targets for experimental structure determination, thus analysis by prediction and modelling makes an important contribution to their on-going study. Membrane proteins form two main classes: alpha helical and beta barrel trans-membrane proteins. By using a method based on Bayesian Networks, which provides a flexible and powerful framework for statistical inference, we addressed alpha-helical topology prediction. This method has accuracies of 77.4% for prokaryotic proteins and 61.4% for eukaryotic proteins. The method described here represents an important advance in the computational determination of membrane protein topology and offers a useful, and complementary, tool for the analysis of membrane proteins for a range of applications.
Resumo:
Membrane proteins, which constitute approximately 20% of most genomes, form two main classes: alpha helical and beta barrel transmembrane proteins. Using methods based on Bayesian Networks, a powerful approach for statistical inference, we have sought to address beta-barrel topology prediction. The beta-barrel topology predictor reports individual strand accuracies of 88.6%. The method outlined here represents a potentially important advance in the computational determination of membrane protein topology.
Resumo:
Calibration of stochastic traffic microsimulation models is a challenging task. This paper proposes a fast iterative probabilistic precalibration framework and demonstrates how it can be successfully applied to a real-world traffic simulation model of a section of the M40 motorway and its surrounding area in the U.K. The efficiency of the method stems from the use of emulators of the stochastic microsimulator, which provides fast surrogates of the traffic model. The use of emulators minimizes the number of microsimulator runs required, and the emulators' probabilistic construction allows for the consideration of the extra uncertainty introduced by the approximation. It is shown that automatic precalibration of this real-world microsimulator, using turn-count observational data, is possible, considering all parameters at once, and that this precalibrated microsimulator improves on the fit to observations compared with the traditional expertly tuned microsimulation. © 2000-2011 IEEE.
Resumo:
The objective of this study was to investigate the effects of circularity, comorbidity, prevalence and presentation variation on the accuracy of differential diagnoses made in optometric primary care using a modified form of naïve Bayesian sequential analysis. No such investigation has ever been reported before. Data were collected for 1422 cases seen over one year. Positive test outcomes were recorded for case history (ethnicity, age, symptoms and ocular and medical history) and clinical signs in relation to each diagnosis. For this reason only positive likelihood ratios were used for this modified form of Bayesian analysis that was carried out with Laplacian correction and Chi-square filtration. Accuracy was expressed as the percentage of cases for which the diagnoses made by the clinician appeared at the top of a list generated by Bayesian analysis. Preliminary analyses were carried out on 10 diagnoses and 15 test outcomes. Accuracy of 100% was achieved in the absence of presentation variation but dropped by 6% when variation existed. Circularity artificially elevated accuracy by 0.5%. Surprisingly, removal of Chi-square filtering increased accuracy by 0.4%. Decision tree analysis showed that accuracy was influenced primarily by prevalence followed by presentation variation and comorbidity. Analysis of 35 diagnoses and 105 test outcomes followed. This explored the use of positive likelihood ratios, derived from the case history, to recommend signs to look for. Accuracy of 72% was achieved when all clinical signs were entered. The drop in accuracy, compared to the preliminary analysis, was attributed to the fact that some diagnoses lacked strong diagnostic signs; the accuracy increased by 1% when only recommended signs were entered. Chi-square filtering improved recommended test selection. Decision tree analysis showed that accuracy again influenced primarily by prevalence, followed by comorbidity and presentation variation. Future work will explore the use of likelihood ratios based on positive and negative test findings prior to considering naïve Bayesian analysis as a form of artificial intelligence in optometric practice.
Resumo:
In this letter, we derive continuum equations for the generalization error of the Bayesian online algorithm (BOnA) for the one-layer perceptron with a spherical covariance matrix using the Rosenblatt potential and show, by numerical calculations, that the asymptotic performance of the algorithm is the same as the one for the optimal algorithm found by means of variational methods with the added advantage that the BOnA does not use any inaccessible information during learning. © 2007 IEEE.
Resumo:
In the specific area of software engineering (SE) for self-adaptive systems (SASs) there is a growing research awareness about the synergy between SE and artificial intelligence (AI). However, just few significant results have been published so far. In this paper, we propose a novel and formal Bayesian definition of surprise as the basis for quantitative analysis to measure degrees of uncertainty and deviations of self-adaptive systems from normal behavior. A surprise measures how observed data affects the models or assumptions of the world during runtime. The key idea is that a "surprising" event can be defined as one that causes a large divergence between the belief distributions prior to and posterior to the event occurring. In such a case the system may decide either to adapt accordingly or to flag that an abnormal situation is happening. In this paper, we discuss possible applications of Bayesian theory of surprise for the case of self-adaptive systems using Bayesian dynamic decision networks. Copyright © 2014 ACM.
Resumo:
This article proposes a Bayesian neural network approach to determine the risk of re-intervention after endovascular aortic aneurysm repair surgery. The target of proposed technique is to determine which patients have high chance to re-intervention (high-risk patients) and which are not (low-risk patients) after 5 years of the surgery. Two censored datasets relating to the clinical conditions of aortic aneurysms have been collected from two different vascular centers in the United Kingdom. A Bayesian network was first employed to solve the censoring issue in the datasets. Then, a back propagation neural network model was built using the uncensored data of the first center to predict re-intervention on the second center and classify the patients into high-risk and low-risk groups. Kaplan-Meier curves were plotted for each group of patients separately to show whether there is a significant difference between the two risk groups. Finally, the logrank test was applied to determine whether the neural network model was capable of predicting and distinguishing between the two risk groups. The results show that the Bayesian network used for uncensoring the data has improved the performance of the neural networks that were built for the two centers separately. More importantly, the neural network that was trained with uncensored data of the first center was able to predict and discriminate between groups of low risk and high risk of re-intervention after 5 years of endovascular aortic aneurysm surgery at center 2 (p = 0.0037 in the logrank test).
Resumo:
The inverse controller is traditionally assumed to be a deterministic function. This paper presents a pedagogical methodology for estimating the stochastic model of the inverse controller. The proposed method is based on Bayes' theorem. Using Bayes' rule to obtain the stochastic model of the inverse controller allows the use of knowledge of uncertainty from both the inverse and the forward model in estimating the optimal control signal. The paper presents the methodology for general nonlinear systems and is demonstrated on nonlinear single-input-single-output (SISO) and multiple-input-multiple-output (MIMO) examples. © 2006 IEEE.
Resumo:
The inverse controller is traditionally assumed to be a deterministic function. This paper presents a pedagogical methodology for estimating the stochastic model of the inverse controller. The proposed method is based on Bayes' theorem. Using Bayes' rule to obtain the stochastic model of the inverse controller allows the use of knowledge of uncertainty from both the inverse and the forward model in estimating the optimal control signal. The paper presents the methodology for general nonlinear systems. For illustration purposes, the proposed methodology is applied to linear Gaussian systems. © 2004 IEEE.
Resumo:
In this paper we develop set of novel Markov Chain Monte Carlo algorithms for Bayesian smoothing of partially observed non-linear diffusion processes. The sampling algorithms developed herein use a deterministic approximation to the posterior distribution over paths as the proposal distribution for a mixture of an independence and a random walk sampler. The approximating distribution is sampled by simulating an optimized time-dependent linear diffusion process derived from the recently developed variational Gaussian process approximation method. The novel diffusion bridge proposal derived from the variational approximation allows the use of a flexible blocking strategy that further improves mixing, and thus the efficiency, of the sampling algorithms. The algorithms are tested on two diffusion processes: one with double-well potential drift and another with SINE drift. The new algorithm's accuracy and efficiency is compared with state-of-the-art hybrid Monte Carlo based path sampling. It is shown that in practical, finite sample applications the algorithm is accurate except in the presence of large observation errors and low to a multi-modal structure in the posterior distribution over paths. More importantly, the variational approximation assisted sampling algorithm outperforms hybrid Monte Carlo in terms of computational efficiency, except when the diffusion process is densely observed with small errors in which case both algorithms are equally efficient. © 2011 Springer-Verlag.
Resumo:
Storyline detection from news articles aims at summarizing events described under a certain news topic and revealing how those events evolve over time. It is a difficult task because it requires first the detection of events from news articles published in different time periods and then the construction of storylines by linking events into coherent news stories. Moreover, each storyline has different hierarchical structures which are dependent across epochs. Existing approaches often ignore the dependency of hierarchical structures in storyline generation. In this paper, we propose an unsupervised Bayesian model, called dynamic storyline detection model, to extract structured representations and evolution patterns of storylines. The proposed model is evaluated on a large scale news corpus. Experimental results show that our proposed model outperforms several baseline approaches.
Resumo:
Lifelong surveillance is not cost-effective after endovascular aneurysm repair (EVAR), but is required to detect aortic complications which are fatal if untreated (type 1/3 endoleak, sac expansion, device migration). Aneurysm morphology determines the probability of aortic complications and therefore the need for surveillance, but existing analyses have proven incapable of identifying patients at sufficiently low risk to justify abandoning surveillance. This study aimed to improve the prediction of aortic complications, through the application of machine-learning techniques. Patients undergoing EVAR at 2 centres were studied from 2004–2010. Aneurysm morphology had previously been studied to derive the SGVI Score for predicting aortic complications. Bayesian Neural Networks were designed using the same data, to dichotomise patients into groups at low- or high-risk of aortic complications. Network training was performed only on patients treated at centre 1. External validation was performed by assessing network performance independently of network training, on patients treated at centre 2. Discrimination was assessed by Kaplan-Meier analysis to compare aortic complications in predicted low-risk versus predicted high-risk patients. 761 patients aged 75 +/− 7 years underwent EVAR in 2 centres. Mean follow-up was 36+/− 20 months. Neural networks were created incorporating neck angu- lation/length/diameter/volume; AAA diameter/area/volume/length/tortuosity; and common iliac tortuosity/diameter. A 19-feature network predicted aor- tic complications with excellent discrimination and external validation (5-year freedom from aortic complications in predicted low-risk vs predicted high-risk patients: 97.9% vs. 63%; p < 0.0001). A Bayesian Neural-Network algorithm can identify patients in whom it may be safe to abandon surveillance after EVAR. This proposal requires prospective study.