20 results for Bayesian model selection
in Aston University Research Archive
Abstract:
A practical Bayesian approach for inference in neural network models has been available for ten years, and yet it is not used frequently in medical applications. In this chapter we show how both regularisation and feature selection can bring significant benefits in diagnostic tasks through two case studies: heart arrhythmia classification based on ECG data and the prognosis of lupus. In the first of these, the number of variables was reduced by two thirds without significantly affecting performance, while in the second, only the Bayesian models had an acceptable accuracy. In both tasks, neural networks outperformed other pattern recognition approaches.
Abstract:
Social streams have proven to be the most up-to-date and inclusive source of information on current events. In this paper we propose a novel probabilistic modelling framework, called the violence detection model (VDM), which enables the identification of text containing violent content and the extraction of violence-related topics from social media data. The proposed VDM model does not require any labelled corpora for training; instead, it only needs the incorporation of word prior knowledge capturing whether a word indicates violence or not. We propose a novel approach to deriving word prior knowledge using a relative entropy measurement of words, based on the intuition that low-entropy words are indicative of semantically coherent topics and therefore more informative, while high-entropy words are used across more topically diverse contexts and are therefore less informative. Our proposed VDM model has been evaluated on the TREC Microblog 2011 dataset to identify topics related to violence. Experimental results show that deriving word priors using our proposed relative entropy method is more effective than the widely used information gain method. Moreover, VDM achieves higher violence classification results and produces more coherent violence-related topics compared to a few competitive baselines.
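To make the entropy intuition concrete, here is a minimal, hypothetical sketch (not the paper's exact formulation) that scores each word by the entropy of its distribution over documents; low-entropy words, whose usage is concentrated in a few contexts, would be the ones to receive strong violence/non-violence priors:

```python
# Hypothetical illustration of an entropy-based word-informativeness score,
# inspired by the intuition above; NOT the paper's exact derivation.
import math
from collections import Counter, defaultdict

def word_entropies(docs):
    """Entropy of each word's distribution over documents: low entropy means
    usage concentrated in few documents, hence more informative."""
    counts = defaultdict(Counter)  # word -> Counter(doc_id -> count)
    for doc_id, tokens in enumerate(docs):
        for w in tokens:
            counts[w][doc_id] += 1
    entropies = {}
    for w, per_doc in counts.items():
        total = sum(per_doc.values())
        entropies[w] = -sum((c / total) * math.log(c / total, 2)
                            for c in per_doc.values())
    return entropies

docs = [["riot", "police", "clash"], ["riot", "fire"], ["music", "festival", "fire"]]
for w, h in sorted(word_entropies(docs).items(), key=lambda kv: kv[1]):
    print(f"{w}: {h:.2f} bits")
```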
Abstract:
We discuss aggregation of data from neuropsychological patients and the process of evaluating models using data from a series of patients. We argue that aggregation can be misleading but not aggregating can also result in information loss. The basis for combining data needs to be theoretically defined, and the particular method of aggregation depends on the theoretical question and characteristics of the data. We present examples, often drawn from our own research, to illustrate these points. We also argue that statistical models and formal methods of model selection are a useful way to test theoretical accounts using data from several patients in multiple-case studies or case series. Statistical models can often measure fit in a way that explicitly captures what a theory allows; the parameter values that result from model fitting often measure theoretically important dimensions and can lead to more constrained theories or new predictions; and model selection allows the strength of evidence for models to be quantified without forcing this into the artificial binary choice that characterizes hypothesis testing methods. Methods that aggregate and then formally model patient data, however, are not automatically preferred to other methods. Which method is preferred depends on the question to be addressed, characteristics of the data, and practical issues like availability of suitable patients, but case series, multiple-case studies, single-case studies, statistical models, and process models should be complementary methods when guided by theory development.
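As one concrete illustration of quantifying strength of evidence without a binary verdict, Akaike weights convert information-criterion scores into relative support for each candidate model; this is a generic sketch with invented log-likelihoods, not a procedure prescribed by the authors:

```python
# Minimal sketch: Akaike weights turn AIC scores into graded relative
# evidence, avoiding an accept/reject verdict. Values are hypothetical.
import math

def akaike_weights(models):
    """models: dict name -> (log_likelihood, n_params)."""
    aic = {m: 2 * k - 2 * ll for m, (ll, k) in models.items()}
    best = min(aic.values())
    raw = {m: math.exp(-(a - best) / 2) for m, a in aic.items()}
    z = sum(raw.values())
    return {m: w / z for m, w in raw.items()}

models = {"model A": (-120.3, 3), "model B": (-118.9, 5)}
print(akaike_weights(models))  # evidence split across models, not a yes/no
```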
Abstract:
With the proliferation of social media sites, social streams have proven to contain the most up-to-date information on current events. Therefore, it is crucial to extract events from social streams such as tweets. However, it is not straightforward to adapt existing event extraction systems, since texts in social media are fragmented and noisy. In this paper we propose a simple yet effective Bayesian model, called the Latent Event Model (LEM), to extract structured representations of events from social media. LEM is fully unsupervised and does not require annotated data for training. We evaluate LEM on a Twitter corpus. Experimental results show that the proposed model achieves 83% in F-measure and outperforms the state-of-the-art baseline by over 7%. © 2014 Association for Computational Linguistics.
Abstract:
Storyline detection from news articles aims at summarizing events described under a certain news topic and revealing how those events evolve over time. It is a difficult task because it requires first the detection of events from news articles published in different time periods and then the construction of storylines by linking events into coherent news stories. Moreover, each storyline has different hierarchical structures which are dependent across epochs. Existing approaches often ignore the dependency of hierarchical structures in storyline generation. In this paper, we propose an unsupervised Bayesian model, called the dynamic storyline detection model, to extract structured representations and evolution patterns of storylines. The proposed model is evaluated on a large-scale news corpus. Experimental results show that our proposed model outperforms several baseline approaches.
Abstract:
We derive a mean field algorithm for binary classification with Gaussian processes which is based on the TAP approach originally proposed in the statistical physics of disordered systems. The theory also yields an approximate leave-one-out estimator for the generalization error, which is computed at no extra computational cost. We show that from the TAP approach it is possible to derive both a simpler 'naive' mean field theory and support vector machines (SVM) as limiting cases. For both the mean field algorithms and support vector machines, simulation results for three small benchmark data sets are presented. They show (1) that one may get state-of-the-art performance by using the leave-one-out estimator for model selection, and (2) that the built-in leave-one-out estimators are extremely precise when compared to the exact leave-one-out estimate. The latter result is taken as strong support for the internal consistency of the mean field approach.
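For reference, the exact leave-one-out estimate against which built-in estimators of this kind are compared can be sketched as follows; this uses an off-the-shelf SVM (shown in the paper to arise as a limiting case of the TAP approach) on synthetic data, and is an illustration rather than the paper's algorithm:

```python
# Sketch of the exact leave-one-out (LOO) error used as the gold-standard
# reference. One retrained model per held-out point; synthetic data.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])  # synthetic +/-1 labels

errors = 0
for train, test in LeaveOneOut().split(X):
    clf = SVC(kernel="rbf", C=1.0).fit(X[train], y[train])
    errors += clf.predict(X[test])[0] != y[test][0]
print("exact LOO error:", errors / len(X))
```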
Abstract:
Models of visual motion processing that introduce priors for low speed through Bayesian computations are sometimes treated with scepticism by empirical researchers because of the convenient way in which the parameters of the Bayesian priors have been chosen. Using the effects of motion adaptation on motion perception as an illustration, we show that the Bayesian prior, far from being convenient, may be estimated on-line and therefore represents a useful tool by which visual motion processes may be optimized in order to extract the motion signals commonly encountered in everyday experience. The prescription for optimization, when combined with system constraints on the transmission of visual information, may lead to an exaggeration of perceptual bias through the process of adaptation. Our approach extends the Bayesian model of visual motion proposed by Weiss et al. [Weiss, Y., Simoncelli, E., & Adelson, E. (2002). Motion illusions as optimal percepts. Nature Neuroscience, 5, 598-604], in suggesting that perceptual bias reflects a compromise made by a rational system in the face of uncertain signals and system constraints. © 2007.
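A minimal one-dimensional version of the underlying computation, assuming a Gaussian likelihood around the measured speed and a zero-mean (slow-speed) Gaussian prior, shows how the prior biases the estimate towards low speeds; the width parameters below are illustrative placeholders, not values fitted in the paper:

```latex
% 1-D sketch of the slow-speed-prior computation (Weiss et al. style).
% Likelihood centred on the measured speed v_m with noise \sigma_\ell
% (larger at low contrast); zero-mean Gaussian prior with width \sigma_p.
p(v \mid v_m) \propto
  \exp\!\Big(-\frac{(v_m - v)^2}{2\sigma_\ell^2}\Big)\,
  \exp\!\Big(-\frac{v^2}{2\sigma_p^2}\Big)
\quad\Longrightarrow\quad
\hat{v}_{\mathrm{MAP}} = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_\ell^2}\, v_m .
```

The shrinkage towards zero grows as the likelihood noise increases (e.g. at low contrast), and it is this trade-off that the on-line estimation of the prior described above would modulate.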
Abstract:
We report the case of a neologistic jargonaphasic and ask whether her target-related and abstruse neologisms are the result of a single deficit, which affects some items more severely than others, or two deficits: one to lexical access and the other to phonological encoding. We analyse both correct/incorrect performance and errors and apply both traditional and formal methods (maximum-likelihood estimation and model selection). All evidence points to a single deficit at the level of phonological encoding. Further characteristics are used to constrain the locus still further. V.S. does not show the type of length effect expected of a memory component, nor the pattern of errors associated with an articulatory deficit. We conclude that her neologistic errors can result from a single deficit at a level of phonological encoding that immediately follows lexical access where segments are represented in terms of their features. We do not conclude, however, that this is the only possible locus that will produce phonological errors in aphasia, or, indeed, jargonaphasia.
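One hedged sketch of how such a single-deficit versus two-deficit comparison could be formalised: treat the single-deficit account as a restricted (nested) version of the two-deficit account and apply a likelihood-ratio test. The log-likelihood values below are invented for illustration, and the authors' actual analysis may use different criteria:

```python
# Hypothetical likelihood-ratio comparison of nested deficit models.
from scipy.stats import chi2

ll_single, k_single = -210.4, 2   # invented fitted log-likelihoods
ll_two, k_two = -208.9, 4

lr = 2 * (ll_two - ll_single)         # likelihood-ratio statistic
p = chi2.sf(lr, df=k_two - k_single)  # chi-square survival function
print(f"LR = {lr:.2f}, p = {p:.3f}")  # large p: extra deficit not needed
```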
Abstract:
The thesis presents an experimentally validated modelling study of the flow of combustion air in an industrial radiant tube burner (RTB). The RTB is typically used in industrial heat-treating furnaces. The work was initiated by the need for improvements in burner lifetime and performance, which are related to the fluid mechanics of the combusting flow; a fundamental understanding of this is therefore necessary. To achieve this, a detailed three-dimensional Computational Fluid Dynamics (CFD) model has been used, validated with experimental air flow, temperature and flue gas measurements. Initially, the work programme is presented, together with the theory behind RTB design and operation and the theory of swirling flows and methane combustion. NOx reduction techniques are discussed, and the numerical modelling of combusting flows is detailed in this section. The importance of turbulence, radiation and combustion modelling is highlighted, as well as the numerical schemes that incorporate discretization, finite volume theory and convergence. The study first focuses on the combustion air flow and its delivery to the combustion zone. An isothermal computational model was developed to allow examination of the flow characteristics as the air enters the burner and progresses through the various sections prior to the discharge face in the combustion area. Important features identified include the air recuperator swirler coil, the step ring, the primary/secondary air-splitting flame tube and the fuel nozzle. It was revealed that the effectiveness of the air recuperator swirler is significantly compromised by the need for a generous assembly tolerance. There is also a substantial circumferential flow maldistribution introduced by the swirler, but this is effectively removed by the positioning of a ring constriction in the downstream passage. Computations using the k-ε turbulence model show good agreement with experimentally measured velocity profiles in the combustion zone, confirming the suitability of the modelling strategy prior to the combustion study. Reasonable mesh independence was obtained with 200,000 nodes. Agreement was poorer with the RNG k-ε and Reynolds Stress models. The study then addresses the combustion process itself and the heat transfer process internal to the RTB. A series of combustion and radiation model configurations were developed, and the optimum combination of the Eddy Dissipation (ED) combustion model and the Discrete Transfer (DT) radiation model was used successfully to validate a burner experimental test. The previously cold-flow-validated k-ε turbulence model was used, and reasonable mesh independence was obtained with 300,000 nodes. The combination showed good agreement with temperature measurements in the inner and outer walls of the burner, as well as with the flue gas composition measured at the exhaust. The inner tube wall temperature predictions matched the experimental measurements at most of the thermocouple locations, highlighting a small flame bias to one side, although the model slightly over-predicts the temperatures towards the downstream end of the inner tube. NOx emissions were initially over-predicted; however, the use of a combustion flame temperature limiting subroutine allowed convergence to the experimental value of 451 ppmv.
With the validated model, the effectiveness of certain RTB features identified previously is analysed, and an analysis of the energy transfers throughout the burner is presented to identify the dominant mechanisms in each region. The optimum turbulence-combustion-radiation model selection then served as the baseline for further model development. One of these models, an eccentrically positioned flame tube model, highlights the failure mode of the RTB during long-term operation. Other models were developed to address NOx reduction and improvement of the flame profile in the burner combustion zone. These included a modified fuel nozzle design, with 12 circular-section fuel ports, which demonstrates a longer and more symmetric flame, although with limited success in NOx reduction. In addition, a zero-bypass swirler coil model was developed that highlights the effect of the stronger swirling combustion flow. A reduced-diameter and a 20 mm forward-displaced flame tube model show limited success in NOx reduction, although the latter demonstrated improvements in the discharge face heat distribution and in the flame symmetry. Finally, Flue Gas Recirculation (FGR) modelling attempts indicate the difficulty of applying this NOx reduction technique in the Wellman RTB. Recommendations for further work are made that include design mitigations for the fuel nozzle, and further burner modelling is suggested to improve computational validation. The introduction of fuel staging is proposed, as well as a modification of the inner tube to enhance the effect of FGR.
Abstract:
This thesis is a study of low-dimensional visualisation methods for data visualisation under uncertainty of the input data. It focuses on the two main feed-forward neural network algorithms, NeuroScale and Generative Topographic Mapping (GTM), and seeks to make both algorithms able to accommodate this uncertainty. The two models are shown not to work well under high levels of noise within the data and need to be modified. The modifications of both models, NeuroScale and GTM, are verified using synthetic data to demonstrate their ability to accommodate the noise. The thesis then turns to the controversy surrounding the non-uniqueness of predictive gene lists (PGL) for predicting the prognosis of breast cancer patients from DNA microarray experiments. Many of these studies have ignored the uncertainty issue, resulting in random correlations of sparse model selection in high-dimensional spaces. The visualisation techniques are used to confirm that the patients involved in such medical studies are intrinsically unclassifiable on the basis of the provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.
Abstract:
Background: The controversy surrounding the non-uniqueness of predictive gene lists (PGL), small selected subsets of genes from the very large pool of potential candidates available in DNA microarray experiments, is now widely acknowledged. Many of these studies have focused on constructing discriminative semi-parametric models and as such are also subject to the issue of random correlations of sparse model selection in high-dimensional spaces. In this work we outline a different approach based around an unsupervised patient-specific nonlinear topographic projection of predictive gene lists.
Methods: We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. The NeuroScale, Stochastic Neighbor Embedding (SNE) and Locally Linear Embedding (LLE) techniques have been used to construct two-dimensional projective visualisation plots of the 70-dimensional PGLs per patient. Classifiers are also constructed to identify the prognosis indicator of each patient using the resulting projections from these visualisation techniques, and to investigate whether the two prognosis groups are separable a posteriori on the evidence of the gene lists. A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of NeuroScale to visualise the follow-up study based on the projections derived from the original dataset.
Results: The results indicate that small subsets of patient-specific PGLs have insufficient prognostic dissimilarity to permit a distinction between the two prognosis groups. Uncertainty and diversity across multiple gene expressions prevent unambiguous or even confident patient grouping. Comparative projections across different PGLs provide similar results.
Conclusion: The random correlation with an arbitrary outcome induced by small subset selection from very high-dimensional interrelated gene expression profiles leads to an outcome with associated uncertainty. This continuum and uncertainty preclude any attempts at constructing discriminative classifiers. However, a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses. We conclude that many of the patients involved in such medical studies are intrinsically unclassifiable on the basis of the provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.
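A rough sketch of the projection step, using scikit-learn's LLE and t-SNE (a descendant of SNE) as stand-ins, since NeuroScale has no standard library implementation; the 70-dimensional patient vectors here are synthetic:

```python
# Hedged sketch of the 2-D projection step; library methods stand in for
# the paper's tools, and the "PGL" data is synthetic.
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding, TSNE

rng = np.random.default_rng(1)
pgl = rng.normal(size=(100, 70))  # 100 patients x 70-gene lists (synthetic)

lle_2d = LocallyLinearEmbedding(n_neighbors=10, n_components=2).fit_transform(pgl)
tsne_2d = TSNE(n_components=2, perplexity=15, random_state=1).fit_transform(pgl)
print(lle_2d.shape, tsne_2d.shape)  # (100, 2) (100, 2)
```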
Abstract:
Emotional lability and mood dysregulation characterize bipolar disorder (BD), yet no study has examined effective connectivity between the parahippocampal gyrus and prefrontal cortical regions in the ventromedial and dorsal/lateral neural systems subserving mood regulation in BD. Participants comprised 46 individuals (age range: 18-56 years): 21 with a DSM-IV diagnosis of BD, type I, currently remitted; and 25 age- and gender-matched healthy controls (HC). Participants performed an event-related functional magnetic resonance imaging paradigm, viewing mild and intense happy and neutral faces. We employed dynamic causal modeling (DCM) to identify significant alterations in effective connectivity between BD and HC. Bayesian model selection was used to determine the best model. The right parahippocampal gyrus (PHG) and right subgenual cingulate gyrus (sgCG) were included as representative regions of the ventromedial neural system. The right dorsolateral prefrontal cortex (DLPFC) was included as representative of the dorsal/lateral neural system. Right PHG-sgCG effective connectivity was significantly greater in BD than HC, reflecting more rapid, forward PHG-sgCG signaling in BD than HC. There was no between-group difference in sgCG-DLPFC effective connectivity. In BD, abnormally increased right PHG-sgCG effective connectivity and reduced right PHG activity to emotional stimuli suggest a dysfunctional ventromedial neural system implicated in early stimulus appraisal, encoding and automatic regulation of emotion, which may represent a pathophysiological functional neural mechanism for mood dysregulation in BD.
Abstract:
We propose a Bayesian framework for regression problems, which covers areas which are usually dealt with by function approximation. An online learning algorithm is derived which solves regression problems with a Kalman filter. Its solution always improves with increasing model complexity, without the risk of over-fitting. In the infinite dimension limit it approaches the true Bayesian posterior. The issues of prior selection and over-fitting are also discussed, showing that some of the commonly held beliefs are misleading. The practical implementation is summarised. Simulations using 13 popular publicly available data sets are used to demonstrate the method and highlight important issues concerning the choice of priors.
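A minimal sketch of the general idea, online Bayesian linear regression with Kalman-filter updates over the weight posterior, is given below; it illustrates the flavour of the approach rather than the paper's exact algorithm, and the noise and prior variances are illustrative:

```python
# Minimal sketch: online Bayesian linear regression via Kalman-filter
# updates. The weights are a static "state"; each observation updates
# the posterior mean m and covariance P in closed form.
import numpy as np

def kalman_regression(Phi, y, noise_var=0.1, prior_var=10.0):
    """Phi: (N, D) features; y: (N,). Returns posterior mean and covariance."""
    D = Phi.shape[1]
    m = np.zeros(D)                # posterior mean of the weights
    P = prior_var * np.eye(D)      # posterior covariance of the weights
    for phi, t in zip(Phi, y):
        s = phi @ P @ phi + noise_var     # predictive variance of t
        k = (P @ phi) / s                 # Kalman gain
        m = m + k * (t - phi @ m)         # mean update from prediction error
        P = P - np.outer(k, phi @ P)      # covariance update
    return m, P

Phi = np.column_stack([np.ones(50), np.linspace(0, 1, 50)])
y = 1.0 + 2.0 * Phi[:, 1] + 0.1 * np.random.default_rng(0).normal(size=50)
m, P = kalman_regression(Phi, y)
print("posterior mean weights:", m)  # close to [1.0, 2.0]
```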
Abstract:
This paper suggests a data envelopment analysis (DEA) model for selecting the most efficient alternative in advanced manufacturing technology in the presence of both cardinal and ordinal data. The paper explains the problem of using an iterative method for finding the most efficient alternative and proposes a new DEA model that does not require solving a series of LPs. A numerical example illustrates the model, and an application in technology selection with multiple inputs and outputs shows the usefulness of the proposed approach. © 2012 Springer-Verlag London Limited.
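For background, the iterative baseline the paper improves on, the classical input-oriented CCR DEA model, solves one LP per alternative; below is a hedged sketch with synthetic cardinal data (the paper's own single-model formulation handling ordinal data is not reproduced here):

```python
# Background sketch: classical input-oriented CCR DEA, one LP per
# decision-making unit (DMU) -- the "series of LPs" the paper avoids.
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, o):
    """X: (m, n) inputs, Y: (s, n) outputs, o: index of the DMU evaluated.
    Variables: [theta, lambda_1..lambda_n]; minimise theta."""
    m, n = X.shape
    s = Y.shape[0]
    c = np.r_[1.0, np.zeros(n)]
    # sum_j lambda_j x_ij <= theta * x_io  and  sum_j lambda_j y_rj >= y_ro
    A_ub = np.vstack([np.c_[-X[:, [o]], X], np.c_[np.zeros((s, 1)), -Y]])
    b_ub = np.r_[np.zeros(m), -Y[:, o]]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.fun  # efficiency score theta* in (0, 1]

X = np.array([[2.0, 4.0, 4.0], [1.0, 2.0, 3.0]])  # 2 inputs x 3 DMUs
Y = np.array([[1.0, 2.0, 1.5]])                    # 1 output x 3 DMUs
print([round(ccr_efficiency(X, Y, o), 3) for o in range(3)])  # e.g. third DMU inefficient
```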
Abstract:
Suboptimal maternal nutrition during gestation results in the establishment of long-term phenotypic changes and an increased disease risk in the offspring. To elucidate how such environmental sensitivity results in physiological outcomes, the molecular characterisation of these offspring has become the focus of many studies. However, the likely modification of key cellular processes such as metabolism in response to maternal undernutrition raises the question of whether the genes typically used as reference constants in gene expression studies are suitable controls. Using a mouse model of maternal protein undernutrition, we have investigated the stability of seven commonly used reference genes (18s, Hprt1, Pgk1, Ppib, Sdha, Tbp and Tuba1) in a variety of offspring tissues including liver, kidney, heart, retro-peritoneal and inter-scapular fat, extra-embryonic placenta and yolk sac, as well as in the preimplantation blastocyst and blastocyst-derived embryonic stem cells. We find that although the selected reference genes are all highly stable within this system, they show tissue-, treatment- and sex-specific variation. Furthermore, software-based selection approaches rank reference genes differently and do not always identify genes which differ between conditions. Therefore, we recommend that reference gene selection for gene expression studies should be thoroughly validated for each tissue of interest. © 2011 Elsevier Inc.
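As a toy illustration of why software-based stability rankings can disagree, here is one naive criterion, the coefficient of variation across samples, computed on synthetic values; established tools such as geNorm use more sophisticated pairwise measures and can therefore rank the same genes differently:

```python
# Naive reference-gene stability ranking by coefficient of variation (CV);
# gene names from the abstract, expression values entirely synthetic.
import numpy as np

rng = np.random.default_rng(2)
genes = ["18s", "Hprt1", "Pgk1", "Ppib", "Sdha", "Tbp", "Tuba1"]
expr = {g: rng.normal(loc=20 + i, scale=0.2 + 0.05 * i, size=12)
        for i, g in enumerate(genes)}  # 12 samples per gene (synthetic)

cv = {g: np.std(v) / np.mean(v) for g, v in expr.items()}
for g in sorted(cv, key=cv.get):
    print(f"{g}: CV = {cv[g]:.4f}")  # lower CV -> more stable candidate
```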