972 results for Bayesian modelling
Abstract:
This study considered the problem of predicting survival, based on three alternative models: a single Weibull, a mixture of Weibulls and a cure model. Instead of the common procedure of choosing a single “best” model, where “best” is defined in terms of goodness of fit to the data, a Bayesian model averaging (BMA) approach was adopted to account for model uncertainty. This was illustrated using a case study in which the aim was the description of lymphoma cancer survival with covariates given by phenotypes and gene expression. The results of this study indicate that if the sample size is sufficiently large, one of the three models emerges as having the highest probability given the data, as indicated by the goodness-of-fit measure, the Bayesian information criterion (BIC). However, when the sample size was reduced, no single model was revealed as “best”, suggesting that a BMA approach would be appropriate. Although a BMA approach can compromise on goodness of fit to the data (when compared to the true model), it can provide robust predictions and facilitate more detailed investigation of the relationships between gene expression and patient survival. Keywords: Bayesian modelling; Bayesian model averaging; Cure model; Markov Chain Monte Carlo; Mixture model; Survival analysis; Weibull distribution
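As a rough illustration of how BIC values can be turned into model-averaging weights, the sketch below uses the standard approximation in which posterior model probabilities are proportional to exp(-BIC/2) under equal prior model probabilities. The function names and the numeric values are hypothetical, and this is not necessarily the computation used in the study, which lists Markov chain Monte Carlo among its keywords.

```python
import math


def bic_model_weights(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probability for every candidate model.
    """
    best = min(bics)
    # Subtract the smallest BIC before exponentiating for numerical stability.
    relative = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(relative)
    return [r / total for r in relative]


def bma_prediction(predictions, weights):
    """Weighted average of per-model predictions (e.g. survival probabilities)."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical BIC values for a single Weibull, a Weibull mixture and a cure model.
weights = bic_model_weights([412.3, 410.1, 411.0])
# Hypothetical predicted survival probabilities at one time point, one per model.
averaged = bma_prediction([0.62, 0.58, 0.66], weights)
```

When one BIC is much smaller than the others, its weight approaches 1 and the average collapses to that single model. When the BIC values are close, as in the reduced-sample case described above, the weights stay spread across the models.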
Abstract:
Motivated by the analysis of the Australian Grain Insect Resistance Database (AGIRD), we develop a Bayesian hurdle modelling approach to assess trends in strong resistance of stored grain insects to phosphine over time. The binary response variable from AGIRD indicating presence or absence of strong resistance is characterized by a majority of absence observations, and the hurdle model is a two-step approach that is useful when analyzing such a binary response dataset. The proposed hurdle model first uses Bayesian classification trees to identify covariates and covariate levels pertaining to possible presence or absence of strong resistance. Second, generalized additive models (GAMs) with spike and slab priors for variable selection are fitted to the subset of the dataset that the Bayesian classification tree identifies as possibly showing strong resistance. From the GAM we assess trends, biosecurity issues and site-specific variables influencing the presence of strong resistance using a variable selection approach. The proposed Bayesian hurdle model is compared to its frequentist counterpart, and also to a naive Bayesian approach which fits a GAM to the entire dataset. The Bayesian hurdle model has the benefit of providing a set of good trees for use in the first step. Compared to the frequentist model, it appears to provide enough flexibility to represent the influence of variables on strong resistance, and it also captures the subtle changes in the trend that are missed by the frequentist and naive Bayesian models.
Abstract:
This thesis has contributed to the advancement of knowledge in disease modelling by addressing interesting and crucial issues relevant to modelling health data over space and time. The research has led to an increased understanding of spatial scales, temporal scales, and spatial smoothing for modelling diseases, in terms of their methodology and applications. This research is of particular significance to researchers seeking to employ statistical modelling techniques over space and time in various disciplines. A broad class of statistical models is employed to assess what impact spatial and temporal scales have on simulated and real data.
Abstract:
This thesis introduces a method of applying Bayesian Networks to combine information from a range of data sources for effective decision support systems. It develops a set of techniques for the development, validation, visualisation, and application of Complex Systems models, with a working demonstration in an Australian airport environment. The methods presented here have provided a modelling approach that produces highly flexible, informative and applicable interpretations of a system's behaviour under uncertain conditions. These end-to-end techniques are applied to the development of model-based dashboards to support operators and decision makers in the multi-stakeholder airport environment, and they provide confidence in the resulting interpretations of the system's behaviour.
Abstract:
In this thesis the use of the Bayesian approach to statistical inference in fisheries stock assessment is studied. The work was conducted in collaboration with the Finnish Game and Fisheries Research Institute, using the problem of monitoring and predicting the juvenile salmon population in the River Tornionjoki as an example application. The River Tornionjoki is the largest salmon river flowing into the Baltic Sea. This thesis tackles the issues of model formulation and model checking as well as computational problems related to Bayesian modelling in the context of fisheries stock assessment. Each article of the thesis provides a novel method either for extracting information from data obtained via a particular type of sampling system or for integrating the information about the fish stock from multiple sources in terms of a population dynamics model. Mark-recapture and removal sampling schemes and a random catch sampling method are covered for the estimation of the population size. In addition, a method for estimating the stock composition of a salmon catch based on DNA samples is presented. For most of the articles, Markov chain Monte Carlo (MCMC) simulation has been used as a tool to approximate the posterior distribution. Problems arising from the sampling method are also briefly discussed and potential solutions for these problems are proposed. Special emphasis in the discussion is given to the philosophical foundation of the Bayesian approach in the context of fisheries stock assessment. It is argued that the role of subjective prior knowledge needed in practically all parts of a Bayesian model should be recognized and consequently fully utilised in the process of model formulation.
Abstract:
Elucidating the mechanisms responsible for the patterns of species abundance, diversity, and distribution within and across ecological systems is a fundamental research focus in ecology. Species abundance patterns are shaped in a convoluted way by interplays between inter-/intra-specific interactions, environmental forcing, demographic stochasticity, and dispersal. Comprehensive models and suitable inferential and computational tools for teasing out these different factors are quite limited, even though such tools are critically needed to guide the implementation of management and conservation strategies, the efficacy of which rests on a realistic evaluation of the underlying mechanisms. This is even more so in the prevailing context of concerns over climate change progress and its potential impacts on ecosystems. This thesis utilized the flexible hierarchical Bayesian modelling framework in combination with the computer intensive methods known as Markov chain Monte Carlo, to develop methodologies for identifying and evaluating the factors that control the structure and dynamics of ecological communities. These methodologies were used to analyze data from a range of taxa: macro-moths (Lepidoptera), fish, crustaceans, birds, and rodents. Environmental stochasticity emerged as the most important driver of community dynamics, followed by density dependent regulation; the influence of inter-specific interactions on community-level variances was broadly minor. This thesis contributes to the understanding of the mechanisms underlying the structure and dynamics of ecological communities, by showing directly that environmental fluctuations rather than inter-specific competition dominate the dynamics of several systems. This finding emphasizes the need to better understand how species are affected by the environment and acknowledge species differences in their responses to environmental heterogeneity, if we are to effectively model and predict their dynamics (e.g. for management and conservation purposes). The thesis also proposes a model-based approach to integrating the niche and neutral perspectives on community structure and dynamics, making it possible for the relative importance of each category of factors to be evaluated in light of field data.
Abstract:
Many studies on birds focus on the collection of data through an experimental design, suitable for investigation in a classical analysis of variance (ANOVA) framework. Although many findings are confirmed by one or more experts, expert information is rarely used in conjunction with the survey data to enhance the explanatory and predictive power of the model. We explore this neglected aspect of ecological modelling through a study on Australian woodland birds, focusing on the potential impact of different intensities of commercial cattle grazing on bird density in woodland habitat. We examine a number of Bayesian hierarchical random effects models, which cater for overdispersion and a high frequency of zeros in the data using WinBUGS and explore the variation between and within different grazing regimes and species. The impact and value of expert information is investigated through the inclusion of priors that reflect the experience of 20 experts in the field of bird responses to disturbance. Results indicate that expert information moderates the survey data, especially in situations where there are little or no data. When experts agreed, credible intervals for predictions were tightened considerably. When experts failed to agree, results were similar to those evaluated in the absence of expert information. Overall, we found that without expert opinion our knowledge was quite weak. The fact that the survey data is quite consistent, in general, with expert opinion shows that we do know something about birds and grazing and we could learn a lot faster if we used this approach more in ecology, where data are scarce. Copyright (c) 2005 John Wiley & Sons, Ltd.
Abstract:
Monotony has been identified as a contributing factor to road crashes. Drivers’ ability to react to unpredictable events deteriorates when exposed to highly predictable and uneventful driving tasks, such as driving on Australian rural roads, many of which are monotonous by nature. Highway design in particular attempts to reduce the driver’s task to a merely lane-keeping one. Such a task provides little stimulation and is monotonous, thus affecting the driver’s attention which is no longer directed towards the road. Inattention contributes to crashes, especially for professional drivers. Monotony has been studied mainly from the endogenous perspective (for instance through sleep deprivation) without taking into account the influence of the task itself (repetitiveness) or the surrounding environment. The aim and novelty of this thesis is to develop a methodology (mathematical framework) able to predict driver lapses of vigilance under monotonous environments in real time, using endogenous and exogenous data collected from the driver, the vehicle and the environment. Existing approaches have tended to neglect the specificity of task monotony, leaving the question of the existence of a “monotonous state” unanswered. Furthermore the issue of detecting vigilance decrement before it occurs (predictions) has not been investigated in the literature, let alone in real time. A multidisciplinary approach is necessary to explain how vigilance evolves in monotonous conditions. Such an approach needs to draw on psychology, physiology, road safety, computer science and mathematics. The systemic approach proposed in this study is unique with its predictive dimension and allows us to define, in real time, the impacts of monotony on the driver’s ability to drive. Such methodology is based on mathematical models integrating data available in vehicles to the vigilance state of the driver during a monotonous driving task in various environments. 
The model integrates different data measuring the driver’s endogenous and exogenous factors (related to the driver, the vehicle and the surrounding environment). Electroencephalography (EEG) is used to measure driver vigilance since it has been shown to be the most reliable real-time methodology to assess vigilance level. A variety of mathematical models are suitable to provide a framework for predictions; to find the most accurate, a collection of mathematical models was trained in this thesis and the most reliable was identified. The methodology developed in this research is first applied to a theoretically sound measure of sustained attention called Sustained Attention Response to Task (SART) as adapted by Michael (2010), Michael and Meuter (2006, 2007). This experiment induced impairments due to monotony during a vigilance task. Analyses performed in this thesis confirm and extend findings from Michael (2010) that monotony leads to an important vigilance impairment independent of fatigue. This thesis is also the first to show that monotony changes the dynamics of vigilance evolution and tends to create a “monotonous state” characterised by reduced vigilance. Personality traits such as being a low sensation seeker can mitigate this vigilance decrement. It is also evident that lapses in vigilance can be predicted accurately with Bayesian modelling and Neural Networks. This framework was then applied to the driving task by designing a simulated monotonous driving task. The design of such a task requires multidisciplinary knowledge and involved psychologist Rebecca Michael. Monotony was varied through both the road design and the road environment variables. This experiment demonstrated that road monotony can lead to driving impairment. In particular, monotonous road scenery was shown to have more impact than monotonous road design. 
Next, this study identified a variety of surrogate measures that are correlated with vigilance levels obtained from the EEG. Such vigilance states can be predicted with these surrogate measures. This means that vigilance decrement can be detected in a car without the use of an EEG device. Amongst the different mathematical models tested in this thesis, only Neural Networks predicted the vigilance levels accurately. The results of both these experiments provide valuable information about the methodology to predict vigilance decrement. Such an issue is quite complex and requires modelling that can adapt to large inter-individual differences. Only Neural Networks proved accurate in both studies, suggesting that these models are the most likely to be accurate when used on real roads or for further research on vigilance modelling. This research provides a better understanding of the driving task under monotonous conditions. Results demonstrate that mathematical modelling can be used to determine the driver’s vigilance state when driving using surrogate measures identified during this study. This research has opened up avenues for future research and could result in the development of an in-vehicle device predicting driver vigilance decrement. Such a device could contribute to a reduction in crashes and therefore improve road safety.
Abstract:
Intelligible and accurate risk-based decision-making requires a complex balance of information from different sources, appropriate statistical analysis of this information and consequent intelligent inference and decisions made on the basis of these analyses. Importantly, this requires an explicit acknowledgement of uncertainty in the inputs and outputs of the statistical model. The aim of this paper is to progress a discussion of these issues in the context of several motivating problems related to the wider scope of agricultural production. These problems include biosecurity surveillance design, pest incursion, environmental monitoring and import risk assessment. The information to be integrated includes observational and experimental data, remotely sensed data and expert information. We describe our efforts in addressing these problems using Bayesian models and Bayesian networks. These approaches provide a coherent and transparent framework for modelling complex systems, combining the different information sources, and allowing for uncertainty in inputs and outputs. While the theory underlying Bayesian modelling has a long and well established history, its application is only now becoming more possible for complex problems, due to increased availability of methodological and computational tools. Of course, there are still hurdles and constraints, which we also address through sharing our endeavours and experiences.
Abstract:
An experimental study has been performed to investigate the ignition delay of a modern heavy-duty common-rail diesel engine run with fumigated ethanol substitutions up to 40% on an energy basis. The ignition delay was determined through the use of statistical modelling in a Bayesian framework. This framework allows for the accurate determination of the start of combustion from single consecutive cycles and does not require any differentiation of the in-cylinder pressure signal. At full load the ignition delay has been shown to decrease with increasing ethanol substitutions, and evidence of combustion with high ethanol substitutions prior to diesel injection has also been shown experimentally and by modelling. At half load, by contrast, increasing ethanol substitutions increased the ignition delay. A threshold absolute air to fuel ratio (mole basis) of above ~110 for consistent operation has been determined from the inter-cycle variability of the ignition delay, a result that agrees well with previous research on other in-cylinder parameters and further highlights the correlation between the air to fuel ratio and inter-cycle variability. Numerical modelling to investigate the sensitivity of ethanol combustion has also been performed. It has been shown that ethanol combustion is sensitive to the initial air temperature around the feasible operating conditions of the engine. Moreover, a negative temperature coefficient region of approximately 900–1050 K (the approximate temperature at fuel injection) has been shown for n-heptane and n-heptane/ethanol blends in the numerical modelling. A consequence of this is that the dominant effect influencing the ignition delay under increasing ethanol substitutions may be an increase in chemical reactions rather than in-cylinder temperature. Further investigation revealed that the chemical reactions at low ethanol substitutions are different compared to the high (> 20%) ethanol substitutions.
Abstract:
Introduced in this paper is a Bayesian model for isolating the resonant frequency from combustion chamber resonance. The model shown in this paper focuses on characterising the initial rise in the resonant frequency to investigate the rise of in-cylinder bulk temperature associated with combustion. By resolving the model parameters, it is possible to determine: the start of pre-mixed combustion, the start of diffusion combustion, the initial resonant frequency, the resonant frequency as a function of crank angle, the in-cylinder bulk temperature as a function of crank angle and the trapped mass as a function of crank angle. The Bayesian method allows for individual cycles to be examined without cycle-averaging, allowing inter-cycle variability studies. Results are shown for a turbo-charged, common-rail compression ignition engine run at 2000 rpm and full load.
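The link between resonant frequency and bulk temperature can be sketched with the textbook relation for acoustic modes of a cylindrical chamber, f = α·c / (π·B), with speed of sound c = sqrt(γ·R·T). This is an assumption for illustration only: the abstract does not state which relation the paper uses, and the mode constant, gas properties and bore below are hypothetical values.

```python
import math

# Assumed values, not taken from the paper.
ALPHA_FIRST_CIRCUMFERENTIAL = 1.841  # Bessel constant of the first circumferential mode
GAMMA = 1.35                         # assumed ratio of specific heats
R_GAS = 287.0                        # assumed specific gas constant, J/(kg K)


def resonant_frequency(temp_k, bore_m, alpha=ALPHA_FIRST_CIRCUMFERENTIAL):
    """Resonant frequency (Hz) for a given bulk gas temperature (K) and bore (m)."""
    speed_of_sound = math.sqrt(GAMMA * R_GAS * temp_k)
    return alpha * speed_of_sound / (math.pi * bore_m)


def bulk_temperature(freq_hz, bore_m, alpha=ALPHA_FIRST_CIRCUMFERENTIAL):
    """Invert the relation: bulk gas temperature (K) from resonant frequency (Hz)."""
    speed_of_sound = math.pi * bore_m * freq_hz / alpha
    return speed_of_sound ** 2 / (GAMMA * R_GAS)


# Hypothetical example: a 0.1 m bore and a resonance measured at 4 kHz.
estimated_temp = bulk_temperature(4000.0, 0.1)
```

Under this relation temperature scales with the square of frequency, so a rising resonant frequency over crank angle corresponds to a rising bulk temperature.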