275 resultados para Bayesian hierarchical models
Resumo:
Background In order to increase the efficient allocation of soil-transmitted helminth (STH) disease control resources in the Philippines, we aimed to describe for the first time the spatial variation in the prevalence of A. lumbricoides, T. trichiura and hookworm across the country, quantify the association between the physical environment and spatial variation of STH infection and develop predictive risk maps for each infection. Methodology/Principal Findings Data on STH infection from 35,573 individuals across the country were geolocated at the barangay level and included in the analysis. The analysis was stratified geographically in two major regions: 1) Luzon and the Visayas and 2) Mindanao. Bayesian geostatistical models of STH prevalence were developed, including age and sex of individuals and environmental variables (rainfall, land surface temperature and distance to inland water bodies) as predictors, and diagnostic uncertainty was incorporated. The role of environmental variables was different between regions of the Philippines. This analysis revealed that while A. lumbricoides and T. trichiura infections were widespread and highly endemic, hookworm infections were more circumscribed to smaller foci in the Visayas and Mindanao. Conclusions/Significance This analysis revealed significant spatial variation in STH infection prevalence within provinces of the Philippines. This suggests that a spatially targeted approach to STH interventions, including mass drug administration, is warranted. When financially possible, additional STH surveys should be prioritized to high-risk areas identified by our study in Luzon.
Resumo:
Plant biosecurity requires statistical tools to interpret field surveillance data in order to manage pest incursions that threaten crop production and trade. Ultimately, management decisions need to be based on the probability that an area is infested or free of a pest. Current informal approaches to delimiting pest extent rely upon expert ecological interpretation of presence / absence data over space and time. Hierarchical Bayesian models provide a cohesive statistical framework that can formally integrate the available information on both pest ecology and data. The overarching method involves constructing an observation model for the surveillance data, conditional on the hidden extent of the pest and uncertain detection sensitivity. The extent of the pest is then modelled as a dynamic invasion process that includes uncertainty in ecological parameters. Modelling approaches to assimilate this information are explored through case studies on spiralling whitefly, Aleurodicus dispersus and red banded mango caterpillar, Deanolis sublimbalis. Markov chain Monte Carlo simulation is used to estimate the probable extent of pests, given the observation and process model conditioned by surveillance data. Statistical methods, based on time-to-event models, are developed to apply hierarchical Bayesian models to early detection programs and to demonstrate area freedom from pests. The value of early detection surveillance programs is demonstrated through an application to interpret surveillance data for exotic plant pests with uncertain spread rates. The model suggests that typical early detection programs provide a moderate reduction in the probability of an area being infested but a dramatic reduction in the expected area of incursions at a given time. Estimates of spiralling whitefly extent are examined at local, district and state-wide scales. The local model estimates the rate of natural spread and the influence of host architecture, host suitability and inspector efficiency. These parameter estimates can support the development of robust surveillance programs. Hierarchical Bayesian models for the human-mediated spread of spiralling whitefly are developed for the colonisation of discrete cells connected by a modified gravity model. By estimating dispersal parameters, the model can be used to predict the extent of the pest over time. An extended model predicts the climate restricted distribution of the pest in Queensland. These novel human-mediated movement models are well suited to demonstrating area freedom at coarse spatio-temporal scales. At finer scales, and in the presence of ecological complexity, exploratory models are developed to investigate the capacity for surveillance information to estimate the extent of red banded mango caterpillar. It is apparent that excessive uncertainty about observation and ecological parameters can impose limits on inference at the scales required for effective management of response programs. The thesis contributes novel statistical approaches to estimating the extent of pests and develops applications to assist decision-making across a range of plant biosecurity surveillance activities. Hierarchical Bayesian modelling is demonstrated as both a useful analytical tool for estimating pest extent and a natural investigative paradigm for developing and focussing biosecurity programs.
Resumo:
Early detection surveillance programs aim to find invasions of exotic plant pests and diseases before they are too widespread to eradicate. However, the value of these programs can be difficult to justify when no positive detections are made. To demonstrate the value of pest absence information provided by these programs, we use a hierarchical Bayesian framework to model estimates of incursion extent with and without surveillance. A model for the latent invasion process provides the baseline against which surveillance data are assessed. Ecological knowledge and pest management criteria are introduced into the model using informative priors for invasion parameters. Observation models assimilate information from spatio-temporal presence/absence data to accommodate imperfect detection and generate posterior estimates of pest extent. When applied to an early detection program operating in Queensland, Australia, the framework demonstrates that this typical surveillance regime provides a modest reduction in the estimate that a surveyed district is infested. More importantly, the model suggests that early detection surveillance programs can provide a dramatic reduction in the putative area of incursion and therefore offer a substantial benefit to incursion management. By mapping spatial estimates of the point probability of infestation, the model identifies where future surveillance resources can be most effectively deployed.
Resumo:
Background Multilevel and spatial models are being increasingly used to obtain substantive information on area-level inequalities in cancer survival. Multilevel models assume independent geographical areas, whereas spatial models explicitly incorporate geographical correlation, often via a conditional autoregressive prior. However the relative merits of these methods for large population-based studies have not been explored. Using a case-study approach, we report on the implications of using multilevel and spatial survival models to study geographical inequalities in all-cause survival. Methods Multilevel discrete-time and Bayesian spatial survival models were used to study geographical inequalities in all-cause survival for a population-based colorectal cancer cohort of 22,727 cases aged 20–84 years diagnosed during 1997–2007 from Queensland, Australia. Results Both approaches were viable on this large dataset, and produced similar estimates of the fixed effects. After adding area-level covariates, the between-area variability in survival using multilevel discrete-time models was no longer significant. Spatial inequalities in survival were also markedly reduced after adjusting for aggregated area-level covariates. Only the multilevel approach however, provided an estimation of the contribution of geographical variation to the total variation in survival between individual patients. Conclusions With little difference observed between the two approaches in the estimation of fixed effects, multilevel models should be favored if there is a clear hierarchical data structure and measuring the independent impact of individual- and area-level effects on survival differences is of primary interest. Bayesian spatial analyses may be preferred if spatial correlation between areas is important and if the priority is to assess small-area variations in survival and map spatial patterns. Both approaches can be readily fitted to geographically enabled survival data from international settings
Resumo:
This thesis addresses computational challenges arising from Bayesian analysis of complex real-world problems. Many of the models and algorithms designed for such analysis are ‘hybrid’ in nature, in that they are a composition of components for which their individual properties may be easily described but the performance of the model or algorithm as a whole is less well understood. The aim of this research project is to after a better understanding of the performance of hybrid models and algorithms. The goal of this thesis is to analyse the computational aspects of hybrid models and hybrid algorithms in the Bayesian context. The first objective of the research focuses on computational aspects of hybrid models, notably a continuous finite mixture of t-distributions. In the mixture model, an inference of interest is the number of components, as this may relate to both the quality of model fit to data and the computational workload. The analysis of t-mixtures using Markov chain Monte Carlo (MCMC) is described and the model is compared to the Normal case based on the goodness of fit. Through simulation studies, it is demonstrated that the t-mixture model can be more flexible and more parsimonious in terms of number of components, particularly for skewed and heavytailed data. The study also reveals important computational issues associated with the use of t-mixtures, which have not been adequately considered in the literature. The second objective of the research focuses on computational aspects of hybrid algorithms for Bayesian analysis. Two approaches will be considered: a formal comparison of the performance of a range of hybrid algorithms and a theoretical investigation of the performance of one of these algorithms in high dimensions. For the first approach, the delayed rejection algorithm, the pinball sampler, the Metropolis adjusted Langevin algorithm, and the hybrid version of the population Monte Carlo (PMC) algorithm are selected as a set of examples of hybrid algorithms. Statistical literature shows how statistical efficiency is often the only criteria for an efficient algorithm. In this thesis the algorithms are also considered and compared from a more practical perspective. This extends to the study of how individual algorithms contribute to the overall efficiency of hybrid algorithms, and highlights weaknesses that may be introduced by the combination process of these components in a single algorithm. The second approach to considering computational aspects of hybrid algorithms involves an investigation of the performance of the PMC in high dimensions. It is well known that as a model becomes more complex, computation may become increasingly difficult in real time. In particular the importance sampling based algorithms, including the PMC, are known to be unstable in high dimensions. This thesis examines the PMC algorithm in a simplified setting, a single step of the general sampling, and explores a fundamental problem that occurs in applying importance sampling to a high-dimensional problem. The precision of the computed estimate from the simplified setting is measured by the asymptotic variance of the estimate under conditions on the importance function. Additionally, the exponential growth of the asymptotic variance with the dimension is demonstrated and we illustrates that the optimal covariance matrix for the importance function can be estimated in a special case.
Resumo:
It is important to examine the nature of the relationships between roadway, environmental, and traffic factors and motor vehicle crashes, with the aim to improve the collective understanding of causal mechanisms involved in crashes and to better predict their occurrence. Statistical models of motor vehicle crashes are one path of inquiry often used to gain these initial insights. Recent efforts have focused on the estimation of negative binomial and Poisson regression models (and related deviants) due to their relatively good fit to crash data. Of course analysts constantly seek methods that offer greater consistency with the data generating mechanism (motor vehicle crashes in this case), provide better statistical fit, and provide insight into data structure that was previously unavailable. One such opportunity exists with some types of crash data, in particular crash-level data that are collected across roadway segments, intersections, etc. It is argued in this paper that some crash data possess hierarchical structure that has not routinely been exploited. This paper describes the application of binomial multilevel models of crash types using 548 motor vehicle crashes collected from 91 two-lane rural intersections in the state of Georgia. Crash prediction models are estimated for angle, rear-end, and sideswipe (both same direction and opposite direction) crashes. The contributions of the paper are the realization of hierarchical data structure and the application of a theoretically appealing and suitable analysis approach for multilevel data, yielding insights into intersection-related crashes by crash type.
Resumo:
A time series method for the determination of combustion chamber resonant frequencies is outlined. This technique employs the use of Markov-chain Monte Carlo (MCMC) to infer parameters in a chosen model of the data. The development of the model is included and the resonant frequency is characterised as a function of time. Potential applications for cycle-by-cycle analysis are discussed and the bulk temperature of the gas and the trapped mass in the combustion chamber are evaluated as a function of time from resonant frequency information.
Resumo:
Modern technology now has the ability to generate large datasets over space and time. Such data typically exhibit high autocorrelations over all dimensions. The field trial data motivating the methods of this paper were collected to examine the behaviour of traditional cropping and to determine a cropping system which could maximise water use for grain production while minimising leakage below the crop root zone. They consist of moisture measurements made at 15 depths across 3 rows and 18 columns, in the lattice framework of an agricultural field. Bayesian conditional autoregressive (CAR) models are used to account for local site correlations. Conditional autoregressive models have not been widely used in analyses of agricultural data. This paper serves to illustrate the usefulness of these models in this field, along with the ease of implementation in WinBUGS, a freely available software package. The innovation is the fitting of separate conditional autoregressive models for each depth layer, the ‘layered CAR model’, while simultaneously estimating depth profile functions for each site treatment. Modelling interest also lay in how best to model the treatment effect depth profiles, and in the choice of neighbourhood structure for the spatial autocorrelation model. The favoured model fitted the treatment effects as splines over depth, and treated depth, the basis for the regression model, as measured with error, while fitting CAR neighbourhood models by depth layer. It is hierarchical, with separate onditional autoregressive spatial variance components at each depth, and the fixed terms which involve an errors-in-measurement model treat depth errors as interval-censored measurement error. The Bayesian framework permits transparent specification and easy comparison of the various complex models compared.
Resumo:
PySSM is a Python package that has been developed for the analysis of time series using linear Gaussian state space models (SSM). PySSM is easy to use; models can be set up quickly and efficiently and a variety of different settings are available to the user. It also takes advantage of scientific libraries Numpy and Scipy and other high level features of the Python language. PySSM is also used as a platform for interfacing between optimised and parallelised Fortran routines. These Fortran routines heavily utilise Basic Linear Algebra (BLAS) and Linear Algebra Package (LAPACK) functions for maximum performance. PySSM contains classes for filtering, classical smoothing as well as simulation smoothing.
Resumo:
This thesis develops a detailed conceptual design method and a system software architecture defined with a parametric and generative evolutionary design system to support an integrated interdisciplinary building design approach. The research recognises the need to shift design efforts toward the earliest phases of the design process to support crucial design decisions that have a substantial cost implication on the overall project budget. The overall motivation of the research is to improve the quality of designs produced at the author's employer, the General Directorate of Major Works (GDMW) of the Saudi Arabian Armed Forces. GDMW produces many buildings that have standard requirements, across a wide range of environmental and social circumstances. A rapid means of customising designs for local circumstances would have significant benefits. The research considers the use of evolutionary genetic algorithms in the design process and the ability to generate and assess a wider range of potential design solutions than a human could manage. This wider ranging assessment, during the early stages of the design process, means that the generated solutions will be more appropriate for the defined design problem. The research work proposes a design method and system that promotes a collaborative relationship between human creativity and the computer capability. The tectonic design approach is adopted as a process oriented design that values the process of design as much as the product. The aim is to connect the evolutionary systems to performance assessment applications, which are used as prioritised fitness functions. This will produce design solutions that respond to their environmental and function requirements. This integrated, interdisciplinary approach to design will produce solutions through a design process that considers and balances the requirements of all aspects of the design. Since this thesis covers a wide area of research material, 'methodological pluralism' approach was used, incorporating both prescriptive and descriptive research methods. Multiple models of research were combined and the overall research was undertaken following three main stages, conceptualisation, developmental and evaluation. The first two stages lay the foundations for the specification of the proposed system where key aspects of the system that have not previously been proven in the literature, were implemented to test the feasibility of the system. As a result of combining the existing knowledge in the area with the newlyverified key aspects of the proposed system, this research can form the base for a future software development project. The evaluation stage, which includes building the prototype system to test and evaluate the system performance based on the criteria defined in the earlier stage, is not within the scope this thesis. The research results in a conceptual design method and a proposed system software architecture. The proposed system is called the 'Hierarchical Evolutionary Algorithmic Design (HEAD) System'. The HEAD system has shown to be feasible through the initial illustrative paper-based simulation. The HEAD system consists of the two main components - 'Design Schema' and the 'Synthesis Algorithms'. The HEAD system reflects the major research contribution in the way it is conceptualised, while secondary contributions are achieved within the system components. The design schema provides constraints on the generation of designs, thus enabling the designer to create a wide range of potential designs that can then be analysed for desirable characteristics. The design schema supports the digital representation of the human creativity of designers into a dynamic design framework that can be encoded and then executed through the use of evolutionary genetic algorithms. The design schema incorporates 2D and 3D geometry and graph theory for space layout planning and building formation using the Lowest Common Design Denominator (LCDD) of a parameterised 2D module and a 3D structural module. This provides a bridge between the standard adjacency requirements and the evolutionary system. The use of graphs as an input to the evolutionary algorithm supports the introduction of constraints in a way that is not supported by standard evolutionary techniques. The process of design synthesis is guided as a higher level description of the building that supports geometrical constraints. The Synthesis Algorithms component analyses designs at four levels, 'Room', 'Layout', 'Building' and 'Optimisation'. At each level multiple fitness functions are embedded into the genetic algorithm to target the specific requirements of the relevant decomposed part of the design problem. Decomposing the design problem to allow for the design requirements of each level to be dealt with separately and then reassembling them in a bottom up approach reduces the generation of non-viable solutions through constraining the options available at the next higher level. The iterative approach, in exploring the range of design solutions through modification of the design schema as the understanding of the design problem improves, assists in identifying conflicts in the design requirements. Additionally, the hierarchical set-up allows the embedding of multiple fitness functions into the genetic algorithm, each relevant to a specific level. This supports an integrated multi-level, multi-disciplinary approach. The HEAD system promotes a collaborative relationship between human creativity and the computer capability. The design schema component, as the input to the procedural algorithms, enables the encoding of certain aspects of the designer's subjective creativity. By focusing on finding solutions for the relevant sub-problems at the appropriate levels of detail, the hierarchical nature of the system assist in the design decision-making process.
Resumo:
The serviceability and safety of bridges are crucial to people’s daily lives and to the national economy. Every effort should be taken to make sure that bridges function safely and properly as any damage or fault during the service life can lead to transport paralysis, catastrophic loss of property or even casualties. Nonetheless, aggressive environmental conditions, ever-increasing and changing traffic loads and aging can all contribute to bridge deterioration. With often constrained budget, it is of significance to identify bridges and bridge elements that should be given higher priority for maintenance, rehabilitation or replacement, and to select optimal strategy. Bridge health prediction is an essential underpinning science to bridge maintenance optimization, since the effectiveness of optimal maintenance decision is largely dependent on the forecasting accuracy of bridge health performance. The current approaches for bridge health prediction can be categorised into two groups: condition ratings based and structural reliability based. A comprehensive literature review has revealed the following limitations of the current modelling approaches: (1) it is not evident in literature to date that any integrated approaches exist for modelling both serviceability and safety aspects so that both performance criteria can be evaluated coherently; (2) complex system modelling approaches have not been successfully applied to bridge deterioration modelling though a bridge is a complex system composed of many inter-related bridge elements; (3) multiple bridge deterioration factors, such as deterioration dependencies among different bridge elements, observed information, maintenance actions and environmental effects have not been considered jointly; (4) the existing approaches are lacking in Bayesian updating ability to incorporate a variety of event information; (5) the assumption of series and/or parallel relationship for bridge level reliability is always held in all structural reliability estimation of bridge systems. To address the deficiencies listed above, this research proposes three novel models based on the Dynamic Object Oriented Bayesian Networks (DOOBNs) approach. Model I aims to address bridge deterioration in serviceability using condition ratings as the health index. The bridge deterioration is represented in a hierarchical relationship, in accordance with the physical structure, so that the contribution of each bridge element to bridge deterioration can be tracked. A discrete-time Markov process is employed to model deterioration of bridge elements over time. In Model II, bridge deterioration in terms of safety is addressed. The structural reliability of bridge systems is estimated from bridge elements to the entire bridge. By means of conditional probability tables (CPTs), not only series-parallel relationship but also complex probabilistic relationship in bridge systems can be effectively modelled. The structural reliability of each bridge element is evaluated from its limit state functions, considering the probability distributions of resistance and applied load. Both Models I and II are designed in three steps: modelling consideration, DOOBN development and parameters estimation. Model III integrates Models I and II to address bridge health performance in both serviceability and safety aspects jointly. The modelling of bridge ratings is modified so that every basic modelling unit denotes one physical bridge element. According to the specific materials used, the integration of condition ratings and structural reliability is implemented through critical failure modes. Three case studies have been conducted to validate the proposed models, respectively. Carefully selected data and knowledge from bridge experts, the National Bridge Inventory (NBI) and existing literature were utilised for model validation. In addition, event information was generated using simulation to demonstrate the Bayesian updating ability of the proposed models. The prediction results of condition ratings and structural reliability were presented and interpreted for basic bridge elements and the whole bridge system. The results obtained from Model II were compared with the ones obtained from traditional structural reliability methods. Overall, the prediction results demonstrate the feasibility of the proposed modelling approach for bridge health prediction and underpin the assertion that the three models can be used separately or integrated and are more effective than the current bridge deterioration modelling approaches. The primary contribution of this work is to enhance the knowledge in the field of bridge health prediction, where more comprehensive health performance in both serviceability and safety aspects are addressed jointly. The proposed models, characterised by probabilistic representation of bridge deterioration in hierarchical ways, demonstrated the effectiveness and pledge of DOOBNs approach to bridge health management. Additionally, the proposed models have significant potential for bridge maintenance optimization. Working together with advanced monitoring and inspection techniques, and a comprehensive bridge inventory, the proposed models can be used by bridge practitioners to achieve increased serviceability and safety as well as maintenance cost effectiveness.
Resumo:
In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utility function considered for precise parameter estimation is based upon the precision of the ABC posterior distribution, which we form efficiently via the ABC rejection algorithm based on pre-computed model simulations. Our focus is on stochastic models and, in particular, we investigate the methodology for Markov process models of epidemics and macroparasite population evolution. The macroparasite example involves a multivariate process and we assess the loss of information from not observing all variables.
Resumo:
The use of Bayesian methodologies for solving optimal experimental design problems has increased. Many of these methods have been found to be computationally intensive for design problems that require a large number of design points. A simulation-based approach that can be used to solve optimal design problems in which one is interested in finding a large number of (near) optimal design points for a small number of design variables is presented. The approach involves the use of lower dimensional parameterisations that consist of a few design variables, which generate multiple design points. Using this approach, one simply has to search over a few design variables, rather than searching over a large number of optimal design points, thus providing substantial computational savings. The methodologies are demonstrated on four applications, including the selection of sampling times for pharmacokinetic and heat transfer studies, and involve nonlinear models. Several Bayesian design criteria are also compared and contrasted, as well as several different lower dimensional parameterisation schemes for generating the many design points.