714 results for Hierarchical models

in Queensland University of Technology - ePrints Archive


Relevance:

100.00%

Publisher:

Abstract:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts, and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers are suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy.
This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (NCs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extension allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.
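
The path sampling idea mentioned above rests on a standard identity for exponential-family models p(x | theta) proportional to exp(theta * S(x)): the derivative of log Z(theta) is the mean of the canonical statistic S, so a log NC ratio is the integral of that mean along a path in theta. The sketch below is only an illustration under simplifying assumptions, not the thesis's IMCS implementation: it uses a hypothetical one-parameter binary MRF on a 3 x 3 lattice, with S(x) taken as the number of neighbouring pairs that are both 1, and it replaces MCMC estimates of the mean by exact enumeration so that the result can be checked against the exact log ratio.

```python
import itertools
import math


def neighbour_pairs(rows, cols):
    """Horizontally and vertically adjacent site pairs on a rows x cols lattice."""
    pairs = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                pairs.append((i, i + 1))
            if r + 1 < rows:
                pairs.append((i, i + cols))
    return pairs


def canonical_stat(x, pairs):
    """Assumed canonical statistic: number of neighbouring pairs that are both 1."""
    return sum(x[i] * x[j] for i, j in pairs)


def exact_log_z_and_mean(theta, stats):
    """Exact log Z(theta) and E_theta[S], by enumerating every lattice state."""
    weights = [math.exp(theta * s) for s in stats]
    z = sum(weights)
    mean_s = sum(w * s for w, s in zip(weights, stats)) / z
    return math.log(z), mean_s


def path_sampling_log_ratio(theta0, theta1, stats, n_grid=200):
    """Trapezoidal integral of E_theta[S] over [theta0, theta1]."""
    h = (theta1 - theta0) / n_grid
    means = [exact_log_z_and_mean(theta0 + k * h, stats)[1]
             for k in range(n_grid + 1)]
    return h * (sum(means) - 0.5 * (means[0] + means[-1]))


pairs = neighbour_pairs(3, 3)
stats = [canonical_stat(x, pairs) for x in itertools.product((0, 1), repeat=9)]
exact = exact_log_z_and_mean(0.8, stats)[0] - exact_log_z_and_mean(0.0, stats)[0]
approx = path_sampling_log_ratio(0.0, 0.8, stats)
print(f"exact log ratio {exact:.6f}, path-integral estimate {approx:.6f}")
```

On lattices of realistic size enumeration is impossible, which is why the expectations along the path would be estimated by simulation instead.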

Relevance:

100.00%

Publisher:

Abstract:

Motorcycles are overrepresented in road traffic crashes and are particularly vulnerable at signalized intersections. The objective of this study is to identify causal factors affecting motorcycle crashes at both four-legged and T signalized intersections. Treating the data as time-series cross-section panels, this study explores different hierarchical Poisson models and finds that the model allowing an autoregressive lag-1 dependence specification in the error term is the most suitable. Results show that the number of lanes at four-legged signalized intersections significantly increases motorcycle crashes, largely because of the higher exposure resulting from higher motorcycle accumulation at the stop line. Furthermore, the presence of a wide median and an uncontrolled left-turn lane at major roadways of four-legged intersections exacerbates this potential hazard. For T signalized intersections, the presence of an exclusive right-turn lane at both major and minor roadways and an uncontrolled left-turn lane at major roadways increases motorcycle crashes. Motorcycle crashes increase on high-speed roadways because motorcyclists are more vulnerable and less likely to react in time during conflicts. The presence of red-light cameras reduces motorcycle crashes significantly at both four-legged and T intersections. With a red-light camera, motorcycles are less exposed to conflicts because they are observed to be more disciplined in queuing at the stop line and less likely to jump-start at the onset of green.

Relevance:

100.00%

Publisher:

Abstract:

Quality-oriented management systems and methods have become the dominant business and governance paradigm. From this perspective, satisfying customers’ expectations by supplying reliable, good quality products and services is the key factor for an organization and even a government. During recent decades, Statistical Quality Control (SQC) methods have been developed as the technical core of quality management and continuous improvement philosophy, and they are now being applied widely to improve the quality of products and services in industrial and business sectors. Recently SQC tools, in particular quality control charts, have been used in healthcare surveillance. In some cases, these tools have been modified and developed to better suit the characteristics and needs of the health sector. It seems that some of the work in the healthcare area has evolved independently of the development of industrial statistical process control methods. Therefore, analysing and comparing paradigms and the characteristics of quality control charts and techniques across the different sectors presents opportunities for transferring knowledge and for future development in each sector. Meanwhile, the Bayesian approach, particularly Bayesian hierarchical models and computational techniques in which all uncertainties are expressed as a structure of probability, facilitates decision making and cost-effectiveness analyses. Therefore, this research investigates the use of the quality improvement cycle in a health setting using clinical data from a hospital. The need for clinical data for monitoring purposes is investigated in two respects. A framework and appropriate tools from the industrial context are proposed and applied to evaluate and improve data quality in available datasets and data flow; then a data capturing algorithm using Bayesian decision making methods is developed to determine an economical sample size for statistical analyses within the quality improvement cycle.
Once clinical data quality has been ensured, some characteristics of control charts in the health context, including the necessity of monitoring attribute data and correlated quality characteristics, are considered. To this end, multivariate control charts from an industrial context are adapted to monitor radiation delivered to patients undergoing diagnostic coronary angiogram, and various risk-adjusted control charts are constructed and investigated for monitoring binary outcomes of clinical interventions as well as post-intervention survival time. Meanwhile, adoption of a Bayesian approach is proposed as a new framework for estimating the change point following a control chart’s signal. This estimate aims to facilitate root cause analysis in the quality improvement cycle, since it narrows the search for the potential causes of detected changes to a tighter time frame prior to the signal. This approach yields highly informative estimates for change point parameters, since the results are expressed as probability distributions. Using Bayesian hierarchical models and Markov chain Monte Carlo computational methods, Bayesian estimators of the time and the magnitude of various change scenarios, including step change, linear trend and multiple change in a Poisson process, are developed and investigated. The benefits of change point investigation are revisited and promoted in monitoring hospital outcomes, where the developed Bayesian estimator reports the true time of the shifts, compared to a priori known causes, detected by control charts in monitoring the rate of excess usage of blood products and major adverse events during and after cardiac surgery in a local hospital. The development of the Bayesian change point estimators is then pursued in healthcare surveillance for processes in which pre-intervention characteristics of patients affect the outcomes.
In this setting, the Bayesian estimator is first extended to capture the patient mix (covariates) through the risk models underlying risk-adjusted control charts. Variations of the estimator are developed to estimate the true time of step changes and linear trends in the odds ratio of intensive care unit outcomes in a local hospital. Secondly, the Bayesian estimator is extended to identify the time of a shift in mean survival time after a clinical intervention which is being monitored by risk-adjusted survival time control charts. In this context, the survival time after a clinical intervention is also affected by patient mix, and the survival function is constructed using a survival prediction model. The simulation study undertaken in each research component and the results obtained strongly recommend the developed Bayesian estimators as a strong alternative for change point estimation within the quality improvement cycle in healthcare surveillance as well as in industrial and business contexts. The superiority of the proposed Bayesian framework and estimators is enhanced when probability quantification, flexibility and generalizability of the developed model are also considered. The empirical results and simulations indicate that the Bayesian estimators are a strong alternative for change point estimation within the quality improvement cycle in healthcare surveillance. The advantages of the Bayesian approach seen in the general context of quality control may also be extended to the industrial and business domains where quality monitoring was initially developed.
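
To make the step-change scenario in a Poisson process concrete, here is a minimal sketch of a change point posterior. It is not the thesis's MCMC-based hierarchical estimator: it assumes a single step change, independent Gamma(a, b) priors (shape a, rate b) on the rates before and after the change, and a uniform prior on the change time tau, which lets the rates be integrated out analytically. The function names, prior values and counts are illustrative assumptions only.

```python
import math


def log_segment_marginal(total, n, a, b):
    """Log marginal likelihood of n Poisson counts summing to `total`, with the
    rate integrated out under a Gamma(a, b) prior. The product of y! terms is
    dropped because it is the same for every candidate change point."""
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + total) - (a + total) * math.log(b + n))


def change_point_posterior(counts, a=1.0, b=1.0):
    """Posterior P(tau | counts) for tau = 1..T-1, where counts[:tau] are
    observed before the step change and counts[tau:] after it."""
    n_obs = len(counts)
    grand_total = sum(counts)
    log_post = []
    before = 0
    for tau in range(1, n_obs):
        before += counts[tau - 1]
        log_post.append(
            log_segment_marginal(before, tau, a, b)
            + log_segment_marginal(grand_total - before, n_obs - tau, a, b)
        )
    top = max(log_post)                      # subtract the max for stability
    weights = [math.exp(lp - top) for lp in log_post]
    norm = sum(weights)
    return [w / norm for w in weights]


# Made-up counts whose rate visibly rises after the 8th observation.
counts = [2, 3, 1, 2, 4, 2, 3, 2, 7, 9, 8, 6, 10, 7, 8, 9]
post = change_point_posterior(counts)
tau_map = 1 + max(range(len(post)), key=post.__getitem__)
print("most probable change time:", tau_map)
```

Because the output is a full probability distribution over tau, it directly gives the kind of probabilistic interval for the change time that the abstract describes, rather than a single point estimate.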

Relevance:

100.00%

Publisher:

Abstract:

Spatial data are now prevalent in a wide range of fields including environmental and health science. This has led to the development of a range of approaches for analysing patterns in these data. In this paper, we compare several Bayesian hierarchical models for analysing point-based data based on the discretization of the study region, resulting in grid-based spatial data. The approaches considered include two parametric models and a semiparametric model. We highlight the methodology and computation for each approach. Two simulation studies are undertaken to compare the performance of these models for various structures of simulated point-based data which resemble environmental data. A case study of a real dataset is also conducted to demonstrate a practical application of the modelling approaches. Goodness-of-fit statistics are computed to compare estimates of the intensity functions. The deviance information criterion is also considered as an alternative model evaluation criterion. The results suggest that the adaptive Gaussian Markov random field model performs well for highly sparse point-based data where there are large variations or clustering across the space, whereas the discretized log Gaussian Cox process produces a good fit for dense and clustered point-based data. One should generally consider the nature and structure of the point-based data in order to choose the appropriate method for modelling discretized spatial point-based data.
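
The discretization step that all of the compared models share can be sketched as follows: point locations in a rectangular study region are binned into a regular grid, and the resulting cell counts become the data for the grid-based model. The function name, grid size and points are illustrative assumptions, not details taken from the paper.

```python
def grid_counts(points, xmin, xmax, ymin, ymax, nx, ny):
    """Count points per cell of an nx-by-ny grid over the rectangle
    [xmin, xmax] x [ymin, ymax]. Returns a list of rows (row 0 at ymin)."""
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            continue  # ignore points outside the study region
        # Points on the upper boundary are assigned to the last cell.
        col = min(int((x - xmin) / (xmax - xmin) * nx), nx - 1)
        row = min(int((y - ymin) / (ymax - ymin) * ny), ny - 1)
        counts[row][col] += 1
    return counts


points = [(0.1, 0.1), (0.9, 0.1), (0.95, 0.05), (1.0, 1.0), (0.5, 0.5)]
cells = grid_counts(points, 0.0, 1.0, 0.0, 1.0, nx=2, ny=2)
print(cells)
```

The choice of nx and ny controls how sparse the cell counts are, which is the property the abstract identifies as driving which model fits best.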

Relevance:

100.00%

Publisher:

Abstract:

description and analysis of geographically indexed health data with respect to demographic, environmental, behavioural, socioeconomic, genetic, and infectious risk factors (Elliott and Wartenberg 2004). Disease maps can be useful for estimating relative risk; ecological analyses, incorporating area and/or individual-level covariates; or cluster analyses (Lawson 2009). As aggregated data are often more readily available, one common method of mapping disease is to aggregate the counts of disease at some geographical areal level, and present them as choropleth maps (Devesa et al. 1999; Population Health Division 2006). Therefore, this chapter will focus exclusively on methods appropriate for areal data...

Relevance:

80.00%

Publisher:

Abstract:

Traditional crash prediction models, such as generalized linear regression models, are incapable of taking into account the multilevel data structure that is pervasive in crash data. Disregarding the possible within-group correlations can lead to models that give unreliable and biased estimates of unknowns. This study proposes a -level hierarchy, viz. (Geographic region level – Traffic site level – Traffic crash level – Driver-vehicle unit level – Vehicle-occupant level) Time level, to establish a general form of multilevel data structure in traffic safety analysis. To properly model the potential cross-group heterogeneity due to the multilevel data structure, a framework of Bayesian hierarchical models that explicitly specify the multilevel structure and correctly yield parameter estimates is introduced and recommended. The proposed method is illustrated in an individual-severity analysis of intersection crashes using the Singapore crash records. This study demonstrates the importance of accounting for the within-group correlations, and the flexibility and effectiveness of the Bayesian hierarchical method in modelling the multilevel structure of traffic crash data.

Relevance:

70.00%

Publisher:

Abstract:

The research objectives of this thesis were to contribute to Bayesian statistical methodology by contributing to risk assessment statistical methodology, and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might give answers to particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems, namely risk assessment analyses for wastewater, and secondly, in a four dimensional dataset, assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four dimensional agricultural data. The specific objectives of the research were to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure to incorporate all the experimental uncertainty associated with various constants thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day’s data from the agricultural dataset which satisfactorily captured the complexities of the data; to build a model for several days’ data, in order to consider how the full data might be modelled; and finally to build a model for the full four dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations.
This work forms five papers, two of which have been published, with two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed, acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals, for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as needing to be modelled as an ‘errors-in-variables’ problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead CAR models are used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth. Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model, the applied contribution the analysis of moisture over depth and estimation of the contrast of interest together with its credible intervals.
These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that with large datasets, the use of WinBUGS becomes more problematic because of its highly correlated term by term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days’ data, and we show that moisture in the soil for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with increasing variances with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, this approach does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first stage being analysed as a set of time series. We see this spatio-temporal interaction model as being a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.

Relevance:

70.00%

Publisher:

Abstract:

This study proposes a full Bayes (FB) hierarchical modeling approach to traffic crash hotspot identification. The FB approach is able to account for all uncertainties associated with crash risk and various risk factors by estimating a posterior distribution of the site safety on which various ranking criteria could be based. Moreover, through hierarchical model specification, the FB approach can flexibly take into account various heterogeneities of crash occurrence due to spatiotemporal effects on traffic safety. Using Singapore intersection crash data (1997-2006), an empirical evaluation was conducted to compare the proposed FB approach with the state-of-the-art approaches. Results show that the Bayesian hierarchical models that accommodate site-specific effects and serial correlation have better goodness-of-fit than non-hierarchical models. Furthermore, all model-based approaches perform significantly better in safety ranking than the naive approach using raw crash counts. The FB hierarchical models were found to significantly outperform the standard EB approach in correctly identifying hotspots.

Relevance:

70.00%

Publisher:

Abstract:

In today's business world, where competition between companies is intense, quality in manufacturing and in providing products and services can be considered a means of pursuing excellence and success in that competition. With the advent of e-commerce and the emergence of new production systems and new organizational structures, traditional management and quality assurance systems have been challenged. Consequently, the quality information system has gained a special place as one of the new tools of quality management. In this paper, the quality information system (QIS) is studied through a review of the literature, and the role and position of the QIS among the other information systems of an organization are investigated. Existing QIS models are analyzed and assessed, and on that basis a conceptual, hierarchical model of the quality information system is proposed and studied. As a case study, the hierarchical model is developed for Shetabkar Co. by evaluating the hierarchical models presented in the field of quality information systems.

Relevance:

60.00%

Publisher:

Abstract:

Background: Achieving health equity has been identified as a major challenge, both internationally and within Australia. Inequalities in cancer outcomes are well documented, and must be quantified before they can be addressed. One method of portraying geographical variation in data uses maps. Recently we have produced thematic maps showing the geographical variation in cancer incidence and survival across Queensland, Australia. This article documents the decisions and rationale used in producing these maps, with the aim to assist others in producing chronic disease atlases. Methods: Bayesian hierarchical models were used to produce the estimates. Justification for the cancers chosen, geographical areas used, modelling method, outcome measures mapped, production of the adjacency matrix, assessment of convergence, sensitivity analyses performed and determination of significant geographical variation is provided. Conclusions: Although careful consideration of many issues is required, chronic disease atlases are a useful tool for assessing and quantifying geographical inequalities. In addition they help focus research efforts to investigate why the observed inequalities exist, which in turn inform advocacy, policy, support and education programs designed to reduce these inequalities.

Relevance:

60.00%

Publisher:

Abstract:

Precise identification of the time when a change in a hospital outcome has occurred enables clinical experts to search for a potential special cause more effectively. In this paper, we develop change point estimation methods for survival time of a clinical procedure in the presence of patient mix in a Bayesian framework. We apply Bayesian hierarchical models to formulate the change point where there exists a step change in the mean survival time of patients who underwent cardiac surgery. The data are right censored since the monitoring is conducted over a limited follow-up period. We capture the effect of risk factors prior to the surgery using a Weibull accelerated failure time regression model. Markov Chain Monte Carlo is used to obtain posterior distributions of the change point parameters including location and magnitude of changes and also corresponding probabilistic intervals and inferences. The performance of the Bayesian estimator is investigated through simulations and the result shows that precise estimates can be obtained when they are used in conjunction with the risk-adjusted survival time CUSUM control charts for different magnitude scenarios. The proposed estimator shows a better performance where a longer follow-up period, censoring time, is applied. In comparison with the alternative built-in CUSUM estimator, more accurate and precise estimates are obtained by the Bayesian estimator. These superiorities are enhanced when probability quantification, flexibility and generalizability of the Bayesian change point detection model are also considered.

Relevance:

60.00%

Publisher:

Abstract:

This study proposes a framework of a model-based hot spot identification method by applying full Bayes (FB) technique. In comparison with the state-of-the-art approach [i.e., empirical Bayes method (EB)], the advantage of the FB method is the capability to seamlessly integrate prior information and all available data into posterior distributions on which various ranking criteria could be based. With intersection crash data collected in Singapore, an empirical analysis was conducted to evaluate the following six approaches for hot spot identification: (a) naive ranking using raw crash data, (b) standard EB ranking, (c) FB ranking using a Poisson-gamma model, (d) FB ranking using a Poisson-lognormal model, (e) FB ranking using a hierarchical Poisson model, and (f) FB ranking using a hierarchical Poisson (AR-1) model. The results show that (a) when using the expected crash rate-related decision parameters, all model-based approaches perform significantly better in safety ranking than does the naive ranking method, and (b) the FB approach using hierarchical models significantly outperforms the standard EB approach in correctly identifying hazardous sites.
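
The contrast between naive ranking and model-based ranking can be illustrated with the simplest of the models listed, the Poisson-gamma model. In this sketch the gamma hyperparameters are fixed, assumed values chosen for illustration; an EB analysis would estimate them from the data and an FB analysis would place priors on them, so this is neither of the paper's procedures. The site data are invented.

```python
def posterior_mean_rate(crashes, years, shape, rate):
    """Posterior mean crash rate per year under a Poisson likelihood and a
    Gamma(shape, rate) prior on the site's rate (conjugate update)."""
    return (shape + crashes) / (rate + years)


# Hypothetical sites: (observed crashes, years of observation).
sites = {"A": (2, 1), "B": (15, 10), "C": (4, 8)}

# Naive ranking: raw crashes per year.
naive = sorted(sites, key=lambda s: sites[s][0] / sites[s][1], reverse=True)

# Model-based ranking: posterior mean with an assumed Gamma(2, 2) prior (mean 1).
shrunk = sorted(
    sites,
    key=lambda s: posterior_mean_rate(*sites[s], shape=2.0, rate=2.0),
    reverse=True,
)
print("naive:", naive, "model-based:", shrunk)
```

Site A tops the naive ranking on the strength of a single year of data, while the posterior mean pulls it towards the prior and ranks the longer-observed site B higher, which is the kind of correction that model-based ranking is meant to provide.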

Relevance:

60.00%

Publisher:

Abstract:

This study examines the influence of cancer stage, distance to treatment facilities and area disadvantage on breast and colorectal cancer spatial survival inequalities. We also estimate the number of premature deaths after adjusting for cancer stage to quantify the impact of spatial survival inequalities. This was a population-based descriptive study of residents aged <90 years in Queensland, Australia diagnosed with primary invasive breast (25,202 females) or colorectal (14,690 males, 11,700 females) cancers during 1996-2007. Bayesian hierarchical models explored relative survival inequalities across 478 regions. Cancer stage and disadvantage explained the spatial inequalities in breast cancer survival; however, spatial inequalities in colorectal cancer survival persisted after adjustment. Of the 6,019 colorectal cancer deaths within 5 years of diagnosis, 470 (8%) were associated with spatial inequalities in non-diagnostic factors, i.e. factors beyond cancer stage at diagnosis. For breast cancers, of 2,412 deaths, 170 (7%) were related to spatial inequalities in non-diagnostic factors. Quantifying premature deaths can increase the incentive for action to reduce these spatial inequalities.

Relevance:

60.00%

Publisher:

Abstract:

The use of hierarchical Bayesian spatial models in the analysis of ecological data is increasingly prevalent. The implementation of these models has heretofore been limited to specifically written software that required extensive programming knowledge to create. The advent of WinBUGS provides access to Bayesian hierarchical models for those without the programming expertise to create their own models and allows for the more rapid implementation of new models and data analysis. This facility is demonstrated here using data collected by the Missouri Department of Conservation for the Missouri Turkey Hunting Survey of 1996. Three models are considered. The first uses the collected data to estimate the success rate for individual hunters at the county level and incorporates a conditional autoregressive (CAR) spatial effect. The second model builds upon the first by simultaneously estimating the success rate and harvest at the county level, while the third estimates the success rate and hunting pressure at the county level. These models are discussed in detail, as are their implementation in WinBUGS and the issues arising therein. Future areas of application for WinBUGS and the latest developments in WinBUGS are discussed as well.

Relevance:

60.00%

Publisher:

Abstract:

Ecological studies, which are common in various disciplines including epidemiology, are based on characteristics of groups of individuals. It is of great interest for epidemiologists to study the geographical variation of a disease by accounting for the positive spatial dependence between neighbouring areas. However, the choice of scale of the spatial correlation requires much attention. In view of a lack of studies in this area, this study aims to investigate the impact of differing definitions of geographical scales using a multilevel model. We propose a new approach, grid-based partitions, and compare it with the popular census region approach. Unexplained geographical variation is accounted for via area-specific unstructured random effects and spatially structured random effects specified as an intrinsic conditional autoregressive process. Using grid-based modelling of random effects in contrast to the census region approach, we illustrate conditions where improvements are observed in the estimation of the linear predictor, random effects, parameters, and the identification of the distribution of residual risk and the aggregate risk in a study region. The study has found that grid-based modelling is a valuable approach for spatially sparse data while the SLA-based and grid-based approaches perform equally well for spatially dense data.
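
For a grid-based partition, the spatially structured random effects need a neighbourhood definition. The sketch below assumes rook adjacency (cells sharing an edge), which the abstract does not specify, and evaluates the standard unnormalised intrinsic CAR log density, -(tau / 2) times the sum of squared differences between neighbouring effects. Function names and values are illustrative.

```python
def rook_neighbour_pairs(rows, cols):
    """Pairs of cell indices (row-major order) that share an edge."""
    pairs = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                pairs.append((i, i + 1))      # neighbour to the right
            if r + 1 < rows:
                pairs.append((i, i + cols))   # neighbour below
    return pairs


def icar_log_density(phi, pairs, tau):
    """Unnormalised intrinsic CAR log density with precision parameter tau."""
    return -0.5 * tau * sum((phi[i] - phi[j]) ** 2 for i, j in pairs)


pairs = rook_neighbour_pairs(2, 3)
flat = icar_log_density([1.0] * 6, pairs, tau=2.0)              # smooth surface
rough = icar_log_density([0, 1, 0, 1, 0, 1], pairs, tau=2.0)    # checkerboard
print(len(pairs), flat, rough)
```

A constant surface incurs no penalty while the checkerboard is penalised on every neighbouring pair, which is how the prior encodes positive spatial dependence; changing the grid resolution changes the pair list and hence the scale of that dependence.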