609 results for Bayesian methods

in Queensland University of Technology - ePrints Archive


Relevance:

100.00%

Abstract:

The research objectives of this thesis were to contribute to Bayesian statistical methodology, specifically to risk assessment methodology and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might give answers to particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems, namely risk assessment analyses for wastewater, and secondly, in a four-dimensional dataset, assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four-dimensional agricultural data. The specific objectives of the research were to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure to incorporate all the experimental uncertainty associated with various constants thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day’s data from the agricultural dataset which satisfactorily captured the complexities of the data; to build a model for several days’ data, in order to consider how the full data might be modelled; and finally to build a model for the full four-dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations.
This work forms five papers, two of which have been published, with two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as needing to be modelled as an ‘errors-in-variables’ problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead, CAR models were used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth. Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model, the applied contribution the analysis of moisture over depth and estimation of the contrast of interest together with its credible intervals.
These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that with large datasets, the use of WinBUGS becomes more problematic because of its highly correlated term-by-term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days’ data, and we show that moisture in the soil for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with increasing variances with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, this approach does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first-stage being analysed as a set of time series. We see this spatio-temporal interaction model as being a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.
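
The neighbourhood structure behind the CAR layered model (neighbours only within the same depth layer) can be sketched as follows. This is an illustration only: the regular grid, the rook (edge-sharing) adjacency, and the function name `layered_neighbours` are assumptions, not details taken from the thesis.

```python
def layered_neighbours(nx, ny, nz):
    """Map each site (x, y, z) to its neighbours within the same depth layer z.

    Assumes a regular nx-by-ny grid at each of nz depths and rook adjacency;
    both are hypothetical choices made for illustration.
    """
    steps = ((-1, 0), (1, 0), (0, -1), (0, 1))
    neighbours = {}
    for z in range(nz):
        for x in range(nx):
            for y in range(ny):
                # Only x and y change, so no neighbour ever crosses a depth layer.
                neighbours[(x, y, z)] = [
                    (x + dx, y + dy, z)
                    for dx, dy in steps
                    if 0 <= x + dx < nx and 0 <= y + dy < ny
                ]
    return neighbours


grid = layered_neighbours(3, 3, 2)
print(grid[(1, 1, 0)])  # the four in-layer neighbours of the centre site
```

Because no edge links different depths, each layer can carry its own structured and unstructured variance, which is the flexibility the abstract describes.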

Relevance:

100.00%

Abstract:

Cancer is the leading contributor to the disease burden in Australia. This thesis develops and applies Bayesian hierarchical models to facilitate an investigation of the spatial and temporal associations for cancer diagnosis and survival among Queenslanders. The key objectives are to document and quantify the importance of spatial inequalities, explore factors influencing these inequalities, and investigate how spatial inequalities change over time. Existing Bayesian hierarchical models are refined, new models and methods developed, and tangible benefits obtained for cancer patients in Queensland. The versatility of using Bayesian models in cancer control is clearly demonstrated through these detailed and comprehensive analyses.

Relevance:

70.00%

Abstract:

Background: The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods: This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results: Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion: In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones), rather than objective reality. Bayesian analysis is (arguably) a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
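
The sensitivity of the frequentist result to the assumed number of silent comparisons can be illustrated with the Šidák adjustment, p_adj = 1 - (1 - p)^m. The sketch below is illustrative only: the raw p-value and the comparison counts are hypothetical and are not taken from the re-analysed investigations.

```python
def sidak_adjust(p, m):
    """Return the Sidak-adjusted p-value for m independent comparisons."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 1.0 - (1.0 - p) ** m


# Hypothetical raw p-value for a reported cluster.
raw_p = 0.0005
for m in (1, 10, 100, 1000):
    # m is the analyst's guess at the number of silent comparisons.
    print(m, round(sidak_adjust(raw_p, m), 4))
```

The adjusted value moves from clearly below to clearly above a conventional 0.05 threshold as the guessed `m` grows, which is the subjectivity the abstract points to.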

Relevance:

70.00%

Abstract:

Objective: We aimed to predict sub-national spatial variation in numbers of people infected with Schistosoma haematobium, and associated uncertainties, in Burkina Faso, Mali and Niger, prior to implementation of national control programmes. Methods: We used national field survey datasets covering a contiguous area of 2,750 × 850 km, from 26,790 school-aged children (5–14 years) in 418 schools. Bayesian geostatistical models were used to predict prevalence of high and low intensity infections and associated 95% credible intervals (CrI). Numbers infected were determined by multiplying predicted prevalence by numbers of school-aged children in 1 km² pixels covering the study area. Findings: Numbers of school-aged children with low-intensity infections were: 433,268 in Burkina Faso, 872,328 in Mali and 580,286 in Niger. Numbers with high-intensity infections were: 416,009 in Burkina Faso, 511,845 in Mali and 254,150 in Niger. 95% CrIs (indicative of uncertainty) were wide; e.g. the mean number of boys aged 10–14 years infected in Mali was 140,200 (95% CrI 6200, 512,100). Conclusion: National aggregate estimates for numbers infected mask important local variation, e.g. most S. haematobium infections in Niger occur in the Niger River valley. Prevalence of high-intensity infections was strongly clustered in foci in western and central Mali, north-eastern and north-western Burkina Faso and the Niger River valley in Niger. Populations in these foci are likely to carry the bulk of the urinary schistosomiasis burden and should receive priority for schistosomiasis control. Uncertainties in predicted prevalence and numbers infected should be acknowledged and taken into consideration by control programme planners.
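
The step from predicted prevalence to numbers infected, with its credible interval, can be sketched as below. All inputs are hypothetical: the pixel populations are made up, and the prevalence draws come from an arbitrary Beta distribution sampled independently per pixel, whereas draws from a geostatistical model would be spatially correlated.

```python
import random


def summarise_numbers_infected(prevalence_draws, pixel_population):
    """Return (mean, lower, upper) for the total number infected.

    prevalence_draws: one list of per-pixel prevalences per posterior draw.
    pixel_population: number of school-aged children in each pixel.
    The bounds are empirical 2.5% and 97.5% quantiles of the per-draw totals.
    """
    totals = sorted(
        sum(p * n for p, n in zip(draw, pixel_population))
        for draw in prevalence_draws
    )
    k = len(totals)
    lower = totals[int(0.025 * (k - 1))]
    upper = totals[int(0.975 * (k - 1))]
    return sum(totals) / k, lower, upper


random.seed(1)
pixel_population = [120, 45, 300]  # hypothetical children per 1 km² pixel
prevalence_draws = [
    [random.betavariate(2, 8) for _ in pixel_population]  # stand-in posterior draws
    for _ in range(2000)
]
mean_total, lower, upper = summarise_numbers_infected(prevalence_draws, pixel_population)
print(round(mean_total), round(lower), round(upper))
```

Summing within each draw before taking quantiles carries the uncertainty through to the aggregate, rather than adding up per-pixel interval bounds.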

Relevance:

70.00%

Abstract:

Regional safety program managers face a daunting challenge in the attempt to reduce deaths, injuries, and economic losses that result from motor vehicle crashes. This difficult mission is complicated by the combination of a large perceived need, small budget, and uncertainty about how effective each proposed countermeasure would be if implemented. A manager can turn to the research record for insight, but the measured effect of a single countermeasure often varies widely from study to study and across jurisdictions. The challenge of converting widespread and conflicting research results into a regionally meaningful conclusion can be addressed by incorporating "subjective" information into a Bayesian analysis framework. Engineering evaluations of crashes provide the subjective input on countermeasure effectiveness in the proposed Bayesian analysis framework. Empirical Bayes approaches are widely used in before-and-after studies and "hot-spot" identification; however, in these cases, the prior information was typically obtained from the data (empirically), not subjective sources. The power and advantages of Bayesian methods for assessing countermeasure effectiveness are presented. Also, an engineering evaluation approach developed at the Georgia Institute of Technology is described. Results are presented from an experiment conducted to assess the repeatability and objectivity of subjective engineering evaluations. In particular, the focus is on the importance, methodology, and feasibility of the subjective engineering evaluation for assessing countermeasures.

Relevance:

70.00%

Abstract:

This paper describes the formalization and application of a methodology to evaluate the safety benefit of countermeasures in the face of uncertainty. To illustrate the methodology, 18 countermeasures for improving safety of at-grade railroad crossings (AGRXs) in the Republic of Korea are considered. Akin to “stated preference” methods in travel survey research, the methodology applies random selection and laws of large numbers to derive accident modification factor (AMF) densities from expert opinions. In a full Bayesian analysis framework, the collective opinions in the form of AMF densities (data likelihood) are combined with prior knowledge (AMF density priors) for the 18 countermeasures to obtain ‘best’ estimates of AMFs (AMF posterior credible intervals). The countermeasures are then compared and recommended based on the largest safety returns with minimum risk (uncertainty). To the author's knowledge, the complete methodology is new and has not previously been applied or reported in the literature. The results demonstrate that the methodology is able to discern anticipated safety benefit differences across candidate countermeasures. For the 18 at-grade railroad crossings considered in this analysis, it was found that the top three performing countermeasures for reducing crashes are in-vehicle warning systems, obstacle detection systems, and constant warning time systems.

Relevance:

70.00%

Abstract:

This paper proposes the use of Bayesian approaches with the cross likelihood ratio (CLR) as a criterion for speaker clustering within a speaker diarization system, using eigenvoice modeling techniques. The CLR has previously been shown to be an effective decision criterion for speaker clustering using Gaussian mixture models. Recently, eigenvoice modeling has become an increasingly popular technique, due to its ability to adequately represent a speaker based on sparse training data, as well as to provide an improved capture of differences in speaker characteristics. The integration of eigenvoice modeling into the CLR framework to capitalize on the advantage of both techniques has also been shown to be beneficial for the speaker clustering task. Building on that success, this paper proposes the use of Bayesian methods to compute the conditional probabilities in computing the CLR, thus effectively combining the eigenvoice-CLR framework with the advantages of a Bayesian approach to the diarization problem. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show an improved clustering performance, resulting in a 33.5% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.

Relevance:

70.00%

Abstract:

Obtaining attribute values of non-chosen alternatives in a revealed preference context is challenging because non-chosen alternative attributes are unobserved by choosers, chooser perceptions of attribute values may not reflect reality, existing methods for imputing these values suffer from shortcomings, and obtaining non-chosen attribute values is resource intensive. This paper presents a unique Bayesian (multiple) Imputation Multinomial Logit model that imputes unobserved travel times and distances of non-chosen travel modes based on random draws from the conditional posterior distribution of missing values. The calibrated Bayesian (multiple) Imputation Multinomial Logit model imputes non-chosen time and distance values that convincingly replicate observed choice behavior. Although network skims were used for calibration, more realistic data such as supplemental geographically referenced surveys or stated preference data may be preferred. The model is ideally suited for imputing variation in intrazonal non-chosen mode attributes and for assessing the marginal impacts of travel policies, programs, or prices within traffic analysis zones.

Relevance:

70.00%

Abstract:

Background: Preventing risk factor exposure is vital to reduce the high burden from lung cancer. The leading risk factor for developing lung cancer is tobacco smoking. In Australia, despite apparent success in reducing smoking prevalence, there is limited information on small area patterns and small area temporal trends. We sought to estimate spatio-temporal patterns for lung cancer risk factors using routinely collected population-based cancer data. Methods: The analysis used a Bayesian shared component spatio-temporal model, with male and female lung cancer included separately. The shared component reflected exposure to lung cancer risk factors, and was modelled over 477 statistical local areas (SLAs) and 15 years in Queensland, Australia. Analyses were also run adjusting for area-level socioeconomic disadvantage, Indigenous population composition, or remoteness. Results: Strong spatial patterns were observed in the underlying risk factor exposure for both males (median Relative Risk (RR) across SLAs compared to the Queensland average ranged from 0.48-2.00) and females (median RR range across SLAs 0.53-1.80), with high exposure observed in many remote areas. Strong temporal trends were also observed. Males showed a decrease in the underlying risk across time, while females showed an increase followed by a decrease in the final two years. These patterns were largely consistent across each SLA. The high underlying risk estimates observed among disadvantaged, remote and Indigenous areas decreased after adjustment, particularly among females. Conclusion: The modelled underlying exposure appeared to reflect previous smoking prevalence, with a lag period of around 30 years, consistent with the time taken to develop lung cancer. The consistent temporal trends in lung cancer risk factors across small areas support the hypothesis that past interventions have been equally effective across the state. However, this also means that spatial inequalities have remained unaddressed, highlighting the potential for future interventions, particularly among remote areas.

Relevance:

70.00%

Abstract:

Provides an accessible foundation to Bayesian analysis using real world models. This book aims to present an introduction to Bayesian modelling and computation, by considering real case studies drawn from diverse fields spanning ecology, health, genetics and finance. Each chapter comprises a description of the problem, the corresponding model, the computational method, results and inferences as well as the issues that arise in the implementation of these approaches. Case Studies in Bayesian Statistical Modelling and Analysis:
• Illustrates how to do Bayesian analysis in a clear and concise manner using real-world problems.
• Each chapter focuses on a real-world problem and describes the way in which the problem may be analysed using Bayesian methods.
• Features approaches that can be used in a wide area of application, such as health, the environment, genetics, information science, medicine, biology, industry and remote sensing.
Case Studies in Bayesian Statistical Modelling and Analysis is aimed at statisticians, researchers and practitioners who have some expertise in statistical modelling and analysis, and some understanding of the basics of Bayesian statistics, but little experience in its application. Graduate students of statistics and biostatistics will also find this book beneficial.

Relevance:

70.00%

Abstract:

A flexible and simple Bayesian decision-theoretic design for dose-finding trials is proposed in this paper. In order to reduce the computational burden, we adopt a working model with conjugate priors, which is flexible enough to fit all monotonic dose-toxicity curves and produces analytic posterior distributions. We also discuss how to use a proper utility function to reflect the interest of the trial. Patients are allocated based on not only the utility function but also the chosen dose selection rule. The most popular dose selection rule is the one-step-look-ahead (OSLA), which selects the best-so-far dose. A more complicated rule, such as the two-step-look-ahead, is theoretically more efficient than the OSLA only when the required distributional assumptions are met, which is, however, often not the case in practice. We carried out extensive simulation studies to evaluate these two dose selection rules and found that OSLA was often more efficient than two-step-look-ahead under the proposed Bayesian structure. Moreover, our simulation results show that the proposed Bayesian method's performance is superior to several popular Bayesian methods and that the negative impact of prior misspecification can be managed in the design stage.
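
How a conjugate working model yields analytic posteriors, and how a best-so-far rule then picks the next dose, can be sketched as follows. This is not the paper's design: the independent Beta priors per dose, the target toxicity rate, the utility function and all numbers are assumptions chosen for illustration, and independent priors do not by themselves enforce a monotonic dose-toxicity curve.

```python
def posterior_mean_toxicity(a, b, n_tox, n_patients):
    """Posterior mean of a toxicity rate under a Beta(a, b) prior.

    With binomial data the posterior is Beta(a + n_tox, b + n_patients - n_tox),
    so the mean is available in closed form.
    """
    return (a + n_tox) / (a + b + n_patients)


def osla_next_dose(priors, data, target=0.3):
    """Pick the dose whose posterior mean toxicity is closest to the target.

    priors: list of (a, b) Beta parameters, one pair per dose (hypothetical).
    data:   list of (n_tox, n_patients) observed so far at each dose.
    """
    def utility(i):
        (a, b), (n_tox, n_patients) = priors[i], data[i]
        return -abs(posterior_mean_toxicity(a, b, n_tox, n_patients) - target)

    return max(range(len(priors)), key=utility)


priors = [(1, 9), (2, 8), (3, 7), (5, 5)]  # hypothetical prior guesses per dose
data = [(0, 3), (1, 3), (2, 3), (0, 0)]    # hypothetical (toxicities, patients)
print(osla_next_dose(priors, data))
```

The rule is greedy: it only uses the current posterior, which is why it needs no distributional assumptions about future patients, unlike a two-step look-ahead.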

Relevance:

60.00%

Abstract:

Perez-Losada et al. [1] analyzed 72 complete genomes corresponding to nine mammalian (67 strains) and two avian (5 strains) polyomavirus species using maximum likelihood and Bayesian methods of phylogenetic inference. Because some data for two of the genomes in their work are no longer available in GenBank, in this work we analyze the phylogenetic relationship of the remaining 70 complete genomes corresponding to nine mammalian (65 strains) and two avian (5 strains) polyomavirus species using a dynamical language model approach developed by our group (Yu et al., [26]). This distance method does not require sequence alignment for deriving species phylogeny based on overall similarities of the complete genomes. Our best tree separates the bird polyomaviruses (avian polyomaviruses and goose hemorrhagic polyomaviruses) from the mammalian polyomaviruses, which supports the idea of splitting the genus into two subgenera. Such a split is consistent with the different viral life strategies of each group. In the mammalian polyomavirus subgenus, mouse polyomaviruses (MPV), simian viruses 40 (SV40), BK viruses (BKV) and JC viruses (JCV) are grouped as different branches as expected. The topology of our best tree is quite similar to that of the tree constructed by Perez-Losada et al.

Relevance:

60.00%

Abstract:

Samples of sea water contain phytoplankton taxa in varying amounts, and marine scientists are interested in the relative abundance of each taxon. Their relative biomass can be ascertained indirectly by measuring the quantity of various pigments using high performance liquid chromatography. However, the conversion from pigment to taxa is mathematically non-trivial, as it is a positive matrix factorisation problem where both matrices are unknown beyond the level of initial estimates. The prior information on the pigment-to-taxa conversion matrix is used to give the problem a unique solution. An iteration of two non-negative least squares algorithms gives satisfactory results. Some sample analysis of data indicates prospects for this type of analysis. An alternative, more computationally intensive approach using Bayesian methods is discussed.
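
One way to write down the factorisation and the alternating non-negative least squares iteration is given below. The notation is an assumption, not the paper's: S is the observed samples-by-pigments matrix, C the samples-by-taxa abundance matrix, F the taxa-by-pigments conversion matrix, and F^{(0)} the initial estimate supplied by prior information. How the prior is used to pin down a unique solution is not specified in the abstract and is not represented here.

```latex
% Assumed notation: S (samples x pigments), C (samples x taxa), F (taxa x pigments).
\min_{C \ge 0,\; F \ge 0} \; \lVert S - C F \rVert_F^2

% Alternating non-negative least squares, starting from the prior estimate F^{(0)}:
C^{(k+1)} = \arg\min_{C \ge 0} \; \lVert S - C F^{(k)} \rVert_F^2,
\qquad
F^{(k+1)} = \arg\min_{F \ge 0} \; \lVert S - C^{(k+1)} F \rVert_F^2
```

Each half-step is an ordinary non-negative least squares problem because the other factor is held fixed, so the objective cannot increase from one iteration to the next.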