36 resultados para Priors

em Queensland University of Technology - ePrints Archive


Relevância:

20.00% 20.00%

Publicador:

Resumo:

We report three developments toward resolving the challenge of the apparent basal polytomy of neoavian birds. First, we describe improved conditional down-weighting techniques to reduce noise relative to signal for deeper divergences and find increased agreement between data sets. Second, we present formulae for calculating the probabilities of finding predefined groupings in the optimal tree. Finally, we report a significant increase in data: nine new mitochondrial (mt) genomes (the dollarbird, New Zealand kingfisher, great potoo, Australian owlet-nightjar, white-tailed trogon, barn owl, a roadrunner [a ground cuckoo], New Zealand long-tailed cuckoo, and the peach-faced lovebird) and together they provide data for each of the six main groups of Neoaves proposed by Cracraft J (2001). We use his six main groups of modern birds as priors for evaluation of results. These include passerines, cuckoos, parrots, and three other groups termed “WoodKing” (woodpeckers/rollers/kingfishers), “SCA” (owls/potoos/owlet-nightjars/hummingbirds/swifts), and “Conglomerati.” In general, the support is highly significant with just two exceptions, the owls move from the “SCA” group to the raptors, particularly accipitrids (buzzards/eagles) and the osprey, and the shorebirds may be an independent group from the rest of the “Conglomerati”. Molecular dating mt genomes support a major diversification of at least 12 neoavian lineages in the Late Cretaceous. Our results form a basis for further testing with both nuclear-coding sequences and rare genomic changes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Information that is elicited from experts can be treated as `data', so can be analysed using a Bayesian statistical model, to formulate a prior model. Typically methods for encoding a single expert's knowledge have been parametric, constrained by the extent of an expert's knowledge and energy regarding a target parameter. Interestingly these methods have often been deterministic, in that all elicited information is treated at `face value', without error. Here we sought a parametric and statistical approach for encoding assessments from multiple experts. Our recent work proposed and demonstrated the use of a flexible hierarchical model for this purpose. In contrast to previous mathematical approaches like linear or geometric pooling, our new approach accounts for several sources of variation: elicitation error, encoding error and expert diversity. Of interest are the practical, mathematical and philosophical interpretations of this form of hierarchical pooling (which is both statistical and parametric), and how it fits within the subjective Bayesian paradigm. Case studies from a bioassay and project management (on PhDs) are used to illustrate the approach.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Purpose: Flat-detector, cone-beam computed tomography (CBCT) has enormous potential to improve the accuracy of treatment delivery in image-guided radiotherapy (IGRT). To assist radiotherapists in interpreting these images, we use a Bayesian statistical model to label each voxel according to its tissue type. Methods: The rich sources of prior information in IGRT are incorporated into a hidden Markov random field (MRF) model of the 3D image lattice. Tissue densities in the reference CT scan are estimated using inverse regression and then rescaled to approximate the corresponding CBCT intensity values. The treatment planning contours are combined with published studies of physiological variability to produce a spatial prior distribution for changes in the size, shape and position of the tumour volume and organs at risk (OAR). The voxel labels are estimated using the iterated conditional modes (ICM) algorithm. Results: The accuracy of the method has been evaluated using 27 CBCT scans of an electron density phantom (CIRS, Inc. model 062). The mean voxel-wise misclassification rate was 6.2%, with Dice similarity coefficient of 0.73 for liver, muscle, breast and adipose tissue. Conclusions: By incorporating prior information, we are able to successfully segment CBCT images. This could be a viable approach for automated, online image analysis in radiotherapy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Cone-beam computed tomography (CBCT) has enormous potential to improve the accuracy of treatment delivery in image-guided radiotherapy (IGRT). To assist radiotherapists in interpreting these images, we use a Bayesian statistical model to label each voxel according to its tissue type. The rich sources of prior information in IGRT are incorporated into a hidden Markov random field model of the 3D image lattice. Tissue densities in the reference CT scan are estimated using inverse regression and then rescaled to approximate the corresponding CBCT intensity values. The treatment planning contours are combined with published studies of physiological variability to produce a spatial prior distribution for changes in the size, shape and position of the tumour volume and organs at risk. The voxel labels are estimated using iterated conditional modes. The accuracy of the method has been evaluated using 27 CBCT scans of an electron density phantom. The mean voxel-wise misclassification rate was 6.2\%, with Dice similarity coefficient of 0.73 for liver, muscle, breast and adipose tissue. By incorporating prior information, we are able to successfully segment CBCT images. This could be a viable approach for automated, online image analysis in radiotherapy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The issue of using informative priors for estimation of mixtures at multiple time points is examined. Several different informative priors and an independent prior are compared using samples of actual and simulated aerosol particle size distribution (PSD) data. Measurements of aerosol PSDs refer to the concentration of aerosol particles in terms of their size, which is typically multimodal in nature and collected at frequent time intervals. The use of informative priors is found to better identify component parameters at each time point and more clearly establish patterns in the parameters over time. Some caveats to this finding are discussed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Given the drawbacks for using geo-political areas in mapping outcomes unrelated to geo-politics, a compromise is to aggregate and analyse data at the grid level. This has the advantage of allowing spatial smoothing and modelling at a biologically or physically relevant scale. This article addresses two consequent issues: the choice of the spatial smoothness prior and the scale of the grid. Firstly, we describe several spatial smoothness priors applicable for grid data and discuss the contexts in which these priors can be employed based on different aims. Two such aims are considered, i.e., to identify regions with clustering and to model spatial dependence in the data. Secondly, the choice of the grid size is shown to depend largely on the spatial patterns. We present a guide on the selection of spatial scales and smoothness priors for various point patterns based on the two aims for spatial smoothing.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Reconstructing 3D motion data is highly under-constrained due to several common sources of data loss during measurement, such as projection, occlusion, or miscorrespondence. We present a statistical model of 3D motion data, based on the Kronecker structure of the spatiotemporal covariance of natural motion, as a prior on 3D motion. This prior is expressed as a matrix normal distribution, composed of separable and compact row and column covariances. We relate the marginals of the distribution to the shape, trajectory, and shape-trajectory models of prior art. When the marginal shape distribution is not available from training data, we show how placing a hierarchical prior over shapes results in a convex MAP solution in terms of the trace-norm. The matrix normal distribution, fit to a single sequence, outperforms state-of-the-art methods at reconstructing 3D motion data in the presence of significant data loss, while providing covariance estimates of the imputed points.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Background The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones), rather than objective reality. Bayesian analysis is (arguably) a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

1. Species' distribution modelling relies on adequate data sets to build reliable statistical models with high predictive ability. However, the money spent collecting empirical data might be better spent on management. A less expensive source of species' distribution information is expert opinion. This study evaluates expert knowledge and its source. In particular, we determine whether models built on expert knowledge apply over multiple regions or only within the region where the knowledge was derived. 2. The case study focuses on the distribution of the brush-tailed rock-wallaby Petrogale penicillata in eastern Australia. We brought together from two biogeographically different regions substantial and well-designed field data and knowledge from nine experts. We used a novel elicitation tool within a geographical information system to systematically collect expert opinions. The tool utilized an indirect approach to elicitation, asking experts simpler questions about observable rather than abstract quantities, with measures in place to identify uncertainty and offer feedback. Bayesian analysis was used to combine field data and expert knowledge in each region to determine: (i) how expert opinion affected models based on field data and (ii) how similar expert-informed models were within regions and across regions. 3. The elicitation tool effectively captured the experts' opinions and their uncertainties. Experts were comfortable with the map-based elicitation approach used, especially with graphical feedback. Experts tended to predict lower values of species occurrence compared with field data. 4. Across experts, consensus on effect sizes occurred for several habitat variables. Expert opinion generally influenced predictions from field data. However, south-east Queensland and north-east New South Wales experts had different opinions on the influence of elevation and geology, with these differences attributable to geological differences between these regions. 5. Synthesis and applications. When formulated as priors in Bayesian analysis, expert opinion is useful for modifying or strengthening patterns exhibited by empirical data sets that are limited in size or scope. Nevertheless, the ability of an expert to extrapolate beyond their region of knowledge may be poor. Hence there is significant merit in obtaining information from local experts when compiling species' distribution models across several regions.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Numerous expert elicitation methods have been suggested for generalised linear models (GLMs). This paper compares three relatively new approaches to eliciting expert knowledge in a form suitable for Bayesian logistic regression. These methods were trialled on two experts in order to model the habitat suitability of the threatened Australian brush-tailed rock-wallaby (Petrogale penicillata). The first elicitation approach is a geographically assisted indirect predictive method with a geographic information system (GIS) interface. The second approach is a predictive indirect method which uses an interactive graphical tool. The third method uses a questionnaire to elicit expert knowledge directly about the impact of a habitat variable on the response. Two variables (slope and aspect) are used to examine prior and posterior distributions of the three methods. The results indicate that there are some similarities and dissimilarities between the expert informed priors of the two experts formulated from the different approaches. The choice of elicitation method depends on the statistical knowledge of the expert, their mapping skills, time constraints, accessibility to experts and funding available. This trial reveals that expert knowledge can be important when modelling rare event data, such as threatened species, because experts can provide additional information that may not be represented in the dataset. However care must be taken with the way in which this information is elicited and formulated.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The main objective of this PhD was to further develop Bayesian spatio-temporal models (specifically the Conditional Autoregressive (CAR) class of models), for the analysis of sparse disease outcomes such as birth defects. The motivation for the thesis arose from problems encountered when analyzing a large birth defect registry in New South Wales. The specific components and related research objectives of the thesis were developed from gaps in the literature on current formulations of the CAR model, and health service planning requirements. Data from a large probabilistically-linked database from 1990 to 2004, consisting of fields from two separate registries: the Birth Defect Registry (BDR) and Midwives Data Collection (MDC) were used in the analyses in this thesis. The main objective was split into smaller goals. The first goal was to determine how the specification of the neighbourhood weight matrix will affect the smoothing properties of the CAR model, and this is the focus of chapter 6. Secondly, I hoped to evaluate the usefulness of incorporating a zero-inflated Poisson (ZIP) component as well as a shared-component model in terms of modeling a sparse outcome, and this is carried out in chapter 7. The third goal was to identify optimal sampling and sample size schemes designed to select individual level data for a hybrid ecological spatial model, and this is done in chapter 8. Finally, I wanted to put together the earlier improvements to the CAR model, and along with demographic projections, provide forecasts for birth defects at the SLA level. Chapter 9 describes how this is done. For the first objective, I examined a series of neighbourhood weight matrices, and showed how smoothing the relative risk estimates according to similarity by an important covariate (i.e. maternal age) helped improve the model’s ability to recover the underlying risk, as compared to the traditional adjacency (specifically the Queen) method of applying weights. Next, to address the sparseness and excess zeros commonly encountered in the analysis of rare outcomes such as birth defects, I compared a few models, including an extension of the usual Poisson model to encompass excess zeros in the data. This was achieved via a mixture model, which also encompassed the shared component model to improve on the estimation of sparse counts through borrowing strength across a shared component (e.g. latent risk factor/s) with the referent outcome (caesarean section was used in this example). Using the Deviance Information Criteria (DIC), I showed how the proposed model performed better than the usual models, but only when both outcomes shared a strong spatial correlation. The next objective involved identifying the optimal sampling and sample size strategy for incorporating individual-level data with areal covariates in a hybrid study design. I performed extensive simulation studies, evaluating thirteen different sampling schemes along with variations in sample size. This was done in the context of an ecological regression model that incorporated spatial correlation in the outcomes, as well as accommodating both individual and areal measures of covariates. Using the Average Mean Squared Error (AMSE), I showed how a simple random sample of 20% of the SLAs, followed by selecting all cases in the SLAs chosen, along with an equal number of controls, provided the lowest AMSE. The final objective involved combining the improved spatio-temporal CAR model with population (i.e. women) forecasts, to provide 30-year annual estimates of birth defects at the Statistical Local Area (SLA) level in New South Wales, Australia. The projections were illustrated using sixteen different SLAs, representing the various areal measures of socio-economic status and remoteness. A sensitivity analysis of the assumptions used in the projection was also undertaken. By the end of the thesis, I will show how challenges in the spatial analysis of rare diseases such as birth defects can be addressed, by specifically formulating the neighbourhood weight matrix to smooth according to a key covariate (i.e. maternal age), incorporating a ZIP component to model excess zeros in outcomes and borrowing strength from a referent outcome (i.e. caesarean counts). An efficient strategy to sample individual-level data and sample size considerations for rare disease will also be presented. Finally, projections in birth defect categories at the SLA level will be made.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers is suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy. This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (N Cs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter Autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extensions allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.