289 resultados para Approximate Bayesian computation


Relevância:

20.00% 20.00%

Publicador:

Resumo:

So far, most Phase II trials have been designed and analysed under a frequentist framework. Under this framework, a trial is designed so that the overall Type I and Type II errors of the trial are controlled at some desired levels. Recently, a number of articles have advocated the use of Bavesian designs in practice. Under a Bayesian framework, a trial is designed so that the trial stops when the posterior probability of treatment is within certain prespecified thresholds. In this article, we argue that trials under a Bayesian framework can also be designed to control frequentist error rates. We introduce a Bayesian version of Simon's well-known two-stage design to achieve this goal. We also consider two other errors, which are called Bayesian errors in this article because of their similarities to posterior probabilities. We show that our method can also control these Bayesian-type errors. We compare our method with other recent Bayesian designs in a numerical study and discuss implications of different designs on error rates. An example of a clinical trial for patients with nasopharyngeal carcinoma is used to illustrate differences of the different designs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Stallard (1998, Biometrics 54, 279-294) recently used Bayesian decision theory for sample-size determination in phase II trials. His design maximizes the expected financial gains in the development of a new treatment. However, it results in a very high probability (0.65) of recommending an ineffective treatment for phase III testing. On the other hand, the expected gain using his design is more than 10 times that of a design that tightly controls the false positive error (Thall and Simon, 1994, Biometrics 50, 337-349). Stallard's design maximizes the expected gain per phase II trial, but it does not maximize the rate of gain or total gain for a fixed length of time because the rate of gain depends on the proportion: of treatments forwarding to the phase III study. We suggest maximizing the rate of gain, and the resulting optimal one-stage design becomes twice as efficient as Stallard's one-stage design. Furthermore, the new design has a probability of only 0.12 of passing an ineffective treatment to phase III study.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

description and analysis of geographically indexed health data with respect to demographic, environmental, behavioural, socioeconomic, genetic, and infectious risk factors (Elliott andWartenberg 2004). Disease maps can be useful for estimating relative risk; ecological analyses, incorporating area and/or individual-level covariates; or cluster analyses (Lawson 2009). As aggregated data are often more readily available, one common method of mapping disease is to aggregate the counts of disease at some geographical areal level, and present them as choropleth maps (Devesa et al. 1999; Population Health Division 2006). Therefore, this chapter will focus exclusively on methods appropriate for areal data...

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes solutions to three issues pertaining to the estimation of finite mixture models with an unknown number of components: the non-identifiability induced by overfitting the number of components, the mixing limitations of standard Markov Chain Monte Carlo (MCMC) sampling techniques, and the related label switching problem. An overfitting approach is used to estimate the number of components in a finite mixture model via a Zmix algorithm. Zmix provides a bridge between multidimensional samplers and test based estimation methods, whereby priors are chosen to encourage extra groups to have weights approaching zero. MCMC sampling is made possible by the implementation of prior parallel tempering, an extension of parallel tempering. Zmix can accurately estimate the number of components, posterior parameter estimates and allocation probabilities given a sufficiently large sample size. The results will reflect uncertainty in the final model and will report the range of possible candidate models and their respective estimated probabilities from a single run. Label switching is resolved with a computationally light-weight method, Zswitch, developed for overfitted mixtures by exploiting the intuitiveness of allocation-based relabelling algorithms and the precision of label-invariant loss functions. Four simulation studies are included to illustrate Zmix and Zswitch, as well as three case studies from the literature. All methods are available as part of the R package Zmix, which can currently be applied to univariate Gaussian mixture models.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Being able to accurately predict the risk of falling is crucial in patients with Parkinson’s dis- ease (PD). This is due to the unfavorable effect of falls, which can lower the quality of life as well as directly impact on survival. Three methods considered for predicting falls are decision trees (DT), Bayesian networks (BN), and support vector machines (SVM). Data on a 1-year prospective study conducted at IHBI, Australia, for 51 people with PD are used. Data processing are conducted using rpart and e1071 packages in R for DT and SVM, con- secutively; and Bayes Server 5.5 for the BN. The results show that BN and SVM produce consistently higher accuracy over the 12 months evaluation time points (average sensitivity and specificity > 92%) than DT (average sensitivity 88%, average specificity 72%). DT is prone to imbalanced data so needs to adjust for the misclassification cost. However, DT provides a straightforward, interpretable result and thus is appealing for helping to identify important items related to falls and to generate fallers’ profiles.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The quality of species distribution models (SDMs) relies to a large degree on the quality of the input data, from bioclimatic indices to environmental and habitat descriptors (Austin, 2002). Recent reviews of SDM techniques, have sought to optimize predictive performance e.g. Elith et al., 2006. In general SDMs employ one of three approaches to variable selection. The simplest approach relies on the expert to select the variables, as in environmental niche models Nix, 1986 or a generalized linear model without variable selection (Miller and Franklin, 2002). A second approach explicitly incorporates variable selection into model fitting, which allows examination of particular combinations of variables. Examples include generalized linear or additive models with variable selection (Hastie et al. 2002); or classification trees with complexity or model based pruning (Breiman et al., 1984, Zeileis, 2008). A third approach uses model averaging, to summarize the overall contribution of a variable, without considering particular combinations. Examples include neural networks, boosted or bagged regression trees and Maximum Entropy as compared in Elith et al. 2006. Typically, users of SDMs will either consider a small number of variable sets, via the first approach, or else supply all of the candidate variables (often numbering more than a hundred) to the second or third approaches. Bayesian SDMs exist, with several methods for eliciting and encoding priors on model parameters (see review in Low Choy et al. 2010). However few methods have been published for informative variable selection; one example is Bayesian trees (O’Leary 2008). Here we report an elicitation protocol that helps makes explicit a priori expert judgements on the quality of candidate variables. This protocol can be flexibly applied to any of the three approaches to variable selection, described above, Bayesian or otherwise. We demonstrate how this information can be obtained then used to guide variable selection in classical or machine learning SDMs, or to define priors within Bayesian SDMs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We carried out a discriminant analysis with identity by descent (IBD) at each marker as inputs, and the sib pair type (affected-affected versus affected-unaffected) as the output. Using simple logistic regression for this discriminant analysis, we illustrate the importance of comparing models with different number of parameters. Such model comparisons are best carried out using either the Akaike information criterion (AIC) or the Bayesian information criterion (BIC). When AIC (or BIC) stepwise variable selection was applied to the German Asthma data set, a group of markers were selected which provide the best fit to the data (assuming an additive effect). Interestingly, these 25-26 markers were not identical to those with the highest (in magnitude) single-locus lod scores.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background A pandemic strain of influenza A spread rapidly around the world in 2009, now referred to as pandemic (H1N1) 2009. This study aimed to examine the spatiotemporal variation in the transmission rate of pandemic (H1N1) 2009 associated with changes in local socio-environmental conditions from May 7–December 31, 2009, at a postal area level in Queensland, Australia. Method We used the data on laboratory-confirmed H1N1 cases to examine the spatiotemporal dynamics of transmission using a flexible Bayesian, space–time, Susceptible-Infected-Recovered (SIR) modelling approach. The model incorporated parameters describing spatiotemporal variation in H1N1 infection and local socio-environmental factors. Results The weekly transmission rate of pandemic (H1N1) 2009 was negatively associated with the weekly area-mean maximum temperature at a lag of 1 week (LMXT) (posterior mean: −0.341; 95% credible interval (CI): −0.370–−0.311) and the socio-economic index for area (SEIFA) (posterior mean: −0.003; 95% CI: −0.004–−0.001), and was positively associated with the product of LMXT and the weekly area-mean vapour pressure at a lag of 1 week (LVAP) (posterior mean: 0.008; 95% CI: 0.007–0.009). There was substantial spatiotemporal variation in transmission rate of pandemic (H1N1) 2009 across Queensland over the epidemic period. High random effects of estimated transmission rates were apparent in remote areas and some postal areas with higher proportion of indigenous populations and smaller overall populations. Conclusions Local SEIFA and local atmospheric conditions were associated with the transmission rate of pandemic (H1N1) 2009. The more populated regions displayed consistent and synchronized epidemics with low average transmission rates. The less populated regions had high average transmission rates with more variations during the H1N1 epidemic period.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we examine approaches to estimate a Bayesian mixture model at both single and multiple time points for a sample of actual and simulated aerosol particle size distribution (PSD) data. For estimation of a mixture model at a single time point, we use Reversible Jump Markov Chain Monte Carlo (RJMCMC) to estimate mixture model parameters including the number of components which is assumed to be unknown. We compare the results of this approach to a commonly used estimation method in the aerosol physics literature. As PSD data is often measured over time, often at small time intervals, we also examine the use of an informative prior for estimation of the mixture parameters which takes into account the correlated nature of the parameters. The Bayesian mixture model offers a promising approach, providing advantages both in estimation and inference.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We use Bayesian model selection techniques to test extensions of the standard flat LambdaCDM paradigm. Dark-energy and curvature scenarios, and primordial perturbation models are considered. To that end, we calculate the Bayesian evidence in favour of each model using Population Monte Carlo (PMC), a new adaptive sampling technique which was recently applied in a cosmological context. The Bayesian evidence is immediately available from the PMC sample used for parameter estimation without further computational effort, and it comes with an associated error evaluation. Besides, it provides an unbiased estimator of the evidence after any fixed number of iterations and it is naturally parallelizable, in contrast with MCMC and nested sampling methods. By comparison with analytical predictions for simulated data, we show that our results obtained with PMC are reliable and robust. The variability in the evidence evaluation and the stability for various cases are estimated both from simulations and from data. For the cases we consider, the log-evidence is calculated with a precision of better than 0.08. Using a combined set of recent CMB, SNIa and BAO data, we find inconclusive evidence between flat LambdaCDM and simple dark-energy models. A curved Universe is moderately to strongly disfavoured with respect to a flat cosmology. Using physically well-motivated priors within the slow-roll approximation of inflation, we find a weak preference for a running spectral index. A Harrison-Zel'dovich spectrum is weakly disfavoured. With the current data, tensor modes are not detected; the large prior volume on the tensor-to-scalar ratio r results in moderate evidence in favour of r=0.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this note, we shortly survey some recent approaches on the approximation of the Bayes factor used in Bayesian hypothesis testing and in Bayesian model choice. In particular, we reassess importance sampling, harmonic mean sampling, and nested sampling from a unified perspective.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We review the literature on the combined association between lung cancer and two environmental exposures, asbestos exposure and smoking, and explore a Bayesian approach to assess evidence of interaction between the exposures. The meta-analysis combines separate indices of additive and multiplicative relationships and multivariate relative risk estimates. By making inferences on posterior probabilities we can explore both the form and strength of interaction. This analysis may be more informative than providing evidence to support one relation over another on the basis of statistical significance. Overall, we find evidence for a more than additive and less than multiplicative relation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The current state of the practice in Blackspot Identification (BSI) utilizes safety performance functions based on total crash counts to identify transport system sites with potentially high crash risk. This paper postulates that total crash count variation over a transport network is a result of multiple distinct crash generating processes including geometric characteristics of the road, spatial features of the surrounding environment, and driver behaviour factors. However, these multiple sources are ignored in current modelling methodologies in both trying to explain or predict crash frequencies across sites. Instead, current practice employs models that imply that a single underlying crash generating process exists. The model mis-specification may lead to correlating crashes with the incorrect sources of contributing factors (e.g. concluding a crash is predominately caused by a geometric feature when it is a behavioural issue), which may ultimately lead to inefficient use of public funds and misidentification of true blackspots. This study aims to propose a latent class model consistent with a multiple crash process theory, and to investigate the influence this model has on correctly identifying crash blackspots. We first present the theoretical and corresponding methodological approach in which a Bayesian Latent Class (BLC) model is estimated assuming that crashes arise from two distinct risk generating processes including engineering and unobserved spatial factors. The Bayesian model is used to incorporate prior information about the contribution of each underlying process to the total crash count. The methodology is applied to the state-controlled roads in Queensland, Australia and the results are compared to an Empirical Bayesian Negative Binomial (EB-NB) model. A comparison of goodness of fit measures illustrates significantly improved performance of the proposed model compared to the NB model. The detection of blackspots was also improved when compared to the EB-NB model. In addition, modelling crashes as the result of two fundamentally separate underlying processes reveals more detailed information about unobserved crash causes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Disease maps are effective tools for explaining and predicting patterns of disease outcomes across geographical space, identifying areas of potentially elevated risk, and formulating and validating aetiological hypotheses for a disease. Bayesian models have become a standard approach to disease mapping in recent decades. This article aims to provide a basic understanding of the key concepts involved in Bayesian disease mapping methods for areal data. It is anticipated that this will help in interpretation of published maps, and provide a useful starting point for anyone interested in running disease mapping methods for areal data. The article provides detailed motivation and descriptions on disease mapping methods by explaining the concepts, defining the technical terms, and illustrating the utility of disease mapping for epidemiological research by demonstrating various ways of visualising model outputs using a case study. The target audience includes spatial scientists in health and other fields, policy or decision makers, health geographers, spatial analysts, public health professionals, and epidemiologists.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this work we numerically model isothermal turbulent swirling flow in a cylindrical burner. Three versions of the RNG k-epsilon model are assessed against performance of the standard k-epsilon model. Sensitivity of numerical predictions to grid refinement, differing convective differencing schemes and choice of (unknown) inlet dissipation rate, were closely scrutinised to ensure accuracy. Particular attention is paid to modelling the inlet conditions to within the range of uncertainty of the experimental data, as model predictions proved to be significantly sensitive to relatively small changes in upstream flow conditions. We also examine the characteristics of the swirl--induced recirculation zone predicted by the models over an extended range of inlet conditions. Our main findings are: - (i) the standard k-epsilon model performed best compared with experiment; - (ii) no one inlet specification can simultaneously optimize the performance of the models considered; - (iii) the RNG models predict both single-cell and double-cell IRZ characteristics, the latter both with and without additional internal stagnation points. The first finding indicates that the examined RNG modifications to the standard k-e model do not result in an improved eddy viscosity based model for the prediction of swirl flows. The second finding suggests that tuning established models for optimal performance in swirl flows a priori is not straightforward. The third finding indicates that the RNG based models exhibit a greater variety of structural behaviour, despite being of the same level of complexity as the standard k-e model. The plausibility of the predicted IRZ features are discussed in terms of known vortex breakdown phenomena.