944 results for Markov chain Monte Carlo method


Relevance: 100.00%

Abstract:

We investigate whether relative contributions of genetic and shared environmental factors are associated with an increased risk in melanoma. Data from the Queensland Familial Melanoma Project comprising 15,907 subjects arising from 1,912 families were analyzed to estimate the additive genetic, common and unique environmental contributions to variation in the age at onset of melanoma. Two complementary approaches for analyzing correlated time-to-onset family data were considered: the generalized estimating equations (GEE) method, in which one can estimate relationship-specific dependence simultaneously with regression coefficients that describe the average population response to changing covariates; and a subject-specific Bayesian mixed model, in which heterogeneity in regression parameters is explicitly modeled and the different components of variation may be estimated directly. The proportional hazards and Weibull models were utilized, as both produce natural frameworks for estimating relative risks while adjusting for simultaneous effects of other covariates. A simple Markov chain Monte Carlo method for covariate imputation of missing data was used, and the actual implementation of the Bayesian model was based on Gibbs sampling using the freeware package BUGS. In addition, we used a Bayesian model to investigate the relative contribution of genetic and environmental effects on the expression of naevi and freckles, which are known risk factors for melanoma.

Relevance: 100.00%

Abstract:

We evaluate the performance of several specification tests for Markov regime-switching time-series models. We consider the Lagrange multiplier (LM) and dynamic specification tests of Hamilton (1996) and Ljung–Box tests based on both the generalized residual and a standard-normal residual constructed using the Rosenblatt transformation. The size and power of the tests are studied using Monte Carlo experiments. We find that the LM tests have the best size and power properties. The Ljung–Box tests exhibit slight size distortions, though tests based on the Rosenblatt transformation perform better than the generalized residual-based tests. The tests exhibit impressive power to detect both autocorrelation and autoregressive conditional heteroscedasticity (ARCH). The tests are illustrated with a Markov-switching generalized ARCH (GARCH) model fitted to the US dollar–British pound exchange rate, with the finding that both autocorrelation and GARCH effects are needed to adequately fit the data.
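As a rough illustration of the Ljung–Box statistic mentioned above, the following sketch computes Q for any residual series using NumPy only. It is not the Hamilton (1996) tests and does not construct generalized or Rosenblatt residuals; the function name `ljung_box_q` and the simulated AR(1) series are assumptions made for the example.

```python
import numpy as np

def ljung_box_q(residuals, max_lag):
    """Return the Ljung-Box Q statistic over lags 1..max_lag."""
    x = np.asarray(residuals, dtype=float)
    n = x.size
    x = x - x.mean()
    denom = np.sum(x ** 2)
    q = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        rho_k = np.sum(x[k:] * x[:-k]) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q

rng = np.random.default_rng(0)
white = rng.standard_normal(500)        # residuals with no serial correlation

# AR(1) series with coefficient 0.5 (hypothetical), built from the same shocks.
ar1 = np.empty_like(white)
ar1[0] = white[0]
for t in range(1, white.size):
    ar1[t] = 0.5 * ar1[t - 1] + white[t]

q_white = ljung_box_q(white, max_lag=10)
q_ar1 = ljung_box_q(ar1, max_lag=10)
```

Under the null of no autocorrelation, Q is compared with a chi-square distribution whose degrees of freedom equal the number of lags (reduced when the residuals come from a fitted model), so the autocorrelated series should yield a much larger value than the white-noise series.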

Relevance: 100.00%

Abstract:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers are suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy.
This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (NCs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extension allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time.
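The path sampling idea behind estimating a log ratio of normalization constants can be written out as follows. This is a sketch of the general identity for an exponential-family MRF, not necessarily the exact IMCS estimator of the thesis; the symbols s(x), Z(θ) and the path θ(t) are notation assumed here.

```latex
% Exponential-family form of a binary MRF with canonical statistic s(x)
p(x \mid \theta) = \frac{\exp\{\theta^{\top} s(x)\}}{Z(\theta)}, \qquad
\nabla_{\theta} \log Z(\theta) = \mathbb{E}_{\theta}[\, s(x) \,]

% Integrate the mean canonical statistic along a path with
% \theta(0) = \theta_0 and \theta(1) = \theta_1
\log \frac{Z(\theta_1)}{Z(\theta_0)}
  = \int_0^1 \mathbb{E}_{\theta(t)}[\, s(x) \,]^{\top} \, \frac{d\theta(t)}{dt} \, dt
```

In practice the expectation at each grid point t would be approximated by averaging s(x) over MCMC draws from p(x | θ(t)), and the integral by numerical quadrature over the grid.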

Relevance: 100.00%

Abstract:

Statistical modeling of traffic crashes has been of interest to researchers for decades. Over the most recent decade many crash models have accounted for extra-variation in crash counts, that is, variation over and above that accounted for by the Poisson density. The extra-variation (or dispersion) is theorized to capture unaccounted for variation in crashes across sites. The majority of studies have assumed fixed dispersion parameters in over-dispersed crash models, which is tantamount to assuming that unaccounted for variation is proportional to the expected crash count. Miaou and Lord [Miaou, S.P., Lord, D., 2003. Modeling traffic crash-flow relationships for intersections: dispersion parameter, functional form, and Bayes versus empirical Bayes methods. Transport. Res. Rec. 1840, 31–40] challenged the fixed dispersion parameter assumption, and examined various dispersion parameter relationships when modeling urban signalized intersection accidents in Toronto. They suggested that further work is needed to determine the appropriateness of the findings for rural as well as other intersection types, to corroborate their findings, and to explore alternative dispersion functions. This study builds upon the work of Miaou and Lord by exploring additional dispersion functions, using an independent data set, and providing an opportunity to corroborate their findings. Data from Georgia are used in this study. A Bayesian modeling approach with non-informative priors is adopted, using sampling-based estimation via Markov chain Monte Carlo (MCMC) and the Gibbs sampler. A total of eight model specifications were developed; four of them employed traffic flows as explanatory factors in the mean structure, while the remainder included geometric factors in addition to major and minor road traffic flows. The models were compared and contrasted using the significance of coefficients, standard deviance, chi-square goodness-of-fit, and deviance information criteria (DIC) statistics.
The findings indicate that the modeling of the dispersion parameter, which essentially explains the extra-variance structure, depends greatly on how the mean structure is modeled. In the presence of a well-defined mean function, the extra-variance structure generally becomes insignificant, i.e. the variance structure is a simple function of the mean. It appears that extra-variation is a function of covariates when the mean structure (expected crash count) is poorly specified and suffers from omitted variables. In contrast, when sufficient explanatory variables are used to model the mean (expected crash count), extra-Poisson variation is not significantly related to these variables. If these results are generalizable, they suggest that model specification may be improved by testing extra-variation functions for significance. They also suggest that known influences of expected crash counts are likely to be different from factors that might help to explain unaccounted for variation in crashes across sites.

Relevance: 100.00%

Abstract:

Methicillin-resistant Staphylococcus aureus (MRSA) is a pathogen that continues to be of major concern in hospitals. We develop models and computational schemes based on observed weekly incidence data to estimate MRSA transmission parameters. We extend the deterministic model of McBryde, Pettitt, and McElwain (2007, Journal of Theoretical Biology 245, 470–481) involving an underlying population of MRSA colonized patients and health-care workers that describes, among other processes, transmission between uncolonized patients and colonized health-care workers and vice versa. We develop new bivariate and trivariate Markov models to include incidence so that estimated transmission rates can be based directly on new colonizations rather than indirectly on prevalence. Imperfect sensitivity of pathogen detection is modeled using a hidden Markov process. The advantages of our approach include (i) a discrete valued assumption for the number of colonized health-care workers, (ii) two transmission parameters can be incorporated into the likelihood, (iii) the likelihood depends on the number of new cases to improve precision of inference, (iv) individual patient records are not required, and (v) the possibility of imperfect detection of colonization is incorporated. We compare our approach with that used by McBryde et al. (2007), which is based on an approximation that eliminates the health-care workers from the model and uses Markov chain Monte Carlo and individual patient data. We apply these models to MRSA colonization data collected in a small intensive care unit at the Princess Alexandra Hospital, Brisbane, Australia.

Relevance: 100.00%

Abstract:

Modern statistical models and computational methods can now incorporate uncertainty of the parameters used in Quantitative Microbial Risk Assessments (QMRA). Many QMRAs use Monte Carlo methods, but work from fixed estimates for means, variances and other parameters. We illustrate the ease of estimating all parameters contemporaneously with the risk assessment, incorporating all the parameter uncertainty arising from the experiments from which these parameters are estimated. A Bayesian approach is adopted, using Markov Chain Monte Carlo Gibbs sampling (MCMC) via the freely available software, WinBUGS. The method and its ease of implementation are illustrated by a case study that involves incorporating three disparate datasets into an MCMC framework. The probabilities of infection when the uncertainty associated with parameter estimation is incorporated into a QMRA are shown to be considerably more variable over various dose ranges than the analogous probabilities obtained when constants from the literature are simply ‘plugged’ in as is done in most QMRAs. Neglecting these sources of uncertainty may lead to erroneous decisions for public health and risk management.

Relevance: 100.00%

Abstract:

Due to the limitations of current condition monitoring technologies, estimates of asset health states may contain some uncertainty. A maintenance strategy that ignores this uncertainty can cause additional costs or downtime. The partially observable Markov decision process (POMDP) is a commonly used approach to derive optimal maintenance strategies when asset health inspections are imperfect. However, existing applications of the POMDP to maintenance decision-making largely adopt discrete time and discrete state assumptions. The discrete-time assumption requires that health state transitions and maintenance activities happen only at discrete epochs, which cannot model the failure time accurately and is not cost-effective. The discrete health state assumption, on the other hand, may not be elaborate enough to improve the effectiveness of maintenance. To address these limitations, this paper proposes a continuous state partially observable semi-Markov decision process (POSMDP). An algorithm that combines the Monte Carlo-based density projection method and policy iteration is developed to solve the POSMDP. Different types of maintenance activities (i.e., inspections, replacement, and imperfect maintenance) are considered in this paper. The next maintenance action and the corresponding waiting durations are optimized jointly to minimize the long-run expected cost per unit time and availability. The results of simulation studies show that the proposed maintenance optimization approach is more cost-effective than maintenance strategies derived by two other approximate methods when regular inspection intervals are adopted. The simulation study also shows that the maintenance cost can be further reduced by developing maintenance strategies with state-dependent maintenance intervals using the POSMDP.
In addition, during the simulation studies the proposed POSMDP shows the ability to adopt a cost-effective strategy structure when multiple types of maintenance activities are involved.

Relevance: 100.00%

Abstract:

We consider the problem of how to efficiently and safely design dose finding studies. Both current and novel utility functions are explored using Bayesian adaptive design methodology for the estimation of a maximum tolerated dose (MTD). In particular, we explore widely adopted approaches such as the continual reassessment method and minimizing the variance of the estimate of an MTD. New utility functions are constructed in the Bayesian framework and are evaluated against current approaches. To reduce computing time, importance sampling is implemented to re-weight posterior samples thus avoiding the need to draw samples using Markov chain Monte Carlo techniques. Further, as such studies are generally first-in-man, the safety of patients is paramount. We therefore explore methods for the incorporation of safety considerations into utility functions to ensure that only safe and well-predicted doses are administered. The amalgamation of Bayesian methodology, adaptive design and compound utility functions is termed adaptive Bayesian compound design (ABCD). The performance of this amalgamation of methodology is investigated via the simulation of dose finding studies. The paper concludes with a discussion of results and extensions that could be included into our approach.
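The re-weighting step described above can be sketched as follows: a fixed set of parameter draws is kept, and each new patient outcome multiplies the importance weights by its likelihood, so no fresh MCMC run is needed. This is a minimal illustration under stated assumptions, not the authors' implementation: the power-model form of the continual reassessment method, the N(0, 1) prior, the `skeleton` values and the patient outcomes are all hypothetical.

```python
import numpy as np

def reweight(log_weights, theta, dose_prior_tox, toxic):
    """Update normalised log importance weights after one patient outcome."""
    # Power model (assumed): P(toxicity at dose) = skeleton value ** exp(theta).
    p_tox = dose_prior_tox ** np.exp(theta)
    log_lik = np.log(p_tox) if toxic else np.log1p(-p_tox)
    new = log_weights + log_lik
    # Subtract the log normalising constant so the weights sum to 1.
    return new - np.logaddexp.reduce(new)

rng = np.random.default_rng(1)
theta = rng.normal(0.0, 1.0, size=20_000)        # draws from the assumed prior
log_w = np.full(theta.size, -np.log(theta.size))  # equal starting weights

skeleton = np.array([0.05, 0.10, 0.20, 0.35])    # hypothetical prior toxicity guesses

# Hypothetical trial history: (dose index, toxicity observed?)
for dose_idx, toxic in [(0, False), (1, False), (2, True)]:
    log_w = reweight(log_w, theta, skeleton[dose_idx], toxic)

w = np.exp(log_w)
# Weighted posterior mean toxicity probability at each dose.
post_tox = np.array([np.sum(w * s ** np.exp(theta)) for s in skeleton])
# Effective sample size: a small value signals that resampling or MCMC is needed.
ess = 1.0 / np.sum(w ** 2)
```

Monitoring `ess` is one common way to decide when the weighted sample has degenerated too far to be trusted.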

Relevance: 100.00%

Abstract:

In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utility function considered for precise parameter estimation is based upon the precision of the ABC posterior distribution, which we form efficiently via the ABC rejection algorithm based on pre-computed model simulations. Our focus is on stochastic models and, in particular, we investigate the methodology for Markov process models of epidemics and macroparasite population evolution. The macroparasite example involves a multivariate process and we assess the loss of information from not observing all variables.
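The ABC rejection step on pre-computed simulations can be sketched as below. The example assumes a toy pure death process observed once (survivors are then binomial), a Uniform(0, 3) prior on the per-capita rate, and a hypothetical observation; none of these choices come from the paper, and the names `simulate` and `abc_rejection` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
N, t_obs = 50, 1.0          # assumed population size and observation time

def simulate(b, size=None):
    """Survivors at time t_obs in a pure death process with per-capita rate b."""
    return rng.binomial(N, np.exp(-b * t_obs), size=size)

# Pre-compute a bank of prior draws and matching simulations once,
# so that no further model simulation is needed per candidate data set.
b_bank = rng.uniform(0.0, 3.0, size=100_000)
y_bank = simulate(b_bank)

def abc_rejection(y_obs, tol):
    """Return the prior draws whose simulated data lie within tol of y_obs."""
    keep = np.abs(y_bank - y_obs) <= tol
    return b_bank[keep]

y_obs = 18                  # hypothetical observed number of survivors
post = abc_rejection(y_obs, tol=1)
abc_mean, abc_var = post.mean(), post.var()
```

A precision-type utility for a design could then be taken as, for example, `1.0 / abc_var`, evaluated for each candidate observation time by reusing a bank simulated at that time; this particular form is an assumption for illustration.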

Relevance: 100.00%

Abstract:

Advances in algorithms for approximate sampling from a multivariable target function have led to solutions to challenging statistical inference problems that would otherwise not be considered by the applied scientist. Such sampling algorithms are particularly relevant to Bayesian statistics, since the target function is the posterior distribution of the unobservables given the observables. In this thesis we develop, adapt and apply Bayesian algorithms, whilst addressing substantive applied problems in biology and medicine as well as other applications. For an increasing number of high-impact research problems, the primary models of interest are often sufficiently complex that the likelihood function is computationally intractable. Rather than discard these models in favour of inferior alternatives, a class of Bayesian "likelihood-free" techniques (often termed approximate Bayesian computation (ABC)) has emerged in the last few years, which avoids direct likelihood computation through repeated sampling of data from the model and comparing observed and simulated summary statistics. In Part I of this thesis we utilise sequential Monte Carlo (SMC) methodology to develop new algorithms for ABC that are more efficient in terms of the number of model simulations required and are almost black-box since very little algorithmic tuning is required. In addition, we address the issue of deriving appropriate summary statistics to use within ABC via a goodness-of-fit statistic and indirect inference. Another important problem in statistics is the design of experiments, that is, how one should select the values of the controllable variables in order to achieve some design goal. The presence of parameter and/or model uncertainty is a computational obstacle when designing experiments but can lead to inefficient designs if not accounted for correctly. The Bayesian framework accommodates such uncertainties in a coherent way.
If the amount of uncertainty is substantial, it can be of interest to perform adaptive designs in order to accrue information to make better decisions about future design points. This is of particular interest if the data can be collected sequentially. In a sense, the current posterior distribution becomes the new prior distribution for the next design decision. Part II of this thesis creates new algorithms for Bayesian sequential design to accommodate parameter and model uncertainty using SMC. The algorithms are substantially faster than previous approaches allowing the simulation properties of various design utilities to be investigated in a more timely manner. Furthermore the approach offers convenient estimation of Bayesian utilities and other quantities that are particularly relevant in the presence of model uncertainty. Finally, Part III of this thesis tackles a substantive medical problem. A neurological disorder known as motor neuron disease (MND) progressively causes motor neurons to no longer have the ability to innervate the muscle fibres, causing the muscles to eventually waste away. When this occurs the motor unit effectively ‘dies’. There is no cure for MND, and fatality often results from a lack of muscle strength to breathe. The prognosis for many forms of MND (particularly amyotrophic lateral sclerosis (ALS)) is particularly poor, with patients usually only surviving a small number of years after the initial onset of disease. Measuring the progress of diseases of the motor units, such as ALS, is a challenge for clinical neurologists. Motor unit number estimation (MUNE) is an attempt to directly assess underlying motor unit loss rather than indirect techniques such as muscle strength assessment, which generally is unable to detect progressions due to the body’s natural attempts at compensation. 
Part III of this thesis builds upon a previous Bayesian technique, which develops a sophisticated statistical model that takes into account physiological information about motor unit activation and various sources of uncertainties. More specifically, we develop a more reliable MUNE method by applying marginalisation over latent variables in order to improve the performance of a previously developed reversible jump Markov chain Monte Carlo sampler. We make other subtle changes to the model and algorithm to improve the robustness of the approach.

Relevance: 100.00%

Abstract:

The selection of optimal camera configurations (camera locations, orientations etc.) for multi-camera networks remains an unsolved problem. Previous approaches largely focus on proposing various objective functions to achieve different tasks. Most of them, however, do not generalize well to large scale networks. To tackle this, we introduce a statistical formulation of the optimal selection of camera configurations and propose a Trans-Dimensional Simulated Annealing (TDSA) algorithm to effectively solve the problem. We compare our approach with a state-of-the-art method based on Binary Integer Programming (BIP) and show that our approach offers similar performance on small scale problems. However, we also demonstrate the capability of our approach in dealing with large scale problems and show that our approach produces better results than two alternative heuristics designed to deal with the scalability issue of BIP.

Relevance: 100.00%

Abstract:

For clinical use, in electrocardiogram (ECG) signal analysis it is important to detect not only the centre of the P wave, the QRS complex and the T wave, but also the time intervals, such as the ST segment. Much research has focused entirely on QRS complex detection, via methods such as wavelet transforms, spline fitting and neural networks. However, drawbacks include the false classification of a severe noise spike as a QRS complex, possibly requiring manual editing, or the omission of information contained in other regions of the ECG signal. While some attempts have been made to develop algorithms to detect additional signal characteristics, such as P and T waves, the reported success rates vary from person to person and beat to beat. To address this variability we propose the use of Markov chain Monte Carlo statistical modelling to extract the key features of an ECG signal, and we report on a feasibility study to investigate the utility of the approach. The modelling approach is examined with reference to a realistic computer-generated ECG signal, where details such as wave morphology and noise levels are variable.

Relevance: 100.00%

Abstract:

We investigate the utility to computational Bayesian analyses of a particular family of recursive marginal likelihood estimators characterized by the (equivalent) algorithms known as "biased sampling" or "reverse logistic regression" in the statistics literature and "the density of states" in physics. Through a pair of numerical examples (including mixture modeling of the well-known galaxy dataset) we highlight the remarkable diversity of sampling schemes amenable to such recursive normalization, as well as the notable efficiency of the resulting pseudo-mixture distributions for gauging prior-sensitivity in the Bayesian model selection context. Our key theoretical contributions are to introduce a novel heuristic ("thermodynamic integration via importance sampling") for qualifying the role of the bridging sequence in this procedure, and to reveal various connections between these recursive estimators and the nested sampling technique.