166 resultados para Population Monte Carlo


Relevância:

100.00% 100.00%

Publicador:

Resumo:

A new transdimensional Sequential Monte Carlo (SMC) algorithm called SM- CVB is proposed. In an SMC approach, a weighted sample of particles is generated from a sequence of probability distributions which ‘converge’ to the target distribution of interest, in this case a Bayesian posterior distri- bution. The approach is based on the use of variational Bayes to propose new particles at each iteration of the SMCVB algorithm in order to target the posterior more efficiently. The variational-Bayes-generated proposals are not limited to a fixed dimension. This means that the weighted particle sets that arise can have varying dimensions thereby allowing us the option to also estimate an appropriate dimension for the model. This novel algorithm is outlined within the context of finite mixture model estimation. This pro- vides a less computationally demanding alternative to using reversible jump Markov chain Monte Carlo kernels within an SMC approach. We illustrate these ideas in a simulated data analysis and in applications.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Monte-Carlo Tree Search (MCTS) is a heuristic to search in large trees. We apply it to argumentative puzzles where MCTS pursues the best argumentation with respect to a set of arguments to be argued. To make our ideas as widely applicable as possible, we integrate MCTS to an abstract setting for argumentation where the content of arguments is left unspecified. Experimental results show the pertinence of this integration for learning argumentations by comparing it with a basic reinforcement learning.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

When a puzzle game is created, its design parameters must be chosen to allow solvable and interesting challenges to be created for the player. We investigate the use of random sampling as a computationally inexpensive means of automated game analysis, to evaluate the BoxOff family of puzzle games. This analysis reveals useful insights into the game, such as the surprising fact that almost 100% of randomly generated challenges have a solution, but less than 10% will be solved using strictly random play, validating the inventor’s design choices. We show the 1D game to be trivial and the 3D game to be viable.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the rapid development of various technologies and applications in smart grid implementation, demand response has attracted growing research interests because of its potentials in enhancing power grid reliability with reduced system operation costs. This paper presents a new demand response model with elastic economic dispatch in a locational marginal pricing market. It models system economic dispatch as a feedback control process, and introduces a flexible and adjustable load cost as a controlled signal to adjust demand response. Compared with the conventional “one time use” static load dispatch model, this dynamic feedback demand response model may adjust the load to a desired level in a finite number of time steps and a proof of convergence is provided. In addition, Monte Carlo simulation and boundary calculation using interval mathematics are applied for describing uncertainty of end-user's response to an independent system operator's expected dispatch. A numerical analysis based on the modified Pennsylvania-Jersey-Maryland power pool five-bus system is introduced for simulation and the results verify the effectiveness of the proposed model. System operators may use the proposed model to obtain insights in demand response processes for their decision-making regarding system load levels and operation conditions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis addresses computational challenges arising from Bayesian analysis of complex real-world problems. Many of the models and algorithms designed for such analysis are ‘hybrid’ in nature, in that they are a composition of components for which their individual properties may be easily described but the performance of the model or algorithm as a whole is less well understood. The aim of this research project is to after a better understanding of the performance of hybrid models and algorithms. The goal of this thesis is to analyse the computational aspects of hybrid models and hybrid algorithms in the Bayesian context. The first objective of the research focuses on computational aspects of hybrid models, notably a continuous finite mixture of t-distributions. In the mixture model, an inference of interest is the number of components, as this may relate to both the quality of model fit to data and the computational workload. The analysis of t-mixtures using Markov chain Monte Carlo (MCMC) is described and the model is compared to the Normal case based on the goodness of fit. Through simulation studies, it is demonstrated that the t-mixture model can be more flexible and more parsimonious in terms of number of components, particularly for skewed and heavytailed data. The study also reveals important computational issues associated with the use of t-mixtures, which have not been adequately considered in the literature. The second objective of the research focuses on computational aspects of hybrid algorithms for Bayesian analysis. Two approaches will be considered: a formal comparison of the performance of a range of hybrid algorithms and a theoretical investigation of the performance of one of these algorithms in high dimensions. For the first approach, the delayed rejection algorithm, the pinball sampler, the Metropolis adjusted Langevin algorithm, and the hybrid version of the population Monte Carlo (PMC) algorithm are selected as a set of examples of hybrid algorithms. Statistical literature shows how statistical efficiency is often the only criteria for an efficient algorithm. In this thesis the algorithms are also considered and compared from a more practical perspective. This extends to the study of how individual algorithms contribute to the overall efficiency of hybrid algorithms, and highlights weaknesses that may be introduced by the combination process of these components in a single algorithm. The second approach to considering computational aspects of hybrid algorithms involves an investigation of the performance of the PMC in high dimensions. It is well known that as a model becomes more complex, computation may become increasingly difficult in real time. In particular the importance sampling based algorithms, including the PMC, are known to be unstable in high dimensions. This thesis examines the PMC algorithm in a simplified setting, a single step of the general sampling, and explores a fundamental problem that occurs in applying importance sampling to a high-dimensional problem. The precision of the computed estimate from the simplified setting is measured by the asymptotic variance of the estimate under conditions on the importance function. Additionally, the exponential growth of the asymptotic variance with the dimension is demonstrated and we illustrates that the optimal covariance matrix for the importance function can be estimated in a special case.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present a Bayesian sampling algorithm called adaptive importance sampling or population Monte Carlo (PMC), whose computational workload is easily parallelizable and thus has the potential to considerably reduce the wall-clock time required for sampling, along with providing other benefits. To assess the performance of the approach for cosmological problems, we use simulated and actual data consisting of CMB anisotropies, supernovae of type Ia, and weak cosmological lensing, and provide a comparison of results to those obtained using state-of-the-art Markov chain Monte Carlo (MCMC). For both types of data sets, we find comparable parameter estimates for PMC and MCMC, with the advantage of a significantly lower wall-clock time for PMC. In the case of WMAP5 data, for example, the wall-clock time scale reduces from days for MCMC to hours using PMC on a cluster of processors. Other benefits of the PMC approach, along with potential difficulties in using the approach, are analyzed and discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We estimate the parameters of a stochastic process model for a macroparasite population within a host using approximate Bayesian computation (ABC). The immunity of the host is an unobserved model variable and only mature macroparasites at sacrifice of the host are counted. With very limited data, process rates are inferred reasonably precisely. Modeling involves a three variable Markov process for which the observed data likelihood is computationally intractable. ABC methods are particularly useful when the likelihood is analytically or computationally intractable. The ABC algorithm we present is based on sequential Monte Carlo, is adaptive in nature, and overcomes some drawbacks of previous approaches to ABC. The algorithm is validated on a test example involving simulated data from an autologistic model before being used to infer parameters of the Markov process model for experimental data. The fitted model explains the observed extra-binomial variation in terms of a zero-one immunity variable, which has a short-lived presence in the host.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Exposure to ultrafine particles (diameter less than 100 nm) is an important topic in epidemiological and toxicological studies. This study used the average particle number size distribution data obtained from our measurement survey in major micro-environments, together with the people activity pattern data obtained from the Italian Human Activity Pattern Survey to estimate the tracheobronchial and alveolar dose of submicrometer particles for different population age groups in Italy. We developed a numerical methodology based on Monte Carlo method, in order to estimate the best combination from a probabilistic point of view. More than 106 different cases were analyzed according to a purpose built sub-routine and our results showed that the daily alveolar particle number and surface area deposited for all of the age groups considered was equal to 1.5 x 1011 particles and 2.5 x 1015 m2, respectively, varying slightly for males and females living in Northern or Southern Italy. In terms of tracheobronchial deposition, the corresponding values for daily particle number and surface area for all age groups was equal to 6.5 x 1010 particles and 9.9 x 1014 m2, respectively. Overall, the highest contributions were found to come from indoor cooking (female), working time (male) and transportation (i.e. traffic derived particles) (children).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Multiple Sclerosis (MS) is a chronic neurological disease characterized by demyelination associated with infiltrating white blood cells in the central nervous system (CNS). Nitric oxide synthases (NOS) are a family of enzymes that control the production of nitric oxide. It is possible that neuronal NOS could be involved in MS pathophysiology and hence the nNOS gene is a potential candidate for involvement in disease susceptibility. The aim of this study was to determine whether allelic variation at the nNOS gene locus is associated with MS in an Australian cohort. DNA samples obtained from a Caucasian Australian population affected with MS and an unaffected control population, matched for gender, age and ethnicity, were genotyped for a microsatellite polymorphism in the promoter region of the nNOS gene. Allele frequencies were compared using chi-squared based statistical analyses with significance tested by Monte Carlo simulation. Allelic analysis of MS cases and controls produced a chi-squared value of 5.63 with simulated P = 0.96 (OR(max) = 1.41, 95% CI: 0.926-2.15). Similarly, a Mann-Whitney U analysis gave a non-significant P-value of 0.377 for allele distribution. No differences in allele frequencies were observed for gender or clinical course subtype (P > 0.05). Statistical analysis indicated that there is no association of this nNOS variant and MS and hence the gene does not appear to play a genetically significant role in disease susceptibility.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Airborne particles, including both ultrafine and supermicrometric particles, contain various carcinogens. Exposure and risk-assessment studies regularly use particle mass concentration as dosimetry parameter, therefore neglecting the potential impact of ultrafine particles due to their negligible mass compared to supermicrometric particles. The main purpose of this study was the characterization of lung cancer risk due to exposure to polycyclic aromatic hydrocarbons and some heavy metals associated with particle inhalation by Italian non-smoking people. A risk-assessment scheme, modified from an existing risk model, was applied to estimate the cancer risk contribution from both ultrafine and supermicrometric particles. Exposure assessment was carried out on the basis of particle number distributions measured in 25 smoke-free microenvironments in Italy. The predicted lung cancer risk was then compared to the cancer incidence rate in Italy to assess the number of lung cancer cases attributed to airborne particle inhalation, which represents one of the main causes of lung cancer, apart from smoking. Ultrafine particles are associated with a much higher risk than supermicrometric particles, and the modified risk-assessment scheme provided a more accurate estimate than the conventional scheme. Great attention has to be paid to indoor microenvironments and, in particular, to cooking and eating times, which represent the major contributors to lung cancer incidence in the Italian population. The modified risk assessment scheme can serve as a tool for assessing environmental quality, as well as setting up exposure standards for particulate matter.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers is suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy. This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (N Cs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter Autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extensions allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Methicillin-resistant Staphylococcus Aureus (MRSA) is a pathogen that continues to be of major concern in hospitals. We develop models and computational schemes based on observed weekly incidence data to estimate MRSA transmission parameters. We extend the deterministic model of McBryde, Pettitt, and McElwain (2007, Journal of Theoretical Biology 245, 470–481) involving an underlying population of MRSA colonized patients and health-care workers that describes, among other processes, transmission between uncolonized patients and colonized health-care workers and vice versa. We develop new bivariate and trivariate Markov models to include incidence so that estimated transmission rates can be based directly on new colonizations rather than indirectly on prevalence. Imperfect sensitivity of pathogen detection is modeled using a hidden Markov process. The advantages of our approach include (i) a discrete valued assumption for the number of colonized health-care workers, (ii) two transmission parameters can be incorporated into the likelihood, (iii) the likelihood depends on the number of new cases to improve precision of inference, (iv) individual patient records are not required, and (v) the possibility of imperfect detection of colonization is incorporated. We compare our approach with that used by McBryde et al. (2007) based on an approximation that eliminates the health-care workers from the model, uses Markov chain Monte Carlo and individual patient data. We apply these models to MRSA colonization data collected in a small intensive care unit at the Princess Alexandra Hospital, Brisbane, Australia.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We present a novel approach for developing summary statistics for use in approximate Bayesian computation (ABC) algorithms by using indirect inference. ABC methods are useful for posterior inference in the presence of an intractable likelihood function. In the indirect inference approach to ABC the parameters of an auxiliary model fitted to the data become the summary statistics. Although applicable to any ABC technique, we embed this approach within a sequential Monte Carlo algorithm that is completely adaptive and requires very little tuning. This methodological development was motivated by an application involving data on macroparasite population evolution modelled by a trivariate stochastic process for which there is no tractable likelihood function. The auxiliary model here is based on a beta–binomial distribution. The main objective of the analysis is to determine which parameters of the stochastic model are estimable from the observed data on mature parasite worms.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Genetic research of complex diseases is a challenging, but exciting, area of research. The early development of the research was limited, however, until the completion of the Human Genome and HapMap projects, along with the reduction in the cost of genotyping, which paves the way for understanding the genetic composition of complex diseases. In this thesis, we focus on the statistical methods for two aspects of genetic research: phenotype definition for diseases with complex etiology and methods for identifying potentially associated Single Nucleotide Polymorphisms (SNPs) and SNP-SNP interactions. With regard to phenotype definition for diseases with complex etiology, we firstly investigated the effects of different statistical phenotyping approaches on the subsequent analysis. In light of the findings, and the difficulties in validating the estimated phenotype, we proposed two different methods for reconciling phenotypes of different models using Bayesian model averaging as a coherent mechanism for accounting for model uncertainty. In the second part of the thesis, the focus is turned to the methods for identifying associated SNPs and SNP interactions. We review the use of Bayesian logistic regression with variable selection for SNP identification and extended the model for detecting the interaction effects for population based case-control studies. In this part of study, we also develop a machine learning algorithm to cope with the large scale data analysis, namely modified Logic Regression with Genetic Program (MLR-GEP), which is then compared with the Bayesian model, Random Forests and other variants of logic regression.