270 resultados para Rejection-sampling Algorithm
em Queensland University of Technology - ePrints Archive
Resumo:
We present a Bayesian sampling algorithm called adaptive importance sampling or population Monte Carlo (PMC), whose computational workload is easily parallelizable and thus has the potential to considerably reduce the wall-clock time required for sampling, along with providing other benefits. To assess the performance of the approach for cosmological problems, we use simulated and actual data consisting of CMB anisotropies, supernovae of type Ia, and weak cosmological lensing, and provide a comparison of results to those obtained using state-of-the-art Markov chain Monte Carlo (MCMC). For both types of data sets, we find comparable parameter estimates for PMC and MCMC, with the advantage of a significantly lower wall-clock time for PMC. In the case of WMAP5 data, for example, the wall-clock time scale reduces from days for MCMC to hours using PMC on a cluster of processors. Other benefits of the PMC approach, along with potential difficulties in using the approach, are analyzed and discussed.
Resumo:
Accurately quantifying total greenhouse gas emissions (e.g. methane) from natural systems such as lakes, reservoirs and wetlands requires the spatial-temporal measurement of both diffusive and ebullitive (bubbling) emissions. Traditional, manual, measurement techniques provide only limited localised assessment of methane flux, often introducing significant errors when extrapolated to the whole-of-system. In this paper, we directly address these current sampling limitations and present a novel multiple robotic boat system configured to measure the spatiotemporal release of methane to atmosphere across inland waterways. The system, consisting of multiple networked Autonomous Surface Vehicles (ASVs) and capable of persistent operation, enables scientists to remotely evaluate the performance of sampling and modelling algorithms for real-world process quantification over extended periods of time. This paper provides an overview of the multi-robot sampling system including the vehicle and gas sampling unit design. Experimental results are shown demonstrating the system’s ability to autonomously navigate and implement an exploratory sampling algorithm to measure methane emissions on two inland reservoirs.
Resumo:
Standard Monte Carlo (sMC) simulation models have been widely used in AEC industry research to address system uncertainties. Although the benefits of probabilistic simulation analyses over deterministic methods are well documented, the sMC simulation technique is quite sensitive to the probability distributions of the input variables. This phenomenon becomes highly pronounced when the region of interest within the joint probability distribution (a function of the input variables) is small. In such cases, the standard Monte Carlo approach is often impractical from a computational standpoint. In this paper, a comparative analysis of standard Monte Carlo simulation to Markov Chain Monte Carlo with subset simulation (MCMC/ss) is presented. The MCMC/ss technique constitutes a more complex simulation method (relative to sMC), wherein a structured sampling algorithm is employed in place of completely randomized sampling. Consequently, gains in computational efficiency can be made. The two simulation methods are compared via theoretical case studies.
Resumo:
Water to air methane emissions from freshwater reservoirs can be dominated by sediment bubbling (ebullitive) events. Previous work to quantify methane bubbling from a number of Australian sub-tropical reservoirs has shown that this can contribute as much as 95% of total emissions. These bubbling events are controlled by a variety of different factors including water depth, surface and internal waves, wind seiching, atmospheric pressure changes and water levels changes. Key to quantifying the magnitude of this emission pathway is estimating both the bubbling rate as well as the areal extent of bubbling. Both bubbling rate and areal extent are seldom constant and require persistent monitoring over extended time periods before true estimates can be generated. In this paper we present a novel system for persistent monitoring of both bubbling rate and areal extent using multiple robotic surface chambers and adaptive sampling (grazing) algorithms to automate the quantification process. Individual chambers are self-propelled and guided and communicate between each other without the need for supervised control. They can maintain station at a sampling site for a desired incubation period and continuously monitor, record and report fluxes during the incubation. To exploit the methane sensor detection capabilities, the chamber can be automatically lowered to decrease the head-space and increase concentration. The grazing algorithms assign a hierarchical order to chambers within a preselected zone. Chambers then converge on the individual recording the highest 15 minute bubbling rate. Individuals maintain a specified distance apart from each other during each sampling period before all individuals are then required to move to different locations based on a sampling algorithm (systematic or adaptive) exploiting prior measurements. This system has been field tested on a large-scale subtropical reservoir, Little Nerang Dam, and over monthly timescales. Using this technique, localised bubbling zones on the water storage were found to produce over 50,000 mg m-2 d-1 and the areal extent ranged from 1.8 to 7% of the total reservoir area. The drivers behind these changes as well as lessons learnt from the system implementation are presented. This system exploits relatively cheap materials, sensing and computing and can be applied to a wide variety of aquatic and terrestrial systems.
Resumo:
In this paper it is demonstrated how the Bayesian parametric bootstrap can be adapted to models with intractable likelihoods. The approach is most appealing when the semi-automatic approximate Bayesian computation (ABC) summary statistics are selected. After a pilot run of ABC, the likelihood-free parametric bootstrap approach requires very few model simulations to produce an approximate posterior, which can be a useful approximation in its own right. An alternative is to use this approximation as a proposal distribution in ABC algorithms to make them more efficient. In this paper, the parametric bootstrap approximation is used to form the initial importance distribution for the sequential Monte Carlo and the ABC importance and rejection sampling algorithms. The new approach is illustrated through a simulation study of the univariate g-and- k quantile distribution, and is used to infer parameter values of a stochastic model describing expanding melanoma cell colonies.
Resumo:
The population Monte Carlo algorithm is an iterative importance sampling scheme for solving static problems. We examine the population Monte Carlo algorithm in a simplified setting, a single step of the general algorithm, and study a fundamental problem that occurs in applying importance sampling to high-dimensional problem. The precision of the computed estimate from the simplified setting is measured by the asymptotic variance of estimate under conditions on the importance function. We demonstrate the exponential growth of the asymptotic variance with the dimension and show that the optimal covariance matrix for the importance function can be estimated in special cases.
Resumo:
This paper presents advanced optimization techniques for Mission Path Planning (MPP) of a UAS fitted with a spore trap to detect and monitor spores and plant pathogens. The UAV MPP aims to optimise the mission path planning search and monitoring of spores and plant pathogens that may allow the agricultural sector to be more competitive and more reliable. The UAV will be fitted with an air sampling or spore trap to detect and monitor spores and plant pathogens in remote areas not accessible to current stationary monitor methods. The optimal paths are computed using a Multi-Objective Evolutionary Algorithms (MOEAs). Two types of multi-objective optimisers are compared; the MOEA Non-dominated Sorting Genetic Algorithms II (NSGA-II) and Hybrid Game are implemented to produce a set of optimal collision-free trajectories in three-dimensional environment. The trajectories on a three-dimension terrain, which are generated off-line, are collision-free and are represented by using Bézier spline curves from start position to target and then target to start position or different position with altitude constraints. The efficiency of the two optimization methods is compared in terms of computational cost and design quality. Numerical results show the benefits of coupling a Hybrid-Game strategy to a MOEA for MPP tasks. The reduction of numerical cost is an important point as the faster the algorithm converges the better the algorithms is for an off-line design and for future on-line decisions of the UAV.
Resumo:
This thesis addresses computational challenges arising from Bayesian analysis of complex real-world problems. Many of the models and algorithms designed for such analysis are ‘hybrid’ in nature, in that they are a composition of components for which their individual properties may be easily described but the performance of the model or algorithm as a whole is less well understood. The aim of this research project is to after a better understanding of the performance of hybrid models and algorithms. The goal of this thesis is to analyse the computational aspects of hybrid models and hybrid algorithms in the Bayesian context. The first objective of the research focuses on computational aspects of hybrid models, notably a continuous finite mixture of t-distributions. In the mixture model, an inference of interest is the number of components, as this may relate to both the quality of model fit to data and the computational workload. The analysis of t-mixtures using Markov chain Monte Carlo (MCMC) is described and the model is compared to the Normal case based on the goodness of fit. Through simulation studies, it is demonstrated that the t-mixture model can be more flexible and more parsimonious in terms of number of components, particularly for skewed and heavytailed data. The study also reveals important computational issues associated with the use of t-mixtures, which have not been adequately considered in the literature. The second objective of the research focuses on computational aspects of hybrid algorithms for Bayesian analysis. Two approaches will be considered: a formal comparison of the performance of a range of hybrid algorithms and a theoretical investigation of the performance of one of these algorithms in high dimensions. For the first approach, the delayed rejection algorithm, the pinball sampler, the Metropolis adjusted Langevin algorithm, and the hybrid version of the population Monte Carlo (PMC) algorithm are selected as a set of examples of hybrid algorithms. Statistical literature shows how statistical efficiency is often the only criteria for an efficient algorithm. In this thesis the algorithms are also considered and compared from a more practical perspective. This extends to the study of how individual algorithms contribute to the overall efficiency of hybrid algorithms, and highlights weaknesses that may be introduced by the combination process of these components in a single algorithm. The second approach to considering computational aspects of hybrid algorithms involves an investigation of the performance of the PMC in high dimensions. It is well known that as a model becomes more complex, computation may become increasingly difficult in real time. In particular the importance sampling based algorithms, including the PMC, are known to be unstable in high dimensions. This thesis examines the PMC algorithm in a simplified setting, a single step of the general sampling, and explores a fundamental problem that occurs in applying importance sampling to a high-dimensional problem. The precision of the computed estimate from the simplified setting is measured by the asymptotic variance of the estimate under conditions on the importance function. Additionally, the exponential growth of the asymptotic variance with the dimension is demonstrated and we illustrates that the optimal covariance matrix for the importance function can be estimated in a special case.
Resumo:
Ocean processes are dynamic, complex, and occur on multiple spatial and temporal scales. To obtain a synoptic view of such processes, ocean scientists collect data over long time periods. Historically, measurements were continually provided by fixed sensors, e.g., moorings, or gathered from ships. Recently, an increase in the utilization of autonomous underwater vehicles has enabled a more dynamic data acquisition approach. However, we still do not utilize the full capabilities of these vehicles. Here we present algorithms that produce persistent monitoring missions for underwater vehicles by balancing path following accuracy and sampling resolution for a given region of interest, which addresses a pressing need among ocean scientists to efficiently and effectively collect high-value data. More specifically, this paper proposes a path planning algorithm and a speed control algorithm for underwater gliders, which together give informative trajectories for the glider to persistently monitor a patch of ocean. We optimize a cost function that blends two competing factors: maximize the information value along the path, while minimizing deviation from the planned path due to ocean currents. Speed is controlled along the planned path by adjusting the pitch angle of the underwater glider, so that higher resolution samples are collected in areas of higher information value. The resulting paths are closed circuits that can be repeatedly traversed to collect long-term ocean data in dynamic environments. The algorithms were tested during sea trials on an underwater glider operating off the coast of southern California, as well as in Monterey Bay, California. The experimental results show significant improvements in data resolution and path reliability compared to previously executed sampling paths used in the respective regions.
Resumo:
Here we present a sequential Monte Carlo (SMC) algorithm that can be used for any one-at-a-time Bayesian sequential design problem in the presence of model uncertainty where discrete data are encountered. Our focus is on adaptive design for model discrimination but the methodology is applicable if one has a different design objective such as parameter estimation or prediction. An SMC algorithm is run in parallel for each model and the algorithm relies on a convenient estimator of the evidence of each model which is essentially a function of importance sampling weights. Other methods for this task such as quadrature, often used in design, suffer from the curse of dimensionality. Approximating posterior model probabilities in this way allows us to use model discrimination utility functions derived from information theory that were previously difficult to compute except for conjugate models. A major benefit of the algorithm is that it requires very little problem specific tuning. We demonstrate the methodology on three applications, including discriminating between models for decline in motor neuron numbers in patients suffering from neurological diseases such as Motor Neuron disease.
Resumo:
In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utility function considered for precise parameter estimation is based upon the precision of the ABC posterior distribution, which we form efficiently via the ABC rejection algorithm based on pre-computed model simulations. Our focus is on stochastic models and, in particular, we investigate the methodology for Markov process models of epidemics and macroparasite population evolution. The macroparasite example involves a multivariate process and we assess the loss of information from not observing all variables.
Resumo:
A simple and effective down-sample algorithm, Peak-Hold-Down-Sample (PHDS) algorithm is developed in this paper to enable a rapid and efficient data transfer in remote condition monitoring applications. The algorithm is particularly useful for high frequency Condition Monitoring (CM) techniques, and for low speed machine applications since the combination of the high sampling frequency and low rotating speed will generally lead to large unwieldy data size. The effectiveness of the algorithm was evaluated and tested on four sets of data in the study. One set of the data was extracted from the condition monitoring signal of a practical industry application. Another set of data was acquired from a low speed machine test rig in the laboratory. The other two sets of data were computer simulated bearing defect signals having either a single or multiple bearing defects. The results disclose that the PHDS algorithm can substantially reduce the size of data while preserving the critical bearing defect information for all the data sets used in this work even when a large down-sample ratio was used (i.e., 500 times down-sampled). In contrast, the down-sample process using existing normal down-sample technique in signal processing eliminates the useful and critical information such as bearing defect frequencies in a signal when the same down-sample ratio was employed. Noise and artificial frequency components were also induced by the normal down-sample technique, thus limits its usefulness for machine condition monitoring applications.
Resumo:
Monitoring stream networks through time provides important ecological information. The sampling design problem is to choose locations where measurements are taken so as to maximise information gathered about physicochemical and biological variables on the stream network. This paper uses a pseudo-Bayesian approach, averaging a utility function over a prior distribution, in finding a design which maximizes the average utility. We use models for correlations of observations on the stream network that are based on stream network distances and described by moving average error models. Utility functions used reflect the needs of the experimenter, such as prediction of location values or estimation of parameters. We propose an algorithmic approach to design with the mean utility of a design estimated using Monte Carlo techniques and an exchange algorithm to search for optimal sampling designs. In particular we focus on the problem of finding an optimal design from a set of fixed designs and finding an optimal subset of a given set of sampling locations. As there are many different variables to measure, such as chemical, physical and biological measurements at each location, designs are derived from models based on different types of response variables: continuous, counts and proportions. We apply the methodology to a synthetic example and the Lake Eacham stream network on the Atherton Tablelands in Queensland, Australia. We show that the optimal designs depend very much on the choice of utility function, varying from space filling to clustered designs and mixtures of these, but given the utility function, designs are relatively robust to the type of response variable.
Resumo:
A computationally efficient sequential Monte Carlo algorithm is proposed for the sequential design of experiments for the collection of block data described by mixed effects models. The difficulty in applying a sequential Monte Carlo algorithm in such settings is the need to evaluate the observed data likelihood, which is typically intractable for all but linear Gaussian models. To overcome this difficulty, we propose to unbiasedly estimate the likelihood, and perform inference and make decisions based on an exact-approximate algorithm. Two estimates are proposed: using Quasi Monte Carlo methods and using the Laplace approximation with importance sampling. Both of these approaches can be computationally expensive, so we propose exploiting parallel computational architectures to ensure designs can be derived in a timely manner. We also extend our approach to allow for model uncertainty. This research is motivated by important pharmacological studies related to the treatment of critically ill patients.
Resumo:
Between-subject and within-subject variability is ubiquitous in biology and physiology and understanding and dealing with this is one of the biggest challenges in medicine. At the same time it is difficult to investigate this variability by experiments alone. A recent modelling and simulation approach, known as population of models (POM), allows this exploration to take place by building a mathematical model consisting of multiple parameter sets calibrated against experimental data. However, finding such sets within a high-dimensional parameter space of complex electrophysiological models is computationally challenging. By placing the POM approach within a statistical framework, we develop a novel and efficient algorithm based on sequential Monte Carlo (SMC). We compare the SMC approach with Latin hypercube sampling (LHS), a method commonly adopted in the literature for obtaining the POM, in terms of efficiency and output variability in the presence of a drug block through an in-depth investigation via the Beeler-Reuter cardiac electrophysiological model. We show improved efficiency via SMC and that it produces similar responses to LHS when making out-of-sample predictions in the presence of a simulated drug block.