80 results for Statistical experimental design

in Queensland University of Technology - ePrints Archive


Relevance: 100.00%

Abstract:

Utility functions in Bayesian experimental design are usually based on the posterior distribution. When the posterior is found by simulation, it must be sampled from for each future data set drawn from the prior predictive distribution, so many thousands of posterior distributions are often required. A popular technique in the Bayesian experimental design literature for rapidly obtaining samples from the posterior is importance sampling, using the prior as the importance distribution. However, importance sampling tends to break down when there is a moderate number of experimental observations and/or the model parameter is high dimensional. In this paper we explore the use of Laplace approximations in the design setting to overcome this drawback. Furthermore, we consider using the Laplace approximation to form an importance distribution that is more efficient than the prior. The methodology is motivated by a pharmacokinetic study which investigates the effect of extracorporeal membrane oxygenation on the pharmacokinetics of antibiotics in sheep. The design problem is to find 10 near-optimal plasma sampling times which produce precise estimates of pharmacokinetic model parameters/measures of interest. We consider several different utility functions of interest in these studies, which involve the posterior distribution of parameter functions.
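The prior-as-importance-distribution scheme described above can be sketched in a few lines. The one-parameter Gaussian model, prior, and sample sizes below are illustrative assumptions, not taken from the paper; the conjugate exact posterior mean is computed only to check the approximation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative one-parameter model: y_i ~ N(theta, 1), prior theta ~ N(0, 2^2)
y = rng.normal(1.5, 1.0, size=20)  # one "future" data set

def log_lik(theta):
    # Gaussian log-likelihood of y for each candidate theta (unit noise sd)
    return -0.5 * np.sum((y[None, :] - theta[:, None]) ** 2, axis=1)

# Importance sampling with the prior as the importance distribution:
# weights are proportional to the likelihood
theta = rng.normal(0.0, 2.0, size=5000)   # draws from the prior
logw = log_lik(theta)
w = np.exp(logw - logw.max())
w /= w.sum()
post_mean_is = np.sum(w * theta)

# Conjugate exact posterior mean, for checking the approximation
exact_mean = y.sum() / (len(y) + 1 / 2.0**2)

# Effective sample size: a small ESS signals the breakdown discussed above
ess = 1.0 / np.sum(w**2)
```

With more observations or a higher-dimensional parameter, the weights concentrate on a few draws and the ESS collapses, which is the failure mode the Laplace-based importance distribution is designed to avoid.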

Relevance: 100.00%

Abstract:

In this paper, we present fully Bayesian experimental designs for nonlinear mixed effects models, in which we develop simulation-based optimal design methods to search over both continuous and discrete design spaces. Although Bayesian inference has commonly been performed on nonlinear mixed effects models, there is a lack of research into Bayesian optimal design for such models when searches must be performed over several design variables. This is likely because performing optimal experimental design for nonlinear mixed effects models is much more computationally intensive than performing inference in the Bayesian framework. In this paper, the design problem is to determine the optimal number of subjects and samples per subject, as well as the (near) optimal urine sampling times, for a population pharmacokinetic study in horses, so that the population pharmacokinetic parameters can be precisely estimated subject to cost constraints. The optimal sampling strategies, in terms of the number of subjects and the number of samples per subject, were found to be substantially different between the examples considered in this work, which highlights that designs are problem-dependent and require optimisation using the methods presented in this paper.

Relevance: 100.00%

Abstract:

This paper addresses the problem of determining optimal designs for biological process models with intractable likelihoods, with the goal of parameter inference. The Bayesian approach is to choose a design that maximises the mean of a utility, where the utility is a function of the posterior distribution; its estimation therefore requires likelihood evaluations. However, many problems in experimental design involve models with intractable likelihoods, that is, likelihoods that are neither analytic nor computable in a reasonable amount of time. We propose a novel solution using indirect inference (II), a well-established method in the literature, and the Markov chain Monte Carlo (MCMC) algorithm of Müller et al. (2004). Indirect inference employs an auxiliary model with a tractable likelihood in conjunction with the generative model, the assumed true model of interest, which has an intractable likelihood. Our approach is to estimate a map between the parameters of the generative and auxiliary models, using simulations from the generative model. An II posterior distribution is formed to expedite utility estimation. We also present a modification to the utility that allows the Müller algorithm to sample from a substantially sharpened utility surface, with little computational effort. Unlike competing methods, the II approach can handle complex design problems for models with intractable likelihoods on a continuous design space, with possible extension to many observations. The methodology is demonstrated using two stochastic models: a simple tractable death process used to validate the approach, and a motivating stochastic model for the population evolution of macroparasites.
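The core indirect-inference step — learning a map between generative and auxiliary parameters from simulations — can be sketched with a deliberately simple stand-in: here the "intractable" generative model is just a Poisson process and the auxiliary model is a Gaussian fitted by its sample mean (all choices are illustrative, not the paper's models).

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend the generative model (here simply Poisson) has an intractable
# likelihood; we may only simulate from it.
def simulate(lam, n=100):
    return rng.poisson(lam, size=n)

# Auxiliary model: Gaussian, whose tractable MLE is the sample mean
def fit_auxiliary(data):
    return data.mean()

# Estimate the map (generative parameter -> auxiliary parameter) by
# simulating across a grid and fitting a simple regression
lams = np.linspace(0.5, 10.0, 40)
aux = np.array([fit_auxiliary(simulate(l)) for l in lams])
slope, intercept = np.polyfit(lams, aux, 1)

# Indirect-inference point estimate for "observed" data with lambda = 4:
# fit the auxiliary model, then invert the estimated map
obs = simulate(4.0)
lam_hat = (fit_auxiliary(obs) - intercept) / slope
```

In the paper this map feeds an II posterior inside the Müller MCMC scheme; the sketch only shows the map-estimation and inversion idea.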

Relevance: 100.00%

Abstract:

Background: In health-related research, it is critical not only to demonstrate the efficacy of an intervention, but also to show that its effects are not due to chance or confounding variables. Content: Single case experimental design is a useful quasi-experimental method for achieving these goals when participants and research funds are limited. This type of design has various advantages over group experimental designs, one being the capacity to focus on individual rather than group performance outcomes. Conclusions: This comprehensive review demonstrates the benefits and limitations of single case experimental design, its various design methods, and its data collection and analysis for research purposes.

Relevance: 100.00%

Abstract:

In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utility function considered for precise parameter estimation is based upon the precision of the ABC posterior distribution, which we form efficiently via the ABC rejection algorithm based on pre-computed model simulations. Our focus is on stochastic models and, in particular, we investigate the methodology for Markov process models of epidemics and macroparasite population evolution. The macroparasite example involves a multivariate process and we assess the loss of information from not observing all variables.
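The key computational trick above — forming an ABC posterior by rejection against a bank of pre-computed model simulations, then scoring its precision as the utility — can be sketched on a toy model (an exponential waiting-time model stands in here for the paper's epidemic and macroparasite Markov processes; the prior, tolerance, and summary statistic are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(2)

n = 25  # observations per data set

# Toy simulator: theta is an exponential rate, data are n waiting times
def simulate(theta):
    return rng.exponential(1.0 / theta, size=n)

# Pre-computed bank of simulations, reused for every utility evaluation
thetas = rng.uniform(0.5, 5.0, size=20000)              # prior draws
sims = np.array([simulate(t).mean() for t in thetas])   # summary statistic

def abc_posterior(observed_mean, eps=0.02):
    # ABC rejection: keep prior draws whose simulated summary is close
    keep = np.abs(sims - observed_mean) < eps
    return thetas[keep]

# Utility for one prior-predictive data set: ABC posterior precision
obs = simulate(2.0)
post = abc_posterior(obs.mean())
utility = 1.0 / post.var()
```

Because the simulation bank is built once, each utility evaluation reduces to a cheap nearest-neighbour filter, which is what makes the design search feasible without likelihood evaluations.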

Relevance: 100.00%

Abstract:

The use of Bayesian methodologies for solving optimal experimental design problems has increased. Many of these methods have been found to be computationally intensive for design problems that require a large number of design points. A simulation-based approach is presented that can be used to solve optimal design problems in which one is interested in finding a large number of (near) optimal design points for a small number of design variables. The approach uses lower dimensional parameterisations, consisting of a few design variables, which generate multiple design points. One then simply has to search over a few design variables rather than over a large number of optimal design points, providing substantial computational savings. The methodologies are demonstrated on four applications involving nonlinear models, including the selection of sampling times for pharmacokinetic and heat transfer studies. Several Bayesian design criteria are compared and contrasted, as are several different lower dimensional parameterisation schemes for generating the many design points.
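The lower dimensional parameterisation idea can be sketched concretely: below, 15 sampling times are generated from just two design variables (a start time and a geometric spacing ratio), so the search is over a 2-D grid instead of a 15-D design space. The one-parameter decay model, Fisher-information utility, and grids are illustrative assumptions, not the paper's examples.

```python
import numpy as np

# Illustrative model: exponential decay y(t) = exp(-k t) with unit Gaussian
# noise, so the Fisher information for k at time t is (t * exp(-k t))**2
def information(times, k=0.5):
    return np.sum((times * np.exp(-k * times)) ** 2)

# Lower-dimensional parameterisation: 15 sampling times from two design
# variables, a start time a and a geometric spacing ratio r
def times_from(a, r, n=15):
    return a * r ** np.arange(n)

# Search over (a, r) only, instead of over 15 individual sampling times
grid_a = np.linspace(0.1, 2.0, 40)
grid_r = np.linspace(1.01, 1.5, 40)
best = max(((a, r) for a in grid_a for r in grid_r),
           key=lambda d: information(times_from(*d)))
best_times = times_from(*best)
```

The optimised schedule clusters times near the informative region of the decay curve and comfortably beats a naive equally spaced schedule, while the search itself only touched a 40 × 40 grid.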

Relevance: 100.00%

Abstract:

Aim. A protocol for a new peer-led self-management programme for community-dwelling older people with diabetes in Shanghai, China. Background. The increasing prevalence of type 2 diabetes poses major public health challenges. Appropriate education programmes could help people with diabetes to achieve self-management and better health outcomes. Providing education programmes to the fast-growing number of people with diabetes presents a real challenge to the Chinese healthcare system, which is strained by personnel and funding shortages. Empirical literature and expert opinions suggest that peer education programmes are promising. Design. Quasi-experimental. Methods. This study is a non-equivalent control group design (protocol approved in January 2008). A total of 190 people, with 95 participants in each group, will be recruited from two different, but similar, communities. The programme, based on Social Cognitive Theory, will consist of basic diabetes instruction and social support and self-efficacy-enhancing group activities. Basic diabetes instruction sessions will be delivered by health professionals, whereas the social support and self-efficacy-enhancing group activities will be led by peer leaders. Outcome variables include self-efficacy, social support, self-management behaviours, depressive status, quality of life and healthcare utilization, which will be measured at baseline and at 4 and 12 weeks. Discussion. This theory-based programme tailored to Chinese patients has potential for improving diabetes self-management and subsequent health outcomes. In addition, the delivery mode, through involvement of peer leaders and existing community networks, is especially promising considering the healthcare resource shortage in China.

Relevance: 100.00%

Abstract:

Deterministic computer simulations of physical experiments are now common techniques in science and engineering. Often, physical experiments are too time-consuming, expensive or impossible to conduct, and complex computer models or codes are used in their place; this has led to the study of computer experiments, which are used to investigate many scientific phenomena. A computer experiment consists of a number of runs of the computer code with different input choices. The Design and Analysis of Computer Experiments is a rapidly growing area of statistical experimental design. This thesis investigates some practical issues in the design and analysis of computer experiments and attempts to answer some of the questions faced by experimenters using computer experiments. In particular, the question of how many computer experiments should be run and how they should be augmented is studied, and attention is given to the case where the response is a function over time.
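A standard way to choose the input runs of a computer experiment is a space-filling design such as a Latin hypercube, in which each input range is stratified into as many cells as there are runs, with exactly one run per cell. A minimal sketch (the run count and input dimension below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

def latin_hypercube(n_runs, n_inputs):
    """One random Latin hypercube design on [0, 1]^n_inputs: each input range
    is split into n_runs equal cells, and each cell is used exactly once."""
    u = rng.uniform(size=(n_runs, n_inputs))            # jitter within cells
    perms = np.array([rng.permutation(n_runs)           # one permutation
                      for _ in range(n_inputs)]).T      # per input dimension
    return (perms + u) / n_runs

design = latin_hypercube(10, 3)  # 10 runs of a 3-input computer code
```

Each row of `design` is one input setting for a code run; scaling the columns to the physical input ranges gives the actual experiment plan.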

Relevance: 100.00%

Abstract:

This thesis progresses Bayesian experimental design by developing novel methodologies and extensions to existing algorithms. Through these advancements, this thesis provides solutions to several important and complex experimental design problems, many of which have applications in biology and medicine. This thesis consists of a series of published and submitted papers. In the first paper, we provide a comprehensive literature review on Bayesian design. In the second paper, we discuss methods which may be used to solve design problems in which one is interested in finding a large number of (near) optimal design points. The third paper presents methods for finding fully Bayesian experimental designs for nonlinear mixed effects models, and the fourth paper investigates methods to rapidly approximate the posterior distribution for use in Bayesian utility functions.

Relevance: 100.00%

Abstract:

Big Datasets are endemic, but they are often notoriously difficult to analyse because of their size, heterogeneity, history and quality. The purpose of this paper is to open a discourse on the use of modern experimental design methods to analyse Big Data in order to answer particular questions of interest. By appealing to a range of examples, it is suggested that this perspective on Big Data modelling and analysis has wide generality and advantageous inferential and computational properties. In particular, the principled experimental design approach is shown to provide a flexible framework for analysis that, for certain classes of objectives and utility functions, delivers near equivalent answers compared with analyses of the full dataset under a controlled error rate. It can also provide a formalised method for iterative parameter estimation, model checking, identification of data gaps and evaluation of data quality. Finally, it has the potential to add value to other Big Data sampling algorithms, in particular divide-and-conquer strategies, by determining efficient sub-samples.
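One way to make the design-based sub-sampling idea concrete is a greedy D-optimality criterion for a linear model: repeatedly add the data point that most increases the determinant of the information matrix, then fit on the sub-sample only. The synthetic data, model, and greedy criterion below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(4)

# A large synthetic data set from a linear model (stand-in for "Big Data")
N = 20_000
X = np.column_stack([np.ones(N), rng.normal(size=N), rng.uniform(-1, 1, N)])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=N)

def d_optimal_subsample(X, size, ridge=1e-6):
    """Greedy D-optimal selection: repeatedly add the row that most increases
    det(X_s' X_s); gains come from the matrix determinant lemma."""
    M_inv = np.linalg.inv(ridge * np.eye(X.shape[1]))
    chosen = []
    for _ in range(size):
        # det gain of adding row i is proportional to 1 + x_i' M_inv x_i
        gains = 1.0 + np.einsum('ij,jk,ik->i', X, M_inv, X)
        gains[chosen] = -np.inf          # never pick the same row twice
        i = int(np.argmax(gains))
        chosen.append(i)
        # Sherman-Morrison rank-one update of the inverse information matrix
        u = M_inv @ X[i]
        M_inv -= np.outer(u, u) / (1.0 + X[i] @ u)
    return np.array(chosen)

idx = d_optimal_subsample(X, 200)
beta_sub = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]  # fit on sub-sample only
```

Fitting on 200 well-chosen rows recovers the coefficients of the full 20,000-row data set closely, illustrating the "near equivalent answers from a designed sub-sample" claim in a toy setting.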

Relevance: 100.00%

Abstract:

The climate in the Arctic is changing faster than anywhere else on earth. Poorly understood feedback processes relating to Arctic clouds and aerosol–cloud interactions contribute to a poor understanding of the present changes in the Arctic climate system, and also to a large spread in projections of future climate in the Arctic. The problem is exacerbated by the paucity of research-quality observations in the central Arctic. Improved formulations in climate models require such observations, which can only come from measurements in situ in this difficult-to-reach region with logistically demanding environmental conditions. The Arctic Summer Cloud Ocean Study (ASCOS) was the most extensive central Arctic Ocean expedition with an atmospheric focus during the International Polar Year (IPY) 2007–2008. ASCOS focused on the study of the formation and life cycle of low-level Arctic clouds. ASCOS departed from Longyearbyen on Svalbard on 2 August and returned on 9 September 2008. In transit into and out of the pack ice, four short research stations were undertaken in the Fram Strait: two in open water and two in the marginal ice zone. After traversing the pack ice northward, an ice camp was set up on 12 August at 87°21' N, 01°29' W and remained in operation through 1 September, drifting with the ice. During this time, extensive measurements were taken of atmospheric gas and particle chemistry and physics, mesoscale and boundary-layer meteorology, marine biology and chemistry, and upper ocean physics. ASCOS provides a unique interdisciplinary data set for development and testing of new hypotheses on cloud processes, their interactions with the sea ice and ocean and associated physical, chemical, and biological processes and interactions. 
For example, the first-ever quantitative observation of bubbles in Arctic leads, combined with the unique discovery of marine organic material (polymer gels with an origin in the ocean) inside cloud droplets, suggests the possibility of primary marine organically derived cloud condensation nuclei in Arctic stratocumulus clouds. Direct observations of surface fluxes of aerosols could, however, not explain the observed variability in aerosol concentrations, and the balance between local and remote aerosol sources remains an open question. A lack of cloud condensation nuclei (CCN) was at times a controlling factor in low-level cloud formation, and hence in the impact of clouds on the surface energy budget. ASCOS provided detailed measurements of the surface energy balance from the late-summer melt into the initial autumn freeze-up, and documented the effects of clouds and storms on the surface energy balance during this transition. In addition to such process-level studies, the unique, independent ASCOS data set can be, and is being, used for validation of satellite retrievals, operational models, and reanalysis data sets.

Relevance: 100.00%

Abstract:

Deterministic computer simulation of physical experiments is now a common technique in science and engineering. Often, physical experiments are too time-consuming, expensive or impossible to conduct, and complex computer models or codes are used in their place; this has led to the study of computer experiments, which are used to investigate many scientific phenomena. A computer experiment consists of a number of runs of the computer code with different input choices. The Design and Analysis of Computer Experiments is a rapidly growing area of statistical experimental design. This paper discusses some practical issues in designing a computer simulation and/or experiments for manufacturing systems. A case study approach is reviewed and presented.

Relevance: 100.00%

Abstract:

The research objectives of this thesis were to contribute to Bayesian statistical methodology, specifically to risk assessment methodology and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and to use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might answer particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems: risk assessment analyses for wastewater, and, in a four-dimensional dataset, assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four-dimensional agricultural data. The specific objectives of the research were: to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure incorporating all the experimental uncertainty associated with various constants, thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day’s data from the agricultural dataset in a way which satisfactorily captured the complexities of the data; to build a model for several days’ data, in order to consider how the full data might be modelled; and finally to build a model for the full four-dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations.
This work forms five papers, two of which have been published, two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of those data as an ‘errors-in-variables’ problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead, CAR models were used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so would have allowed little insight into how the treatment effects vary with depth. Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model; the applied contribution was the analysis of moisture over depth and the estimation of the contrast of interest together with its credible intervals.
These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that, for large datasets, the use of WinBUGS becomes problematic because of its highly correlated term-by-term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days’ data, and we show that soil moisture for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with variances that increase with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility, but does not itself give insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first stage being analysed as a set of time series. We see this spatio-temporal interaction model as a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.
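The defining feature of the CAR layered model — neighbours only within the same depth layer — can be sketched as an intrinsic-CAR precision matrix on a small grid. The grid dimensions below are arbitrary illustrative choices, not those of the agricultural dataset.

```python
import numpy as np

# Intrinsic-CAR precision matrix on an nx x ny x nz grid in which sites are
# neighbours only within the same depth layer (the "CAR layered" structure)
nx, ny, nz = 4, 4, 3

def site(i, j, k):
    # Flatten (column, row, depth-layer) coordinates to a single index
    return (k * ny + j) * nx + i

n = nx * ny * nz
Q = np.zeros((n, n))
for k in range(nz):                          # each depth layer is its own block
    for j in range(ny):
        for i in range(nx):
            a = site(i, j, k)
            for di, dj in [(1, 0), (0, 1)]:  # within-layer neighbours only
                if i + di < nx and j + dj < ny:
                    b = site(i + di, j + dj, k)
                    Q[a, a] += 1
                    Q[b, b] += 1
                    Q[a, b] -= 1
                    Q[b, a] -= 1
```

Because `Q` is block-diagonal over depth layers, the spatially structured and unstructured variances can differ freely at each depth, which is exactly the flexibility described above.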

Relevance: 100.00%

Abstract:

Bayesian experimental design is a fast-growing area of research with many real-world applications. As computational power has increased over the years, so has the development of simulation-based design methods, involving algorithms such as Markov chain Monte Carlo, sequential Monte Carlo and approximate Bayesian methods, which allow more complex design problems to be solved. The Bayesian framework provides a unified approach for combining prior information and/or uncertainties regarding the statistical model with a utility function that describes the experimental aims. In this paper, we provide a general overview of the concepts involved in Bayesian experimental design, focusing on some of the more commonly used Bayesian utility functions and methods for their estimation, as well as a number of algorithms used to search over the design space to find the Bayesian optimal design. We also discuss other computational strategies for further research in Bayesian optimal design.
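The core computation surveyed above — estimating an expected utility U(d) by Monte Carlo over the prior predictive, then comparing candidate designs — can be sketched on a toy problem. The decay model, prior, noise level, grid posterior, and candidate sampling times below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy design problem: observe y ~ N(exp(-theta * t), sd^2) at a chosen
# sampling time t, with prior theta ~ U(0.5, 1.5).
# Utility u(d, y) = negative posterior variance of theta.
grid = np.linspace(0.5, 1.5, 201)   # grid approximation to the posterior
sd = 0.02

def posterior_var(y, t):
    # Unnormalised posterior on the grid (flat prior over the grid range)
    like = np.exp(-0.5 * ((y - np.exp(-grid * t)) / sd) ** 2)
    w = like / like.sum()
    m = np.sum(w * grid)
    return np.sum(w * (grid - m) ** 2)

def expected_utility(t, n_sim=2000):
    # Monte Carlo over the prior predictive: draw theta, simulate data, score
    thetas = rng.uniform(0.5, 1.5, n_sim)
    ys = rng.normal(np.exp(-thetas * t), sd)
    return -np.mean([posterior_var(y, t) for y in ys])

u_good, u_poor = expected_utility(1.0), expected_utility(10.0)  # t = 1 vs t = 10
```

Sampling at t = 1, where the decay curve still responds to theta, yields a far higher expected utility than sampling at t = 10, where the signal has flattened out; a design search simply maximises this estimate over t.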