377 results for Applied Statistics
Abstract:
Having the ability to work with complex models can be highly beneficial, but the computational cost of doing so is often large. Complex models often have intractable likelihoods, so methods that directly use the likelihood function are infeasible. In these situations, the benefits of working with likelihood-free methods become apparent. Likelihood-free methods, such as parametric Bayesian indirect likelihood that uses the likelihood of an alternative parametric auxiliary model, have been explored throughout the literature as a good alternative when the model of interest is complex. One of these methods is called the synthetic likelihood (SL), which assumes a multivariate normal approximation to the likelihood of a summary statistic of interest. This paper explores the accuracy and computational efficiency of the Bayesian version of the synthetic likelihood (BSL) approach in comparison to a competitor known as approximate Bayesian computation (ABC) and its sensitivity to its tuning parameters and assumptions. We relate BSL to pseudo-marginal methods and propose to use an alternative SL that uses an unbiased estimator of the exact working normal likelihood when the summary statistic has a multivariate normal distribution. Several applications of varying complexity are considered to illustrate the findings of this paper.
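As a rough illustration of the synthetic likelihood idea described in this abstract, the sketch below estimates the log-likelihood of an observed summary statistic by simulating summaries at a parameter value, fitting a multivariate normal to them, and evaluating its density. This is the standard plug-in estimator, not the unbiased estimator the paper proposes. The toy simulator (`simulate_summary`), the function names, and all numerical settings are assumptions made for illustration only.

```python
import numpy as np


def simulate_summary(theta, rng, n_obs=50):
    """Hypothetical toy model: Normal(theta, 1) data, summarised by mean and variance."""
    data = rng.normal(loc=theta, scale=1.0, size=n_obs)
    return np.array([data.mean(), data.var(ddof=1)])


def synthetic_loglik(theta, observed_summary, simulator, n_sims=200, rng=None):
    """Plug-in synthetic log-likelihood of the observed summary at theta."""
    rng = np.random.default_rng() if rng is None else rng
    # Simulate n_sims summary statistics at theta: array of shape (n_sims, d).
    sims = np.array([simulator(theta, rng) for _ in range(n_sims)])
    mu = sims.mean(axis=0)
    sigma = np.atleast_2d(np.cov(sims, rowvar=False))
    diff = np.atleast_1d(observed_summary) - mu
    # Multivariate normal log-density evaluated at the observed summary.
    _, logdet = np.linalg.slogdet(sigma)
    quad = diff @ np.linalg.solve(sigma, diff)
    return -0.5 * (diff.size * np.log(2.0 * np.pi) + logdet + quad)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    observed = np.array([2.0, 1.0])  # hypothetical observed summary
    for theta in (1.0, 2.0, 3.0):
        print(theta, synthetic_loglik(theta, observed, simulate_summary, rng=rng))
```

In a Bayesian setting this estimate would stand in for the likelihood inside an MCMC acceptance ratio; the number of simulations per evaluation (`n_sims` here) is one of the tuning parameters whose influence the abstract refers to.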
Abstract:
A spatial sampling design that uses pair-copulas is presented that aims to reduce prediction uncertainty by selecting additional sampling locations based on both the spatial configuration of existing locations and the values of the observations at those locations. The novelty of the approach arises in the use of pair-copulas to estimate uncertainty at unsampled locations. Spatial pair-copulas are able to more accurately capture spatial dependence compared to other types of spatial copula models. Additionally, unlike traditional kriging variance, uncertainty estimates from the pair-copula account for influence from measurement values and not just the configuration of observations. This feature is beneficial, for example, for more accurate identification of soil contamination zones where high contamination measurements are located near measurements of varying contamination. The proposed design methodology is applied to a soil contamination example from the Swiss Jura region. A partial redesign of the original sampling configuration demonstrates the potential of the proposed methodology.
Abstract:
Predicting temporal responses of ecosystems to disturbances associated with industrial activities is critical for their management and conservation. However, prediction of ecosystem responses is challenging due to the complexity and potential non-linearities stemming from interactions between system components and multiple environmental drivers. Prediction is particularly difficult for marine ecosystems due to their often highly variable and complex nature and the large uncertainties surrounding their dynamic responses. Consequently, current management of such systems often relies on expert judgement and/or complex quantitative models that consider only a subset of the relevant ecological processes. Hence there exists an urgent need for the development of whole-of-systems predictive models to support decision and policy makers in managing complex marine systems in the context of industry-based disturbances. This paper presents Dynamic Bayesian Networks (DBNs) for predicting the temporal response of a marine ecosystem to anthropogenic disturbances. The DBN provides a visual representation of the problem domain in terms of factors (parts of the ecosystem) and their relationships. These relationships are quantified via Conditional Probability Tables (CPTs), which estimate the variability and uncertainty in the distribution of each factor. The combination of qualitative visual and quantitative elements in a DBN facilitates the integration of a wide array of data, published and expert knowledge, and other models. Such multiple sources are often essential, as a single source of information is rarely sufficient to cover the diverse range of factors relevant to a management task. Here, a DBN model is developed for tropical, annual Halophila and temperate, persistent Amphibolis seagrass meadows to inform dredging management and help meet environmental guidelines. Specifically, the impacts of capital (e.g. new port development) and maintenance (e.g. maintaining channel depths in established ports) dredging are evaluated with respect to the risk of permanent loss, defined as no recovery within 5 years (Environmental Protection Agency guidelines). The model is developed using expert knowledge, existing literature, statistical models of environmental light, and experimental data. The model is then demonstrated in a case study through the analysis of a variety of dredging, environmental and seagrass ecosystem recovery scenarios. In spatial zones significantly affected by dredging, such as the zone of moderate impact, shoot density has a very high probability of being driven to zero by capital dredging due to the duration of such dredging. Here, fast-growing Halophila species can recover; however, the probability of recovery depends on the presence of seed banks. On the other hand, slow-growing Amphibolis meadows have a high probability of suffering permanent loss. However, in the maintenance dredging scenario, due to the shorter duration of dredging, Amphibolis is better able to resist the impacts of dredging. For both types of seagrass meadows, the probability of loss was strongly dependent on the biological and ecological status of the meadow, as well as environmental conditions post-dredging. The ability to predict the ecosystem response under cumulative, non-linear interactions across a complex ecosystem highlights the utility of DBNs for decision support and environmental management.
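To make the role of the conditional probability tables concrete, the following minimal sketch propagates a belief over a single factor (shoot density) through time, conditional on whether dredging is active in each time step. It is a deliberately tiny stand-in for a DBN: the three states, the CPT values, the monthly time step and the two schedules are all hypothetical and are not taken from the paper's model.

```python
import numpy as np

STATES = ["zero", "low", "high"]  # hypothetical shoot-density states

# CPT[d][i, j] = P(next state = j | current state = i, dredging = d).
# All probabilities below are invented for illustration.
CPT = {
    "on": np.array([[1.0, 0.0, 0.0],
                    [0.6, 0.4, 0.0],
                    [0.2, 0.6, 0.2]]),
    "off": np.array([[0.7, 0.3, 0.0],
                     [0.1, 0.5, 0.4],
                     [0.0, 0.1, 0.9]]),
}


def propagate(initial_belief, dredging_schedule):
    """Return the belief over STATES at every time step of the schedule."""
    belief = np.asarray(initial_belief, dtype=float)
    history = [belief]
    for dredging in dredging_schedule:
        # One time slice: marginalise over the current state using the CPT.
        belief = belief @ CPT[dredging]
        history.append(belief)
    return np.array(history)


# Hypothetical scenarios: long capital dredging versus short maintenance dredging.
capital = propagate([0.0, 0.0, 1.0], ["on"] * 6 + ["off"] * 6)
maintenance = propagate([0.0, 0.0, 1.0], ["on"] * 2 + ["off"] * 10)

if __name__ == "__main__":
    print("P(zero) at end of capital dredging:    ", capital[6, 0])
    print("P(zero) at end of maintenance dredging:", maintenance[2, 0])
```

A full DBN would link many such factors (light, seed bank, meadow status) within and between time slices; the sketch only shows how a longer dredging duration pushes more probability mass towards zero shoot density under the assumed tables.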
Abstract:
In this chapter we consider biosecurity surveillance as part of a complex system comprising many different biological, environmental and human factors and their interactions. Modelling and analysis of surveillance strategies should take these complexities into account, and should also facilitate the use and integration of the many different types of information that can provide insight into the system as a whole. After a brief discussion of a range of options, we focus on Bayesian networks for representing such complex systems. We summarize the features of Bayesian networks and describe these in the context of surveillance.
Abstract:
The aim of this paper is to provide a Bayesian formulation of the so-called magnitude-based inference approach to quantifying and interpreting effects, and, in a case study example, to provide accurate probabilistic statements that correspond to the intended magnitude-based inferences. The model is described in the context of a published small-scale athlete study which employed a magnitude-based inference approach to compare the effect of two altitude training regimens (live high-train low (LHTL), and intermittent hypoxic exposure (IHE)) on running performance and blood measurements of elite triathletes. The posterior distributions, and corresponding point and interval estimates, for the parameters and associated effects and comparisons of interest, were estimated using Markov chain Monte Carlo simulations. The Bayesian analysis was shown to provide more direct probabilistic comparisons of treatments and to be able to identify small effects of interest. The approach avoided asymptotic assumptions and overcame issues such as multiple testing. Bayesian analysis of unscaled effects showed a probability of 0.96 that LHTL yields a substantially greater increase in hemoglobin mass than IHE, a 0.93 probability of a substantially greater improvement in running economy and a greater than 0.96 probability that both IHE and LHTL yield a substantially greater improvement in maximum blood lactate concentration compared to a Placebo. The conclusions are consistent with those obtained using a ‘magnitude-based inference’ approach that has been promoted in the field. The paper demonstrates that a fully Bayesian analysis is a simple and effective way of analysing small effects, providing a rich set of results that are straightforward to interpret in terms of probabilistic statements.
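Probabilistic statements of the kind reported above can be read directly off posterior draws. The sketch below assumes that MCMC draws of a treatment difference are already available and that a smallest worthwhile change (`threshold`) has been chosen; both the function name and the example numbers are hypothetical and do not reproduce the study's analysis.

```python
import numpy as np


def magnitude_probabilities(effect_draws, threshold):
    """Posterior probabilities that an effect is substantially positive,
    trivial, or substantially negative, given a smallest worthwhile change."""
    draws = np.asarray(effect_draws, dtype=float)
    p_positive = float(np.mean(draws > threshold))
    p_negative = float(np.mean(draws < -threshold))
    return {
        "substantially_positive": p_positive,
        "trivial": 1.0 - p_positive - p_negative,
        "substantially_negative": p_negative,
    }


if __name__ == "__main__":
    # Stand-in for MCMC output: hypothetical posterior draws of a difference.
    rng = np.random.default_rng(42)
    draws = rng.normal(loc=2.5, scale=1.2, size=10_000)
    print(magnitude_probabilities(draws, threshold=1.0))
```

Because the probabilities are simple proportions of posterior draws, no asymptotic approximation is involved, which matches the motivation given in the abstract.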
Abstract:
This article presents a methodology that integrates cumulative plots with probe vehicle data for estimation of travel time statistics (average, quartile) on urban networks. The integration reduces relative deviation among the cumulative plots so that the classical analytical procedure of defining the area between the plots as the total travel time can be applied. For quartile estimation, a slicing technique is proposed. The methodology is validated with real data from Lucerne, Switzerland and it is concluded that the travel time estimates from the proposed methodology are statistically equivalent to the observed values.
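The classical analytical procedure mentioned above, in which the area between the upstream and downstream cumulative plots is taken as the total travel time, can be sketched as follows. The sketch assumes the two cumulative counts are sampled on a common time grid and that every vehicle counted upstream is also counted downstream within the window; the probe-vehicle integration and the slicing technique for quartiles are not shown, and the function name is hypothetical.

```python
import numpy as np


def average_travel_time(times, upstream_counts, downstream_counts):
    """Average travel time from the area between two cumulative plots.

    Assumes both curves cover the same vehicles (equal first and last counts).
    """
    t = np.asarray(times, dtype=float)
    upstream = np.asarray(upstream_counts, dtype=float)
    downstream = np.asarray(downstream_counts, dtype=float)
    gap = upstream - downstream  # vehicles currently between the two locations
    # Trapezoidal rule: area between the curves, in vehicle-time units.
    total_travel_time = np.sum(0.5 * (gap[1:] + gap[:-1]) * np.diff(t))
    n_vehicles = upstream[-1] - upstream[0]
    return total_travel_time / n_vehicles


if __name__ == "__main__":
    # Hypothetical data: one vehicle per second for 10 s, each taking 5 s.
    t = np.arange(0, 16)
    upstream = np.minimum(t, 10)
    downstream = np.clip(t - 5, 0, 10)
    print(average_travel_time(t, upstream, downstream))
```

In practice detector counting errors make the two curves drift apart, which is the relative deviation the abstract says the probe-vehicle data are used to reduce before this area calculation is applied.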
Abstract:
Purpose – The purpose of this paper is to summarise a successfully defended doctoral thesis. The main purpose of this paper is to provide a summary of the scope, and main issues raised in the thesis so that readers undertaking studies in the same or connected areas may be aware of current contributions to the topic. The secondary aims are to frame the completed thesis in the context of doctoral-level research in project management as well as offer ideas for further investigation which would serve to extend scientific knowledge on the topic. Design/methodology/approach – Research reported in this paper is based on a quantitative study using inferential statistics aimed at better understanding the actual and potential usage of earned value management (EVM) as applied to external projects under contract. Theories uncovered during the literature review were hypothesized and tested using experiential data collected from 145 EVM practitioners with direct experience on one or more external projects under contract that applied the methodology. Findings – The results of this research suggest that EVM is an effective project management methodology. The principles of EVM were shown to be significant positive predictors of project success on contracted efforts and to be a relatively greater positive predictor of project success when using fixed-price versus cost-plus (CP) type contracts. Moreover, EVM's work-breakdown structure (WBS) utility was shown to positively contribute to the formation of project contracts. The contribution was not significantly different between fixed-price and CP contracted projects, with exceptions in the areas of schedule planning and payment planning. EVM's “S” curve benefited the administration of project contracts. The contribution of the S-curve was not significantly different between fixed-price and CP contracted projects. Furthermore, EVM metrics were shown to also be important contributors to the administration of project contracts. 
The relative contribution of EVM metrics to projects under fixed-price versus CP contracts was not significantly different, with one exception in the area of evaluating and processing payment requests. Practical implications – These results have important implications for project practitioners, EVM advocates, as well as corporate and governmental policy makers. EVM should be considered for all projects, not only for its positive contribution to project contract development and administration but also for its contribution to project success, regardless of contract type. Contract type should not be the sole determining factor in the decision whether or not to use EVM. More particularly, the more fixed the contracted project cost, the more the principles of EVM explain the success of the project. EVM mechanics should also be used in all projects regardless of contract type. Payment planning using a WBS should be emphasized in fixed-price contracts using EVM in order to help mitigate performance risk. Schedule planning using a WBS should be emphasized in CP contracts using EVM in order to help mitigate financial risk. Similarly, EVM metrics should be emphasized in fixed-price contracts in evaluating and processing payment requests. Originality/value – This paper provides a summary of cutting-edge research work and a link to the published thesis that researchers can use to help them understand how the research methodology was applied as well as how it can be extended.
Abstract:
This paper addresses the problem of determining optimal designs for biological process models with intractable likelihoods, with the goal of parameter inference. The Bayesian approach is to choose a design that maximises the mean of a utility, and the utility is a function of the posterior distribution. Therefore, its estimation requires likelihood evaluations. However, many problems in experimental design involve models with intractable likelihoods, that is, likelihoods that are neither analytic nor computable in a reasonable amount of time. We propose a novel solution using indirect inference (II), a well-established method in the literature, and the Markov chain Monte Carlo (MCMC) algorithm of Müller et al. (2004). Indirect inference employs an auxiliary model with a tractable likelihood in conjunction with the generative model, the assumed true model of interest, which has an intractable likelihood. Our approach is to estimate a map between the parameters of the generative and auxiliary models, using simulations from the generative model. An II posterior distribution is formed to expedite utility estimation. We also present a modification to the utility that allows the Müller algorithm to sample from a substantially sharpened utility surface, with little computational effort. Unlike competing methods, the II approach can handle complex design problems for models with intractable likelihoods on a continuous design space, with possible extension to many observations. The methodology is demonstrated using two stochastic models: a simple tractable death process used to validate the approach, and a motivating stochastic model for the population evolution of macroparasites.
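The idea of estimating a map between generative and auxiliary parameters from simulations can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the generative model is a simple death process treated as a black-box simulator, the auxiliary model is a binomial whose parameter is estimated by the observed proportion dead, and the map is fitted by polynomial regression. The paper's actual auxiliary models and fitting procedure may differ.

```python
import numpy as np


def simulate_deaths(b, t, n_individuals, n_reps, rng):
    """Toy generative model: each individual dies independently at rate b,
    so deaths by time t are Binomial(n, 1 - exp(-b t))."""
    p_dead = 1.0 - np.exp(-b * t)
    return rng.binomial(n_individuals, p_dead, size=n_reps)


def fit_parameter_map(b_grid, t, n_individuals=50, n_reps=200, degree=3, rng=None):
    """Estimate the map b -> auxiliary parameter by regression on simulations."""
    rng = np.random.default_rng() if rng is None else rng
    # Auxiliary estimate at each b: mean proportion dead over the replicates.
    aux_estimates = np.array([
        simulate_deaths(b, t, n_individuals, n_reps, rng).mean() / n_individuals
        for b in b_grid
    ])
    # A polynomial regression is one simple, assumed choice of smoother.
    return np.poly1d(np.polyfit(b_grid, aux_estimates, degree))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    param_map = fit_parameter_map(np.linspace(0.1, 2.0, 20), t=1.0, rng=rng)
    print("mapped auxiliary parameter at b = 1:", param_map(1.0))
```

Once such a map is available, the tractable auxiliary likelihood evaluated at the mapped parameter can replace the intractable generative likelihood when forming a posterior for a candidate design (here, the observation time `t`).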
Abstract:
A new method for estimating the time to colonization of patients with methicillin-resistant Staphylococcus aureus (MRSA) is developed in this paper. The time to colonization is modelled using a Bayesian smoothing approach for the hazard function. Two prior models are discussed: the first difference prior and the second difference prior. The second difference prior model gives smoother estimates of the hazard function and, when applied to data from an intensive care unit (ICU), clearly shows an increasing hazard up to day 13, then a decreasing hazard. The results demonstrate that the hazard is not constant and provide a useful quantification of the effect of length of stay on the risk of MRSA colonization.
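For orientation, difference priors of this kind are commonly written as Gaussian random-walk priors on a discretised hazard. The form below is a generic sketch, assuming the prior is placed on the log hazard $\eta_t = \log h_t$ for day $t$ with a common smoothing variance $\sigma^2$; the abstract does not state the exact parameterisation used in the paper.

```latex
\begin{align*}
\text{first difference prior:}\quad
  & \eta_t - \eta_{t-1} \sim \mathcal{N}(0, \sigma^2), \\
\text{second difference prior:}\quad
  & \eta_t - 2\eta_{t-1} + \eta_{t-2} \sim \mathcal{N}(0, \sigma^2).
\end{align*}
```

Under this assumed form, the first difference prior penalises changes in level between adjacent days, while the second difference prior penalises changes in slope, which is why it tends to give smoother hazard estimates.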