5 resultados para Approximate Bayesian computation
em University of Queensland eSpace - Australia
Resumo:
Inferring the spatial expansion dynamics of invading species from molecular data is notoriously difficult due to the complexity of the processes involved. For these demographic scenarios, genetic data obtained from highly variable markers may be profitably combined with specific sampling schemes and information from other sources using a Bayesian approach. The geographic range of the introduced toad Bufo marinus is still expanding in eastern and northern Australia, in each case from isolates established around 1960. A large amount of demographic and historical information is available on both expansion areas. In each area, samples were collected along a transect representing populations of different ages and genotyped at 10 microsatellite loci. Five demographic models of expansion, differing in the dispersal pattern for migrants and founders and in the number of founders, were considered. Because the demographic history is complex, we used an approximate Bayesian method, based on a rejection-regression algorithm. to formally test the relative likelihoods of the five models of expansion and to infer demographic parameters. A stepwise migration-foundation model with founder events was statistically better supported than other four models in both expansion areas. Posterior distributions supported different dynamics of expansion in the studied areas. Populations in the eastern expansion area have a lower stable effective population size and have been founded by a smaller number of individuals than those in the northern expansion area. Once demographically stabilized, populations exchange a substantial number of effective migrants per generation in both expansion areas, and such exchanges are larger in northern than in eastern Australia. The effective number of migrants appears to be considerably lower than that of founders in both expansion areas. We found our inferences to be relatively robust to various assumptions on marker. demographic, and historical features. The method presented here is the only robust, model-based method available so far, which allows inferring complex population dynamics over a short time scale. It also provides the basis for investigating the interplay between population dynamics, drift, and selection in invasive species.
Resumo:
Testing for simultaneous vicariance across comparative phylogeographic data sets is a notoriously difficult problem hindered by mutational variance, the coalescent variance, and variability across pairs of sister taxa in parameters that affect genetic divergence. We simulate vicariance to characterize the behaviour of several commonly used summary statistics across a range of divergence times, and to characterize this behaviour in comparative phylogeographic datasets having multiple taxon-pairs. We found Tajima's D to be relatively uncorrelated with other summary statistics across divergence times, and using simple hypothesis testing of simultaneous vicariance given variable population sizes, we counter-intuitively found that the variance across taxon pairs in Nei and Li's net nucleotide divergence (pi(net)), a common measure of population divergence, is often inferior to using the variance in Tajima's D across taxon pairs as a test statistic to distinguish ancient simultaneous vicariance from variable vicariance histories. The opposite and more intuitive pattern is found for testing more recent simultaneous vicariance, and overall we found that depending on the timing of vicariance, one of these two test statistics can achieve high statistical power for rejecting simultaneous vicariance, given a reasonable number of intron loci (> 5 loci, 400 bp) and a range of conditions. These results suggest that components of these two composite summary statistics should be used in future simulation-based methods which can simultaneously use a pool of summary statistics to test comparative the phylogeographic hypotheses we consider here.
Resumo:
In this paper we investigate a Bayesian procedure for the estimation of a flexible generalised distribution, notably the MacGillivray adaptation of the g-and-κ distribution. This distribution, described through its inverse cdf or quantile function, generalises the standard normal through extra parameters which together describe skewness and kurtosis. The standard quantile-based methods for estimating the parameters of generalised distributions are often arbitrary and do not rely on computation of the likelihood. MCMC, however, provides a simulation-based alternative for obtaining the maximum likelihood estimates of parameters of these distributions or for deriving posterior estimates of the parameters through a Bayesian framework. In this paper we adopt the latter approach, The proposed methodology is illustrated through an application in which the parameter of interest is slightly skewed.
Resumo:
Quantile computation has many applications including data mining and financial data analysis. It has been shown that an is an element of-approximate summary can be maintained so that, given a quantile query d (phi, is an element of), the data item at rank [phi N] may be approximately obtained within the rank error precision is an element of N over all N data items in a data stream or in a sliding window. However, scalable online processing of massive continuous quantile queries with different phi and is an element of poses a new challenge because the summary is continuously updated with new arrivals of data items. In this paper, first we aim to dramatically reduce the number of distinct query results by grouping a set of different queries into a cluster so that they can be processed virtually as a single query while the precision requirements from users can be retained. Second, we aim to minimize the total query processing costs. Efficient algorithms are developed to minimize the total number of times for reprocessing clusters and to produce the minimum number of clusters, respectively. The techniques are extended to maintain near-optimal clustering when queries are registered and removed in an arbitrary fashion against whole data streams or sliding windows. In addition to theoretical analysis, our performance study indicates that the proposed techniques are indeed scalable with respect to the number of input queries as well as the number of items and the item arrival rate in a data stream.
Resumo:
Objective: It is usual that data collected from routine clinical care is sparse and unable to support the more complex pharmacokinetic (PK) models that may have been reported in previous rich data studies. Informative priors may be a pre-requisite for model development. The aim of this study was to estimate the population PK parameters of sirolimus using a fully Bayesian approach with informative priors. Methods: Informative priors including prior mean and precision of the prior mean were elicited from previous published studies using a meta-analytic technique. Precision of between-subject variability was determined by simulations from a Wishart distribution using MATLAB (version 6.5). Concentration-time data of sirolimus retrospectively collected from kidney transplant patients were analysed using WinBUGS (version 1.3). The candidate models were either one- or two-compartment with first order absorption and first order elimination. Model discrimination was based on computation of the posterior odds supporting the model. Results: A total of 315 concentration-time points were obtained from 25 patients. Most data were clustered at trough concentrations with range of 1.6 to 77 hours post-dose. Using informative priors, either a one- or two-compartment model could be used to describe the data. When a one-compartment model was applied, information was gained from the data for the value of apparent clearance (CL/F = 18.5 L/h), and apparent volume of distribution (V/F = 1406 L) but no information was gained about the absorption rate constant (ka). When a two-compartment model was fitted to the data, the data were informative about CL/F, apparent inter-compartmental clearance, and apparent volume of distribution of the peripheral compartment (13.2 L/h, 20.8 L/h, and 579 L, respectively). The posterior distribution of the volume distribution of central compartment and ka were the same as priors. The posterior odds for the two-compartment model was 8.1, indicating the data supported the two-compartment model. Conclusion: The use of informative priors supported the choice of a more complex and informative model that would otherwise have not been supported by the sparse data.