40 results for Approximate Bayesian computation, Posterior distribution, Quantile distribution, Response time data

in University of Queensland eSpace - Australia


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Quantile computation has many applications including data mining and financial data analysis. It has been shown that an ε-approximate summary can be maintained so that, given a quantile query (φ, ε), the data item at rank ⌈φN⌉ may be approximately obtained within the rank error precision εN over all N data items in a data stream or in a sliding window. However, scalable online processing of massive continuous quantile queries with different φ and ε poses a new challenge because the summary is continuously updated with new arrivals of data items. In this paper, first we aim to dramatically reduce the number of distinct query results by grouping a set of different queries into a cluster so that they can be processed virtually as a single query while the precision requirements from users can be retained. Second, we aim to minimize the total query processing costs. Efficient algorithms are developed to minimize the total number of times for reprocessing clusters and to produce the minimum number of clusters, respectively. The techniques are extended to maintain near-optimal clustering when queries are registered and removed in an arbitrary fashion against whole data streams or sliding windows. In addition to theoretical analysis, our performance study indicates that the proposed techniques are indeed scalable with respect to the number of input queries as well as the number of items and the item arrival rate in a data stream.
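
The rank-error guarantee in this abstract can be illustrated with a small offline sketch. It is not the streaming or query-clustering algorithm of the paper: it assumes the whole data set is available and sortable, and the names build_summary and query are hypothetical. Keeping one element every floor(2*eps*N) ranks means any requested rank is at most eps*N away from a kept rank.

```python
import math
import random

def build_summary(data, eps):
    """Offline eps-approximate summary (illustrative, not a streaming algorithm).

    Keeps (rank, value) pairs whose ranks are at most 2*eps*N apart, so every
    rank in 1..N lies within eps*N of some kept rank.
    """
    ordered = sorted(data)
    n = len(ordered)
    step = max(1, math.floor(2 * eps * n))
    ranks = list(range(1, n + 1, step))
    if ranks[-1] != n:
        ranks.append(n)  # always keep the maximum
    return [(r, ordered[r - 1]) for r in ranks], n

def query(summary, n, phi):
    """Return the kept (rank, value) pair closest to the target rank ceil(phi*N)."""
    target = max(1, math.ceil(phi * n))
    return min(summary, key=lambda pair: abs(pair[0] - target))

# Demo on 1000 distinct values in random order.
data = list(range(1000))
random.Random(1).shuffle(data)
summary, n = build_summary(data, eps=0.05)
median_rank, median_value = query(summary, n, phi=0.5)
```

The summary here holds roughly 1/(2*eps) entries regardless of N, which is the reason such summaries are attractive; maintaining one incrementally over a stream is the hard part that the cited work addresses.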

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Inferring the spatial expansion dynamics of invading species from molecular data is notoriously difficult due to the complexity of the processes involved. For these demographic scenarios, genetic data obtained from highly variable markers may be profitably combined with specific sampling schemes and information from other sources using a Bayesian approach. The geographic range of the introduced toad Bufo marinus is still expanding in eastern and northern Australia, in each case from isolates established around 1960. A large amount of demographic and historical information is available on both expansion areas. In each area, samples were collected along a transect representing populations of different ages and genotyped at 10 microsatellite loci. Five demographic models of expansion, differing in the dispersal pattern for migrants and founders and in the number of founders, were considered. Because the demographic history is complex, we used an approximate Bayesian method, based on a rejection-regression algorithm, to formally test the relative likelihoods of the five models of expansion and to infer demographic parameters. A stepwise migration-foundation model with founder events was statistically better supported than the other four models in both expansion areas. Posterior distributions supported different dynamics of expansion in the studied areas. Populations in the eastern expansion area have a lower stable effective population size and have been founded by a smaller number of individuals than those in the northern expansion area. Once demographically stabilized, populations exchange a substantial number of effective migrants per generation in both expansion areas, and such exchanges are larger in northern than in eastern Australia. The effective number of migrants appears to be considerably lower than that of founders in both expansion areas. We found our inferences to be relatively robust to various assumptions on marker, demographic, and historical features.
The method presented here is the only robust, model-based method available so far, which allows inferring complex population dynamics over a short time scale. It also provides the basis for investigating the interplay between population dynamics, drift, and selection in invasive species.
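
The rejection step of an approximate Bayesian method can be sketched on a toy problem. The example below is an assumption-laden illustration, not the demographic model or the rejection-regression algorithm used in the study: it infers the mean of a normal distribution, uses the sample mean as the only summary statistic, and omits the regression adjustment. The function name abc_rejection and all parameter values are hypothetical.

```python
import random
import statistics

def abc_rejection(observed, prior_sampler, simulate, summary, tolerance, n_draws, rng):
    """Plain ABC rejection: keep prior draws whose simulated summary is close to the observed one."""
    target = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)           # draw a parameter from the prior
        simulated = simulate(theta, rng)     # simulate a data set under that parameter
        if abs(summary(simulated) - target) <= tolerance:
            accepted.append(theta)           # accepted draws approximate the posterior
    return accepted

rng = random.Random(0)
# Toy "observed" data: 30 draws from Normal(3, 1); the true mean is treated as unknown.
observed = [rng.gauss(3.0, 1.0) for _ in range(30)]

accepted = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(0.0, 10.0),                        # flat prior on the mean
    simulate=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(30)],  # same size as observed
    summary=statistics.fmean,
    tolerance=0.2,
    n_draws=5000,
    rng=rng,
)
posterior_mean = statistics.fmean(accepted)
```

Model comparison in this framework typically runs the same loop with the model index drawn from a prior as well, and compares how often each model is accepted; that extension is not shown here.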

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Testing for simultaneous vicariance across comparative phylogeographic data sets is a notoriously difficult problem hindered by mutational variance, the coalescent variance, and variability across pairs of sister taxa in parameters that affect genetic divergence. We simulate vicariance to characterize the behaviour of several commonly used summary statistics across a range of divergence times, and to characterize this behaviour in comparative phylogeographic datasets having multiple taxon-pairs. We found Tajima's D to be relatively uncorrelated with other summary statistics across divergence times, and using simple hypothesis testing of simultaneous vicariance given variable population sizes, we counter-intuitively found that the variance across taxon pairs in Nei and Li's net nucleotide divergence (π_net), a common measure of population divergence, is often inferior to using the variance in Tajima's D across taxon pairs as a test statistic to distinguish ancient simultaneous vicariance from variable vicariance histories. The opposite and more intuitive pattern is found for testing more recent simultaneous vicariance, and overall we found that depending on the timing of vicariance, one of these two test statistics can achieve high statistical power for rejecting simultaneous vicariance, given a reasonable number of intron loci (> 5 loci, 400 bp) and a range of conditions. These results suggest that components of these two composite summary statistics should be used in future simulation-based methods which can simultaneously use a pool of summary statistics to test the comparative phylogeographic hypotheses we consider here.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Objective: It is usual that data collected from routine clinical care are sparse and unable to support the more complex pharmacokinetic (PK) models that may have been reported in previous rich data studies. Informative priors may be a pre-requisite for model development. The aim of this study was to estimate the population PK parameters of sirolimus using a fully Bayesian approach with informative priors. Methods: Informative priors including prior mean and precision of the prior mean were elicited from previous published studies using a meta-analytic technique. Precision of between-subject variability was determined by simulations from a Wishart distribution using MATLAB (version 6.5). Concentration-time data of sirolimus retrospectively collected from kidney transplant patients were analysed using WinBUGS (version 1.3). The candidate models were either one- or two-compartment with first order absorption and first order elimination. Model discrimination was based on computation of the posterior odds supporting the model. Results: A total of 315 concentration-time points were obtained from 25 patients. Most data were clustered at trough concentrations with a range of 1.6 to 77 hours post-dose. Using informative priors, either a one- or two-compartment model could be used to describe the data. When a one-compartment model was applied, information was gained from the data for the value of apparent clearance (CL/F = 18.5 L/h), and apparent volume of distribution (V/F = 1406 L) but no information was gained about the absorption rate constant (ka). When a two-compartment model was fitted to the data, the data were informative about CL/F, apparent inter-compartmental clearance, and apparent volume of distribution of the peripheral compartment (13.2 L/h, 20.8 L/h, and 579 L, respectively). The posterior distributions of the volume of distribution of the central compartment and ka were the same as the priors.
The posterior odds for the two-compartment model were 8.1, indicating the data supported the two-compartment model. Conclusion: The use of informative priors supported the choice of a more complex and informative model that would otherwise have not been supported by the sparse data.
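
As a reading aid for the reported posterior odds, the short sketch below converts odds into a posterior model probability. It assumes that only the two candidate models are in play, so their probabilities sum to one; the function names are hypothetical, and the abstract does not state the prior odds used.

```python
def posterior_odds(bayes_factor, prior_odds=1.0):
    """Posterior odds = Bayes factor x prior odds (prior_odds=1.0 means equal prior weight)."""
    return bayes_factor * prior_odds

def odds_to_probability(odds):
    """Convert odds in favour of a model into its probability, assuming two candidate models."""
    return odds / (1.0 + odds)

# Reported posterior odds of 8.1 for the two-compartment model.
p_two_compartment = odds_to_probability(8.1)   # 8.1 / 9.1, about 0.89
p_one_compartment = 1.0 - p_two_compartment    # about 0.11
```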

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The aim of this report is to describe the use of WinBUGS for two datasets that arise from typical population pharmacokinetic studies. The first dataset relates to gentamicin concentration-time data that arose as part of routine clinical care of 55 neonates. The second dataset incorporated data from 96 patients receiving enoxaparin. Both datasets were originally analyzed by using NONMEM. In the first instance, although NONMEM provided reasonable estimates of the fixed effects parameters, it was unable to provide satisfactory estimates of the between-subject variance. In the second instance, the use of NONMEM resulted in the development of a successful model, albeit with limited available information on the between-subject variability of the pharmacokinetic parameters. WinBUGS was used to develop a model for both of these datasets. Model comparison for the enoxaparin dataset was performed by using the posterior distribution of the log-likelihood and a posterior predictive check. The use of WinBUGS supported the same structural models tried in NONMEM. For the gentamicin dataset a one-compartment model with intravenous infusion was developed, and the population parameters including the full between-subject variance-covariance matrix were available. Analysis of the enoxaparin dataset supported a two-compartment model as superior to the one-compartment model, based on the posterior predictive check. Again, the full between-subject variance-covariance matrix parameters were available. Fully Bayesian approaches using MCMC methods, via WinBUGS, can offer added value for analysis of population pharmacokinetic data.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Markov chain Monte Carlo (MCMC) is a methodology that is gaining widespread use in the phylogenetics community and is central to phylogenetic software packages such as MrBayes. An important issue for users of MCMC methods is how to select appropriate values for adjustable parameters such as the length of the Markov chain or chains, the sampling density, the proposal mechanism, and, if Metropolis-coupled MCMC is being used, the number of heated chains and their temperatures. Although some parameter settings have been examined in detail in the literature, others are frequently chosen with more regard to computational time or personal experience with other data sets. Such choices may lead to inadequate sampling of tree space or an inefficient use of computational resources. We performed a detailed study of convergence and mixing for 70 randomly selected, putatively orthologous protein sets with different sizes and taxonomic compositions. Replicated runs from multiple random starting points permit a more rigorous assessment of convergence, and we developed two novel statistics, delta and epsilon, for this purpose. Although likelihood values invariably stabilized quickly, adequate sampling of the posterior distribution of tree topologies took considerably longer. Our results suggest that multimodality is common for data sets with 30 or more taxa and that this results in slow convergence and mixing. However, we also found that the pragmatic approach of combining data from several short, replicated runs into a metachain to estimate bipartition posterior probabilities provided good approximations, and that such estimates were no worse in approximating a reference posterior distribution than those obtained using a single long run of the same length as the metachain. Precision appears to be best when heated Markov chains have low temperatures, whereas chains with high temperatures appear to sample trees with high posterior probabilities only rarely. 
[Bayesian phylogenetic inference; heating parameter; Markov chain Monte Carlo; replicated chains.]

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The tissue distribution kinetics of a highly bound solute, propranolol, was investigated in a heterogeneous organ, the isolated perfused limb, using the impulse-response technique and destructive sampling. The propranolol concentration in muscle, skin, and fat as well as in outflow perfusate was measured up to 30 min after injection. The resulting data were analysed assuming (1) vascular, muscle, skin and fat compartments as well mixed (compartmental model) and (2) using a distributed-in-space model which accounts for the noninstantaneous intravascular mixing and tissue distribution processes but consists only of a vascular and extravascular phase (two-phase model). The compartmental model adequately described propranolol concentration-time data in the three tissue compartments and the outflow concentration-time curve (except for the early mixing phase). In contrast, the two-phase model better described the outflow concentration-time curve but is limited in accounting only for the distribution kinetics in the dominant tissue, the muscle. The two-phase model described the time course of propranolol concentration in muscle tissue well, with parameter estimates similar to those obtained with the compartmental model. The results suggest, first, that the uptake kinetics of propranolol into skin and fat cannot be analysed on the basis of outflow data alone and, second, that the assumption of well-mixed compartments is a valid approximation from a practical point of view (as, e.g., in physiologically based pharmacokinetic modelling). The steady-state distribution volumes of skin and fat were only 16 and 4%, respectively, of that of muscle tissue (16.7 ml), with a higher partition coefficient in fat (6.36) than in skin (2.64) and muscle (2.79). (C) 2000 Elsevier Science B.V. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The paper investigates a Bayesian hierarchical model for the analysis of categorical longitudinal data from a large social survey of immigrants to Australia. Data for each subject are observed on three separate occasions, or waves, of the survey. One of the features of the data set is that observations for some variables are missing for at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response and then subsequent terms are introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and the explanatory variables to be imputed at each iteration of the algorithm, given some appropriate prior distributions. After accounting for significant covariate effects in the model, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In many online applications, we need to maintain quantile statistics for a sliding window on a data stream. The sliding windows in natural form are defined as the most recent N data items. In this paper, we study the problem of estimating quantiles over other types of sliding windows. We present a uniform framework to process quantile queries for time constrained and filter based sliding windows. Our algorithm makes one pass on the data stream and maintains an ε-approximate summary. It uses O((1/ε²) log²(εN)) space, where N is the number of data items in the window. We extend this framework to further process generalized constrained sliding window queries and prove that our technique is applicable for flexible window settings. Our performance study indicates that the space required in practice is much less than the given theoretical bound and the algorithm supports high speed data streams.
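
For contrast with the approximate summary described in this abstract, a naive exact baseline for a time-constrained sliding window can be sketched as follows. It stores every item in the window (space linear in N) and sorts on each query, which is exactly the cost the paper's technique avoids. The class name TimeWindowQuantiles is hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
import math
from collections import deque

class TimeWindowQuantiles:
    """Exact quantiles over the items that arrived within the last `window` time units."""

    def __init__(self, window):
        self.window = window
        self.items = deque()  # (timestamp, value), oldest on the left

    def add(self, timestamp, value):
        self.items.append((timestamp, value))
        # Evict items that have fallen out of the time window.
        while self.items and self.items[0][0] <= timestamp - self.window:
            self.items.popleft()

    def quantile(self, phi):
        """Return the element of rank ceil(phi * N) among the N items in the window."""
        values = sorted(value for _, value in self.items)
        if not values:
            raise ValueError("window is empty")
        rank = max(1, math.ceil(phi * len(values)))
        return values[rank - 1]

# Demo: one item per time unit, window of 10 time units.
w = TimeWindowQuantiles(window=10)
for t in range(20):
    w.add(t, t * 10)
window_median = w.quantile(0.5)
```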

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Activity within motor areas of the cortex begins to increase 1 to 2 s prior to voluntary self-initiated movement (termed the Bereitschaftspotential or readiness potential). There has been much speculation and debate over the precise source of this early premovement activity as it is important for understanding the roles of higher order motor areas in the preparation and readiness for voluntary movement. In this study, we use high-field (3-T) event-related fMRI with high temporal sampling (partial brain volumes every 250 ms) to specifically examine hemodynamic response time courses during the preparation, readiness, and execution of purely self-initiated voluntary movement. Five right-handed healthy volunteers performed a rapid sequential finger-to-thumb movement at self-determined times (12-15 trials). Functional images for each trial were temporally aligned and the averaged time series for each subject was iteratively correlated with a canonical hemodynamic response function progressively shifted in time. This analysis method identified areas of activation without constraining hemodynamic response timing. All subjects showed activation within frontal mesial areas, including supplementary motor area (SMA) and cingulate motor areas, as well as activation in left primary sensorimotor areas. The time courses of hemodynamic responses showed a great deal of variability in shape and timing between subjects; however, four subjects clearly showed earlier relative hemodynamic responses within SMA/cingulate motor areas compared with left primary motor areas. These results provide further evidence that the SMA and cingulate motor areas are major contributors to early stage premovement activity and play an important role in the preparation and readiness for voluntary movement. (C) 2003 Elsevier Inc. All rights reserved.
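
The idea of correlating a time series with a response template that is progressively shifted in time can be sketched on synthetic data. Everything specific below is an assumption for illustration: the gamma-shaped hrf function is a stand-in rather than the canonical function used in the study, the series is noiseless, and the names template, pearson and best_shift are hypothetical. Only the 250 ms sampling interval is taken from the abstract.

```python
import math

DT = 0.25   # seconds per sample (250 ms, as in the abstract)
N = 120     # 30 s of samples

def hrf(t):
    """Hypothetical gamma-shaped response that peaks 5 s after onset."""
    return (t ** 5) * math.exp(-t) if t > 0 else 0.0

def template(shift_samples):
    """Response template delayed by `shift_samples` sampling intervals."""
    return [hrf(i * DT - shift_samples * DT) for i in range(N)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def best_shift(series, max_shift):
    """Try every shift from 0 to max_shift and return (best correlation, best shift)."""
    return max((pearson(series, template(k)), k) for k in range(max_shift + 1))

# Synthetic voxel time series whose response starts 2 s (8 samples) late.
series = template(8)
corr, shift = best_shift(series, max_shift=40)
onset_delay_seconds = shift * DT
```

Comparing the best-fitting shift between regions is what allows relative timing (earlier or later responses) to be read off without fixing the response latency in advance.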

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A data warehouse is a data repository which collects and maintains a large amount of data from multiple distributed, autonomous and possibly heterogeneous data sources. Often the data is stored in the form of materialized views in order to provide fast access to the integrated data. One of the most important decisions in designing a data warehouse is the selection of views for materialization. The objective is to select an appropriate set of views that minimizes the total query response time with the constraint that the total maintenance time for these materialized views is within a given bound. This view selection problem is totally different from the view selection problem under the disk space constraint. In this paper the view selection problem under the maintenance time constraint is investigated. Two efficient, heuristic algorithms for the problem are proposed. The key to devising the proposed algorithms is to define good heuristic functions and to reduce the problem to some well-solved optimization problems. As a result, an approximate solution of the known optimization problem will give a feasible solution of the original problem. (C) 2001 Elsevier Science B.V. All rights reserved.
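
The constrained selection problem stated in this abstract has the flavour of a knapsack problem, and a simple greedy heuristic conveys the trade-off. The sketch below is not one of the paper's two algorithms: it assumes each view has an independent, known saving in query response time and a positive maintenance time, and it ignores dependencies between views. The function name select_views and the sample numbers are hypothetical.

```python
def select_views(views, max_maintenance):
    """Greedy selection by query time saved per unit of maintenance time.

    views: list of (name, query_time_saved, maintenance_time), maintenance_time > 0.
    Returns the chosen view names and the total maintenance time used.
    """
    chosen, used = [], 0.0
    by_ratio = sorted(views, key=lambda v: v[1] / v[2], reverse=True)
    for name, saved, cost in by_ratio:
        if used + cost <= max_maintenance:   # respect the maintenance-time bound
            chosen.append(name)
            used += cost
    return chosen, used

# Hypothetical candidate views: (name, query time saved, maintenance time).
candidates = [("v1", 60, 10), ("v2", 40, 20), ("v3", 45, 5), ("v4", 10, 10)]
chosen, used = select_views(candidates, max_maintenance=25)
```

A ratio-based greedy choice like this can be far from optimal in the worst case, which is one reason heuristic functions and reductions to well-studied optimization problems, as the abstract describes, are of interest.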

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In the picture-word interference task, naming responses are facilitated when a distractor word is orthographically and phonologically related to the depicted object as compared to an unrelated word. We used event-related functional magnetic resonance imaging (fMRI) to investigate the cerebral hemodynamic responses associated with this priming effect. Serial (or independent-stage) and interactive models of word production that explicitly account for picture-word interference effects assume that the locus of the effect is at the level of retrieving phonological codes, a role attributed recently to the left posterior superior temporal cortex (Wernicke's area). This assumption was tested by randomly presenting participants with trials from orthographically related and unrelated distractor conditions and acquiring image volumes coincident with the estimated peak hemodynamic response for each trial. Overt naming responses occurred in the absence of scanner noise, allowing reaction time data to be recorded. Analysis of this data confirmed the priming effect. Analysis of the fMRI data revealed blood oxygen level-dependent signal decreases in Wernicke's area and the right anterior temporal cortex, whereas signal increases were observed in the anterior cingulate, the right orbitomedial prefrontal, somatosensory, and inferior parietal cortices, and the occipital lobe. The results are interpreted as supporting the locus for the facilitation effect as assumed by both classes of theoretical model of word production. In addition, our results raise the possibilities that, counterintuitively, picture-word interference might be increased by the presentation of orthographically related distractors, due to competition introduced by activation of phonologically related word forms, and that this competition requires inhibitory processes to be resolved. The priming effect is therefore viewed as being sufficient to offset the increased interference. 
We conclude that information from functional imaging studies might be useful for constraining theoretical models of word production. (C) 2002 Elsevier Science (USA).