974 resultados para Data Models
Resumo:
Population size estimation with discrete or nonparametric mixture models is considered, and reliable ways of construction of the nonparametric mixture model estimator are reviewed and set into perspective. Construction of the maximum likelihood estimator of the mixing distribution is done for any number of components up to the global nonparametric maximum likelihood bound using the EM algorithm. In addition, the estimators of Chao and Zelterman are considered with some generalisations of Zelterman’s estimator. All computations are done with CAMCR, a special software developed for population size estimation with mixture models. Several examples and data sets are discussed and the estimators illustrated. Problems using the mixture model-based estimators are highlighted.
Resumo:
With the current concern over climate change, descriptions of how rainfall patterns are changing over time can be useful. Observations of daily rainfall data over the last few decades provide information on these trends. Generalized linear models are typically used to model patterns in the occurrence and intensity of rainfall. These models describe rainfall patterns for an average year but are more limited when describing long-term trends, particularly when these are potentially non-linear. Generalized additive models (GAMS) provide a framework for modelling non-linear relationships by fitting smooth functions to the data. This paper describes how GAMS can extend the flexibility of models to describe seasonal patterns and long-term trends in the occurrence and intensity of daily rainfall using data from Mauritius from 1962 to 2001. Smoothed estimates from the models provide useful graphical descriptions of changing rainfall patterns over the last 40 years at this location. GAMS are particularly helpful when exploring non-linear relationships in the data. Care is needed to ensure the choice of smooth functions is appropriate for the data and modelling objectives. (c) 2008 Elsevier B.V. All rights reserved.
Resumo:
Estimation of population size with missing zero-class is an important problem that is encountered in epidemiological assessment studies. Fitting a Poisson model to the observed data by the method of maximum likelihood and estimation of the population size based on this fit is an approach that has been widely used for this purpose. In practice, however, the Poisson assumption is seldom satisfied. Zelterman (1988) has proposed a robust estimator for unclustered data that works well in a wide class of distributions applicable for count data. In the work presented here, we extend this estimator to clustered data. The estimator requires fitting a zero-truncated homogeneous Poisson model by maximum likelihood and thereby using a Horvitz-Thompson estimator of population size. This was found to work well, when the data follow the hypothesized homogeneous Poisson model. However, when the true distribution deviates from the hypothesized model, the population size was found to be underestimated. In the search of a more robust estimator, we focused on three models that use all clusters with exactly one case, those clusters with exactly two cases and those with exactly three cases to estimate the probability of the zero-class and thereby use data collected on all the clusters in the Horvitz-Thompson estimator of population size. Loss in efficiency associated with gain in robustness was examined based on a simulation study. As a trade-off between gain in robustness and loss in efficiency, the model that uses data collected on clusters with at most three cases to estimate the probability of the zero-class was found to be preferred in general. In applications, we recommend obtaining estimates from all three models and making a choice considering the estimates from the three models, robustness and the loss in efficiency. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
Resumo:
We investigate the performance of phylogenetic mixture models in reducing a well-known and pervasive artifact of phylogenetic inference known as the node-density effect, comparing them to partitioned analyses of the same data. The node-density effect refers to the tendency for the amount of evolutionary change in longer branches of phylogenies to be underestimated compared to that in regions of the tree where there are more nodes and thus branches are typically shorter. Mixture models allow more than one model of sequence evolution to describe the sites in an alignment without prior knowledge of the evolutionary processes that characterize the data or how they correspond to different sites. If multiple evolutionary patterns are common in sequence evolution, mixture models may be capable of reducing node-density effects by characterizing the evolutionary processes more accurately. In gene-sequence alignments simulated to have heterogeneous patterns of evolution, we find that mixture models can reduce node-density effects to negligible levels or remove them altogether, performing as well as partitioned analyses based on the known simulated patterns. The mixture models achieve this without knowledge of the patterns that generated the data and even in some cases without specifying the full or true model of sequence evolution known to underlie the data. The latter result is especially important in real applications, as the true model of evolution is seldom known. We find the same patterns of results for two real data sets with evidence of complex patterns of sequence evolution: mixture models substantially reduced node-density effects and returned better likelihoods compared to partitioning models specifically fitted to these data. We suggest that the presence of more than one pattern of evolution in the data is a common source of error in phylogenetic inference and that mixture models can often detect these patterns even without prior knowledge of their presence in the data. Routine use of mixture models alongside other approaches to phylogenetic inference may often reveal hidden or unexpected patterns of sequence evolution and can improve phylogenetic inference.
Resumo:
Combinations of drugs are increasingly being used for a wide variety of diseases and conditions. A pre-clinical study may allow the investigation of the response at a large number of dose combinations. In determining the response to a drug combination, interest may lie in seeking evidence of synergism, in which the joint action is greater than the actions of the individual drugs, or of antagonism, in which it is less. Two well-known response surface models representing no interaction are Loewe additivity and Bliss independence, and Loewe or Bliss synergism or antagonism is defined relative to these. We illustrate an approach to fitting these models for the case in which the marginal single drug dose-response relationships are represented by four-parameter logistic curves with common upper and lower limits, and where the response variable is normally distributed with a common variance about the dose-response curve. When the dose-response curves are not parallel, the relative potency of the two drugs varies according to the magnitude of the desired effect and the models for Loewe additivity and synergism/antagonism cannot be explicitly expressed. We present an iterative approach to fitting these models without the assumption of parallel dose-response curves. A goodness-of-fit test based on residuals is also described. Implementation using the SAS NLIN procedure is illustrated using data from a pre-clinical study. Copyright © 2007 John Wiley & Sons, Ltd.
Resumo:
Motivation: We compare phylogenetic approaches for inferring functional gene links. The approaches detect independent instances of the correlated gain and loss of pairs of genes from species' genomes. We investigate the effect on results of basing evidence of correlations on two phylogenetic approaches, Dollo parsminony and maximum likelihood (ML). We further examine the effect of constraining the ML model by fixing the rate of gene gain at a low value, rather than estimating it from the data. Results: We detect correlated evolution among a test set of pairs of yeast (Saccharomyces cerevisiae) genes, with a case study of 21 eukaryotic genomes and test data derived from known yeast protein complexes. If the rate at which genes are gained is constrained to be low, ML achieves by far the best results at detecting known functional links. The model then has fewer parameters but it is more realistic by preventing genes from being gained more than once. Availability: BayesTraits by M. Pagel and A. Meade, and a script to configure and repeatedly launch it by D. Barker and M. Pagel, are available at http://www.evolution.reading.ac.uk .
Resumo:
Stephens and Donnelly have introduced a simple yet powerful importance sampling scheme for computing the likelihood in population genetic models. Fundamental to the method is an approximation to the conditional probability of the allelic type of an additional gene, given those currently in the sample. As noted by Li and Stephens, the product of these conditional probabilities for a sequence of draws that gives the frequency of allelic types in a sample is an approximation to the likelihood, and can be used directly in inference. The aim of this note is to demonstrate the high level of accuracy of "product of approximate conditionals" (PAC) likelihood when used with microsatellite data. Results obtained on simulated microsatellite data show that this strategy leads to a negligible bias over a wide range of the scaled mutation parameter theta. Furthermore, the sampling variance of likelihood estimates as well as the computation time are lower than that obtained with importance sampling on the whole range of theta. It follows that this approach represents an efficient substitute to IS algorithms in computer intensive (e.g. MCMC) inference methods in population genetics. (c) 2006 Elsevier Inc. All rights reserved.
Resumo:
An appropriate model of recent human evolution is not only important to understand our own history, but it is necessary to disentangle the effects of demography and selection on genome diversity. Although most genetic data support the view that our species originated recently in Africa, it is still unclear if it completely replaced former members of the Homo genus, or if some interbreeding occurred during its range expansion. Several scenarios of modern human evolution have been proposed on the basis of molecular and paleontological data, but their likelihood has never been statistically assessed. Using DNA data from 50 nuclear loci sequenced in African, Asian and Native American samples, we show here by extensive simulations that a simple African replacement model with exponential growth has a higher probability (78%) as compared with alternative multiregional evolution or assimilation scenarios. A Bayesian analysis of the data under this best supported model points to an origin of our species approximate to 141 thousand years ago (Kya), an exit out-of-Africa approximate to 51 Kya, and a recent colonization of the Americas approximate to 10.5 Kya. We also find that the African replacement model explains not only the shallow ancestry of mtDNA or Y-chromosomes but also the occurrence of deep lineages at some autosomal loci, which has been formerly interpreted as a sign of interbreeding with Homo erectus.
Resumo:
We focus on the comparison of three statistical models used to estimate the treatment effect in metaanalysis when individually pooled data are available. The models are two conventional models, namely a multi-level and a model based upon an approximate likelihood, and a newly developed model, the profile likelihood model which might be viewed as an extension of the Mantel-Haenszel approach. To exemplify these methods, we use results from a meta-analysis of 22 trials to prevent respiratory tract infections. We show that by using the multi-level approach, in the case of baseline heterogeneity, the number of clusters or components is considerably over-estimated. The approximate and profile likelihood method showed nearly the same pattern for the treatment effect distribution. To provide more evidence two simulation studies are accomplished. The profile likelihood can be considered as a clear alternative to the approximate likelihood model. In the case of strong baseline heterogeneity, the profile likelihood method shows superior behaviour when compared with the multi-level model. Copyright (C) 2006 John Wiley & Sons, Ltd.
Resumo:
Models of windblown pollen or spore movement are required to predict gene flow from genetically modified (GM) crops and the spread of fungal diseases. We suggest a simple form for a function describing the distance moved by a pollen grain or fungal spore, for use in generic models of dispersal. The function has power-law behaviour over sub-continental distances. We show that air-borne dispersal of rapeseed pollen in two experiments was inconsistent with an exponential model, but was fitted by power-law models, implying a large contribution from distant fields to the catches observed. After allowance for this 'background' by applying Fourier transforms to deconvolve the mixture of distant and local sources, the data were best fit by power-laws with exponents between 1.5 and 2. We also demonstrate that for a simple model of area sources, the median dispersal distance is a function of field radius and that measurement from the source edge can be misleading. Using an inverse-square dispersal distribution deduced from the experimental data and the distribution of rapeseed fields deduced by remote sensing, we successfully predict observed rapeseed pollen density in the city centres of Derby and Leicester (UK).
Recent developments in genetic data analysis: what can they tell us about human demographic history?
Resumo:
Over the last decade, a number of new methods of population genetic analysis based on likelihood have been introduced. This review describes and explains the general statistical techniques that have recently been used, and discusses the underlying population genetic models. Experimental papers that use these methods to infer human demographic and phylogeographic history are reviewed. It appears that the use of likelihood has hitherto had little impact in the field of human population genetics, which is still primarily driven by more traditional approaches. However, with the current uncertainty about the effects of natural selection, population structure and ascertainment of single-nucleotide polymorphism markers, it is suggested that likelihood-based methods may have a greater impact in the future.
Resumo:
Accelerated failure time models with a shared random component are described, and are used to evaluate the effect of explanatory factors and different transplant centres on survival times following kidney transplantation. Different combinations of the distribution of the random effects and baseline hazard function are considered and the fit of such models to the transplant data is critically assessed. A mixture model that combines short- and long-term components of a hazard function is then developed, which provides a more flexible model for the hazard function. The model can incorporate different explanatory variables and random effects in each component. The model is straightforward to fit using standard statistical software, and is shown to be a good fit to the transplant data. Copyright (C) 2004 John Wiley Sons, Ltd.
Resumo:
1. Demographic models are assuming an important role in management decisions for endangered species. Elasticity analysis and scope for management analysis are two such applications. Elasticity analysis determines the vital rates that have the greatest impact on population growth. Scope for management analysis examines the effects that feasible management might have on vital rates and population growth. Both methods target management in an attempt to maximize population growth. 2. The Seychelles magpie robin Copsychus sechellarum is a critically endangered island endemic, the population of which underwent significant growth in the early 1990s following the implementation of a recovery programme. We examined how the formal use of elasticity and scope for management analyses might have shaped management in the recovery programme, and assessed their effectiveness by comparison with the actual population growth achieved. 3. The magpie robin population doubled from about 25 birds in 1990 to more than 50 by 1995. A simple two-stage demographic model showed that this growth was driven primarily by a significant increase in the annual survival probability of first-year birds and an increase in the birth rate. Neither the annual survival probability of adults nor the probability of a female breeding at age 1 changed significantly over time. 4. Elasticity analysis showed that the annual survival probability of adults had the greatest impact on population growth. There was some scope to use management to increase survival, but because survival rates were already high (> 0.9) this had a negligible effect on population growth. Scope for management analysis showed that significant population growth could have been achieved by targeting management measures at the birth rate and survival probability of first-year birds, although predicted growth rates were lower than those achieved by the recovery programme when all management measures were in place (i.e. 1992-95). 5. Synthesis and applications. We argue that scope for management analysis can provide a useful basis for management but will inevitably be limited to some extent by a lack of data, as our study shows. This means that identifying perceived ecological problems and designing management to alleviate them must be an important component of endangered species management. The corollary of this is that it will not be possible or wise to consider only management options for which there is a demonstrable ecological benefit. Given these constraints, we see little role for elasticity analysis because, when data are available, a scope for management analysis will always be of greater practical value and, when data are lacking, precautionary management demands that as many perceived ecological problems as possible are tackled.
Resumo:
In this review we describe how concepts of shoot apical meristem function have developed over time. The role of the scientist is emphasized, as proposer, receiver and evaluator of ideas about the shoot apical meristem. Models have become increasingly popular over the last 250 years, and we consider their role. They provide valuable grounding for the development of hypotheses, but in addition they have a strong human element and their uptake relies on various degrees of persuasion. The most influential models are probably those that most data support, consolidating them as an insight into reality; but they also work by altering how we see meristems, re-directing us to influence the data we collect and the questions we consider meaningful.