937 results for likelihood-based inference


Relevance:

90.00%

Abstract:

We describe and evaluate a new estimator of the effective population size (Ne), a critical parameter in evolutionary and conservation biology. This new "SummStat" Ne estimator is based upon the use of summary statistics in an approximate Bayesian computation framework to infer Ne. Simulations of a Wright-Fisher population with known Ne show that the SummStat estimator is useful across a realistic range of individuals and loci sampled, generations between samples, and Ne values. We also address the paucity of information about the relative performance of Ne estimators by comparing the SummStat estimator to two recently developed likelihood-based estimators and a traditional moment-based estimator. The SummStat estimator is the least biased of the four estimators compared. In 32 of 36 parameter combinations investigated using initial allele frequencies drawn from a Dirichlet distribution, it has the lowest bias. The relative mean square error (RMSE) of the SummStat estimator was generally intermediate to the others. All of the estimators had RMSE > 1 when small samples (n = 20, five loci) were collected a generation apart. In contrast, when samples were separated by three or more generations and Ne was less than or equal to 50, the SummStat and likelihood-based estimators all had greatly reduced RMSE. Under the conditions simulated, SummStat confidence intervals were more conservative than those of the likelihood-based estimators and more likely to include the true Ne. The greatest strength of the SummStat estimator is its flexible structure. This flexibility allows it to incorporate any potentially informative summary statistic from population genetic data.

Relevance:

90.00%

Abstract:

Studies of ignorance-driven decision making have been employed to analyse when ignorance should prove advantageous on theoretical grounds, or else to examine whether human behaviour is consistent with an ignorance-driven inference strategy (e.g., the recognition heuristic). In the current study we examine whether, under conditions where such inferences might be expected, the advantages that theoretical analyses predict are evident in human performance data. A single experiment shows that, when asked to make relative wealth judgements, participants reliably use recognition as a basis for their judgements. Their wealth judgements under these conditions are reliably more accurate when some of the target names are unknown than when participants recognize all of the names (a "less-is-more effect"). These results are consistent across a number of variations: the number of options given to participants and the nature of the wealth judgement. A basic model of recognition-based inference predicts these effects.

Relevance:

90.00%

Abstract:

“Fast & frugal” heuristics represent an appealing way of implementing bounded rationality and decision-making under pressure. The recognition heuristic is the simplest and most fundamental of these heuristics. Simulation and experimental studies have shown that this ignorance-driven heuristic inference can prove superior to knowledge-based inference (Borges, Goldstein, Ortmann & Gigerenzer, 1999; Goldstein & Gigerenzer, 2002) and have shown how the heuristic could develop from ACT-R’s forgetting function (Schooler & Hertwig, 2005). Mathematical analyses also demonstrate that, under certain conditions, a “less-is-more effect” will always occur (Goldstein & Gigerenzer, 2002). The further analyses presented in this paper show, however, that these conditions may constitute a special case and that the less-is-more effect in decision-making is subject to the moderating influence of the number of options to be considered and the framing of the question.

Relevance:

90.00%

Abstract:

Many modern statistical applications involve inference for complex stochastic models, where it is easy to simulate from the models, but impossible to calculate likelihoods. Approximate Bayesian computation (ABC) is a method of inference for such models. It replaces calculation of the likelihood by a step which involves simulating artificial data for different parameter values, and comparing summary statistics of the simulated data with summary statistics of the observed data. Here we show how to construct appropriate summary statistics for ABC in a semi-automatic manner. We aim for summary statistics which will enable inference about certain parameters of interest to be as accurate as possible. Theoretical results show that optimal summary statistics are the posterior means of the parameters. Although these cannot be calculated analytically, we use an extra stage of simulation to estimate how the posterior means vary as a function of the data; and we then use these estimates of our summary statistics within ABC. Empirical results show that our approach is a robust method for choosing summary statistics that can result in substantially more accurate ABC analyses than the ad hoc choices of summary statistics that have been proposed in the literature. We also demonstrate advantages over two alternative methods of simulation-based inference.
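The basic ABC step described in this abstract (simulate artificial data for candidate parameter values, then compare summary statistics of simulated and observed data) can be sketched as a rejection sampler. This is a minimal illustration under stated assumptions, not the semi-automatic method of the paper: the toy model Normal(theta, 1), the Uniform(-5, 5) prior, the sample mean as summary statistic, the tolerance, and all function names are hypothetical choices made for the example.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Draw n observations from the toy model Normal(theta, 1) (assumed model)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(data):
    """Summary statistic: the sample mean (an illustrative, ad hoc choice)."""
    return statistics.fmean(data)


def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary is within `tolerance` of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw from the assumed Uniform(-5, 5) prior
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) <= tolerance:  # no likelihood is ever evaluated
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 50, rng)  # pretend these are the real data
posterior_sample = abc_rejection(observed, n_draws=5000, tolerance=0.1, rng=rng)
posterior_mean = statistics.fmean(posterior_sample)
```

The accepted values approximate a sample from the posterior of theta. How good that approximation is depends on the choice of summary statistic and tolerance, which is the problem the abstract addresses.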

Relevance:

90.00%

Abstract:

We extend the random permutation model to obtain the best linear unbiased estimator of a finite population mean accounting for auxiliary variables under simple random sampling without replacement (SRS) or stratified SRS. The proposed method provides a systematic design-based justification for well-known results involving common estimators derived under minimal assumptions that do not require specification of a functional relationship between the response and the auxiliary variables.

Relevance:

90.00%

Abstract:

Background: The temporal and geographical diversification of Neotropical insects remains poorly understood because of the complex changes in geological and climatic conditions that occurred during the Cenozoic. To better understand extant patterns in Neotropical biodiversity, we investigated the evolutionary history of three Neotropical swallowtail Troidini genera (Papilionidae). First, DNA-based species delimitation analyses were conducted to assess species boundaries within Neotropical Troidini using an enlarged fragment of the standard barcode gene. Molecularly delineated species were then used to infer a time-calibrated species-level phylogeny based on a three-gene dataset and Bayesian dating analyses. The corresponding chronogram was used to explore their temporal and geographical diversification through distinct likelihood-based methods. Results: The phylogeny for Neotropical Troidini was well resolved and strongly supported. Molecular dating and biogeographic analyses indicate that the extant lineages of Neotropical Troidini have a late Eocene (33-42 Ma) origin in North America. Two independent lineages (Battus and Euryades + Parides) reached South America via the GAARlandia temporary connection, and later became extinct in North America. They only began substantive diversification during the early Miocene in Amazonia. Macroevolutionary analysis supports the "museum model" of diversification, rather than Pleistocene refugia, as the best explanation for the diversification of these lineages. Conclusions: This study demonstrates that: (i) current Neotropical biodiversity may have originated ex situ; (ii) the GAARlandia bridge was important in facilitating invasions of South America; (iii) colonization of Amazonia initiated the crown diversification of these swallowtails; and (iv) Amazonia is not only a species-rich region but also acted as a sanctuary for the dynamics of this diversity. 
In particular, Amazonia probably allowed the persistence of old lineages and contributed to the steady accumulation of diversity over time with constant net diversification rates, a result that contrasts with previous studies on other South American butterflies.

Relevance:

90.00%

Abstract:

Heterogeneous datasets arise naturally in most applications due to the use of a variety of sensors and measuring platforms. Such datasets can be heterogeneous in terms of the error characteristics and sensor models. Treating such data is most naturally accomplished using a Bayesian or model-based geostatistical approach; however, such methods generally scale rather badly with the size of dataset, and require computationally expensive Monte Carlo based inference. Recently within the machine learning and spatial statistics communities many papers have explored the potential of reduced rank representations of the covariance matrix, often referred to as projected or fixed rank approaches. In such methods the covariance function of the posterior process is represented by a reduced rank approximation which is chosen such that there is minimal information loss. In this paper a sequential Bayesian framework for inference in such projected processes is presented. The observations are considered one at a time which avoids the need for high dimensional integrals typically required in a Bayesian approach. A C++ library, gptk, which is part of the INTAMAP web service, is introduced which implements projected, sequential estimation and adds several novel features. In particular the library includes the ability to use a generic observation operator, or sensor model, to permit data fusion. It is also possible to cope with a range of observation error characteristics, including non-Gaussian observation errors. Inference for the covariance parameters is explored, including the impact of the projected process approximation on likelihood profiles. We illustrate the projected sequential method in application to synthetic and real datasets. Limitations and extensions are discussed. © 2010 Elsevier Ltd.

Relevance:

80.00%

Abstract:

A novel karyotype with 2n = 50, FN = 48, was described for specimens of Thaptomys collected at Una, State of Bahia, Brazil, which are morphologically indistinguishable from Thaptomys nigrita, 2n = 52, FN = 52, found in other localities. It was hence proposed that the 2n = 50 karyotype could belong to a distinct species, cryptic with respect to Thaptomys nigrita, since the chromosomal rearrangements observed, along with the geographic distance, might represent a reproductive barrier between the two forms. Phylogenetic analyses using maximum parsimony and maximum likelihood, based on partial cytochrome b sequences of 1077 bp, were performed to establish the relationships among the individuals with distinct karyotypes along the geographic distribution of the genus; the sample comprised 18 karyotyped specimens of Thaptomys, encompassing 15 haplotypes, from eight different localities of the Atlantic Rainforest. The intra-generic relationships corroborated the distinct diploid numbers, since both phylogenetic reconstructions recovered two monophyletic lineages: a northeastern clade grouping the 2n = 50 karyotype and a southeastern clade with three subclades, grouping the 2n = 52 karyotype. The sequence divergence observed between their individuals ranged from 1.9% to 3.5%.

Relevance:

80.00%

Abstract:

The phylogeny of the Australian legume genus Daviesia was estimated using sequences of the internal transcribed spacers of nuclear ribosomal DNA. Partial congruence was found with previous analyses using morphology, including strong support for monophyly of the genus and for a sister group relationship between the clade D. anceps + D. pachyloma and the rest of the genus. A previously unplaced bird-pollinated species, D. epiphyllum, was well supported as sister to the only other bird-pollinated species in the genus, D. speciosa, indicating a single origin of bird pollination in their common ancestor. Other morphological groups within Daviesia were not supported and require reassessment. A strong and previously unreported sister clade of Daviesia consists of the two monotypic genera Erichsenia and Viminaria. These share phyllode-like leaves and indehiscent fruits. The evolutionary history of cord roots, which have anomalous secondary thickening, was explored using parsimony. Cord roots are limited to three separate clades but have a complex history involving a small number of gains (most likely 0-3) and losses (0-5). The anomalous structure of cord roots (adventitious vascular strands embedded in a parenchymatous matrix) may facilitate nutrient storage, and the roots may be contractile. Both functions may be related to a postfire resprouting adaptation. Alternatively, cord roots may be an adaptation to the low-nutrient lateritic soils of Western Australia. However, tests for association between root type, soil type, and growth habit were equivocal, depending on whether the variables were treated as phylogenetically dependent (insignificant) or independent (significant).

Relevance:

80.00%

Abstract:

We present a novel maximum-likelihood-based algorithm for estimating the distribution of alignment scores from the scores of unrelated sequences in a database search. Using a new method for measuring the accuracy of p-values, we show that our maximum-likelihood-based algorithm is more accurate than existing regression-based and lookup table methods. We explore a more sophisticated way of modeling and estimating the score distributions (using a two-component mixture model and expectation maximization), but conclude that this does not improve significantly over simply ignoring scores with small E-values during estimation. Finally, we measure the classification accuracy of p-values estimated in different ways and observe that inaccurate p-values can, somewhat paradoxically, lead to higher classification accuracy. We explain this paradox and argue that statistical accuracy, not classification accuracy, should be the primary criterion in comparisons of similarity search methods that return p-values that adjust for target sequence length.

Relevance:

80.00%

Abstract:

We use a threshold seemingly unrelated regressions specification to assess whether the Central and East European countries (CEECs) are synchronized in their business cycles to the Euro-area. This specification is useful in two ways: First, it takes into account the common institutional factors and the similarities across CEECs in their process of economic transition. Second, it captures business cycle asymmetries by allowing for the presence of two distinct regimes for the CEECs. As the CEECs are strongly affected by the Euro-area these regimes may be associated with Euro-area expansions and contractions. We discuss representation, estimation by maximum likelihood and inference. The methodology is illustrated by using monthly industrial production in 8 CEECs. The results show that apart from Lithuania the rest of the CEECs experience “normal” growth when the Euro-area contracts and “high” growth when the Euro-area expands. Given that the CEECs are “catching up” with the Euro-area this result shows that most CEECs seem synchronized to the Euro-area cycle. Keywords: Threshold SURE; asymmetry; business cycles; CEECs. JEL classification: C33; C50; E32.

Relevance:

80.00%

Abstract:

In this paper we develop methods for estimation and forecasting in large time-varying parameter vector autoregressive models (TVP-VARs). To overcome computational constraints with likelihood-based estimation of large systems, we rely on Kalman filter estimation with forgetting factors. We also draw on ideas from the dynamic model averaging literature and extend the TVP-VAR so that its dimension can change over time. A final extension lies in the development of a new method for estimating, in a time-varying manner, the parameter(s) of the shrinkage priors commonly used with large VARs. These extensions are operationalized through the use of forgetting factor methods and are, thus, computationally simple. An empirical application involving forecasting inflation, real output, and interest rates demonstrates the feasibility and usefulness of our approach.
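To illustrate why forgetting factors keep the computation simple, here is a minimal scalar sketch of a Kalman filter for a single time-varying regression coefficient. It assumes the common formulation in which the predicted state variance is the previous filtered variance divided by a forgetting factor lam in (0, 1], so no state-noise variance has to be estimated. The function name, default values, and toy data are hypothetical; this is not the paper's large TVP-VAR procedure.

```python
def forgetting_factor_filter(ys, xs, lam=0.95, obs_var=1.0, beta0=0.0, p0=10.0):
    """Filter y_t = x_t * beta_t + noise, tracking a drifting scalar beta_t.

    lam     : forgetting factor (assumed value); smaller values discount old data faster
    obs_var : observation noise variance (assumed known here)
    beta0,p0: prior mean and variance of the coefficient
    """
    beta, p = beta0, p0
    path = []
    for y, x in zip(ys, xs):
        p_pred = p / lam                                 # inflate variance instead of adding state noise
        gain = p_pred * x / (x * x * p_pred + obs_var)   # Kalman gain
        beta = beta + gain * (y - x * beta)              # update with the one-step forecast error
        p = (1.0 - gain * x) * p_pred                    # filtered variance
        path.append(beta)
    return path


# Toy data: the true coefficient jumps from 1 to 3 halfway through the sample.
xs = [1.0] * 200
ys = [1.0] * 100 + [3.0] * 100
path = forgetting_factor_filter(ys, xs)
```

In this toy run the filtered coefficient settles near 1 before the break and moves toward 3 afterwards. A value of lam closer to 1 would adapt more slowly and produce smoother estimates.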

Relevance:

80.00%

Abstract:

Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation and/or inference algorithms. Recent advances in machine learning offer a novel approach to modelling the spatial distribution of petrophysical properties in complex reservoirs as an alternative to geostatistics. The approach is based on semi-supervised learning, which handles both “labelled” observed data and “unlabelled” data, which have no measured value but describe prior knowledge and other relevant data in the form of manifolds in the input space where the modelled property is continuous. The proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic geological features and describe stochastic variability and non-uniqueness of spatial properties. On the other hand, it is able to capture and preserve key spatial dependencies such as connectivity of high permeability geo-bodies, which is often difficult in contemporary petroleum reservoir studies. Semi-supervised SVR as a data-driven algorithm is designed to integrate various kinds of conditioning information and learn dependencies from it. The semi-supervised SVR model is able to balance signal/noise levels and control the prior belief in available data. In this work, a stochastic semi-supervised SVR geomodel is integrated into a Bayesian framework to quantify uncertainty of reservoir production with multiple models fitted to past dynamic observations (production history). Multiple history-matched models are obtained using stochastic sampling and/or MCMC-based inference algorithms, which evaluate the posterior probability distribution. Uncertainty of the model is described by the posterior probability of the model parameters that represent key geological properties: spatial correlation size, continuity strength, smoothness/variability of spatial property distribution.
The developed approach is illustrated with a fluvial reservoir case. The resulting probabilistic production forecasts are described by uncertainty envelopes. The paper compares the performance of the models with different combinations of unknown parameters and discusses sensitivity issues.

Relevance:

80.00%

Abstract:

CodeML (part of the PAML package) implements a maximum likelihood-based approach to detect positive selection on a specific branch of a given phylogenetic tree. While CodeML is widely used, it is very compute-intensive. We present SlimCodeML, an optimized version of CodeML for the branch-site model. Our performance analysis shows that SlimCodeML substantially outperforms CodeML (up to 9.38 times faster), especially for large-scale genomic analyses.

Relevance:

80.00%

Abstract:

We study the contribution of money to business cycle fluctuations in the US, the UK, Japan, and the Euro area using a small scale structural monetary business cycle model. Constrained likelihood-based estimates of the parameters are provided and time instabilities analyzed. Real balances are statistically important for output and inflation fluctuations. Their contribution changes over time. Models giving money no role provide a distorted representation of the sources of cyclical fluctuations, of the transmission of shocks and of the events of the last 40 years.