964 results for Bayesian Phylogenetic Inference


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Bayesian phylogenetic analyses are now very popular in systematics and molecular evolution because they allow the use of much more realistic models than currently possible with maximum likelihood methods. There are, however, a growing number of examples in which large Bayesian posterior clade probabilities are associated with very short edge lengths and low values for non-Bayesian measures of support such as nonparametric bootstrapping. For the four-taxon case when the true tree is the star phylogeny, Bayesian analyses become increasingly unpredictable in their preference for one of the three possible resolved tree topologies as data set size increases. This leads to the prediction that hard (or near-hard) polytomies in nature will cause unpredictable behavior in Bayesian analyses, with arbitrary resolutions of the polytomy receiving very high posterior probabilities in some cases. We present a simple solution to this problem involving a reversible-jump Markov chain Monte Carlo (MCMC) algorithm that allows exploration of all of tree space, including unresolved tree topologies with one or more polytomies. The reversible-jump MCMC approach allows prior distributions to place some weight on less-resolved tree topologies, which eliminates misleadingly high posteriors associated with arbitrary resolutions of hard polytomies. Fortunately, assigning some prior probability to polytomous tree topologies does not appear to come with a significant cost in terms of the ability to assess the level of support for edges that do exist in the true tree. Methods are discussed for applying arbitrary prior distributions to tree topologies of varying resolution, and an empirical example showing evidence of polytomies is analyzed and discussed.
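As a rough illustration of why prior weight on the unresolved tree matters, the following minimal sketch applies Bayes' rule to the four-taxon case: one star tree plus the three resolved topologies. It enumerates the topologies directly instead of running reversible-jump MCMC, and every number in it (the marginal likelihoods and both priors) is a hypothetical value chosen for illustration, not a result from the paper.

```python
def topology_posterior(marginal_likelihoods, prior):
    """Apply Bayes' rule over a finite set of tree topologies."""
    joint = {t: marginal_likelihoods[t] * prior[t] for t in marginal_likelihoods}
    total = sum(joint.values())
    return {t: joint[t] / total for t in joint}

# Hypothetical marginal likelihoods (illustrative assumptions only) for the
# star tree and the three resolved unrooted topologies on taxa A, B, C, D.
marginal = {"star": 1.0, "AB|CD": 1.5, "AC|BD": 0.4, "AD|BC": 0.3}

# Prior restricted to fully resolved trees: the star tree gets zero weight.
resolved_only = {"star": 0.0, "AB|CD": 1 / 3, "AC|BD": 1 / 3, "AD|BC": 1 / 3}

# Prior that places half of its mass on the polytomy (an arbitrary choice).
with_polytomy = {"star": 0.5, "AB|CD": 1 / 6, "AC|BD": 1 / 6, "AD|BC": 1 / 6}

post_resolved = topology_posterior(marginal, resolved_only)
post_polytomy = topology_posterior(marginal, with_polytomy)

for name in marginal:
    print(name, round(post_resolved[name], 3), round(post_polytomy[name], 3))
```

With these made-up inputs, the posterior for the favoured resolution drops once the star tree is allowed to compete, which is the qualitative effect the abstract describes.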

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The rate at which a given site in a gene sequence alignment evolves may vary over time. This phenomenon, known as heterotachy, can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, and that the reversible-jump method requires far fewer parameters than conventional mixture models to describe heterotachy and serves to identify the regions of the tree in which it is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
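The idea of summing each site's likelihood over several branch-length sets can be sketched on the smallest possible case. The example below is an assumption-laden stand-in, not the authors' implementation: it uses two taxa only, the Jukes-Cantor (JC69) substitution model, and fixed mixture weights, and all function names are hypothetical.

```python
import math

def jc69_prob(same, t):
    """JC69 probability of ending in the same (or one specific different)
    nucleotide after a branch of length t substitutions per site."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def site_likelihood(x, y, t):
    """Likelihood of one aligned site (x, y) for two taxa separated by total
    branch length t, with equal base frequencies of 1/4."""
    return 0.25 * jc69_prob(x == y, t)

def mixture_log_likelihood(seq1, seq2, branch_sets, weights):
    """Sum each site's likelihood over the branch-length sets (weighted),
    then add the per-site logs across the alignment."""
    total = 0.0
    for x, y in zip(seq1, seq2):
        site = sum(w * site_likelihood(x, y, t)
                   for w, t in zip(weights, branch_sets))
        total += math.log(site)
    return total

# Two branch-length sets: a slow one and a fast one, equally weighted.
print(mixture_log_likelihood("ACGTAC", "ACGAAT", [0.05, 0.8], [0.5, 0.5]))
```

On a real tree each branch-length set would be a full vector of branch lengths and the site likelihood would come from Felsenstein's pruning algorithm; the mixture step itself keeps the same form.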

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Current Bayesian network software packages provide good graphical interfaces for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing, and at times it can be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making user interaction more intuitive and friendly. In addition to performing various types of inferences, users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using the Qt and SMILE libraries in C++.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Indirect inference (II) is a methodology for estimating the parameters of an intractable (generative) model on the basis of an alternative parametric (auxiliary) model that is both analytically and computationally easier to deal with. Such an approach has been well explored in the classical literature but has received substantially less attention in the Bayesian paradigm. The purpose of this paper is to compare and contrast a collection of what we call parametric Bayesian indirect inference (pBII) methods. One class of pBII methods uses approximate Bayesian computation (referred to here as ABC II) where the summary statistic is formed on the basis of the auxiliary model, using ideas from II. Another approach proposed in the literature, referred to here as parametric Bayesian indirect likelihood (pBIL), we show to be a fundamentally different approach to ABC II. We devise new theoretical results for pBIL to give extra insights into its behaviour and also its differences with ABC II. Furthermore, we examine in more detail the assumptions required to use each pBII method. The results, insights and comparisons developed in this paper are illustrated on simple examples and two other substantive applications. The first of the substantive examples involves performing inference for complex quantile distributions based on simulated data while the second is for estimating the parameters of a trivariate stochastic process describing the evolution of macroparasites within a host based on real data. We create a novel framework called Bayesian indirect likelihood (BIL) which encompasses pBII as well as general ABC methods so that the connections between the methods can be established.
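To make the ABC II idea concrete, here is a minimal rejection-sampling sketch in which the summary statistic is the maximum likelihood estimate of an auxiliary model. Everything specific is an assumption made for illustration: the "generative" model is a simple Normal(theta, 1) stand-in (a real application would use an intractable simulator), the auxiliary model is a Gaussian, the prior is flat on [-5, 5], and the tolerance and function names are arbitrary.

```python
import math
import random

def simulate(theta, n, rng):
    """Stand-in generative model: n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def auxiliary_mle(data):
    """Summary statistic: MLE (mean, sd) of a Gaussian auxiliary model."""
    m = sum(data) / len(data)
    var = sum((x - m) ** 2 for x in data) / len(data)
    return m, math.sqrt(var)

def abc_ii_rejection(observed, n_draws, eps, rng):
    """Keep prior draws whose simulated auxiliary estimate lies within eps
    (Euclidean distance) of the auxiliary estimate for the observed data."""
    s_obs = auxiliary_mle(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # flat prior (assumption)
        s_sim = auxiliary_mle(simulate(theta, len(observed), rng))
        if math.dist(s_obs, s_sim) < eps:
            accepted.append(theta)
    return accepted

rng = random.Random(1)
observed = simulate(2.0, 50, rng)  # pretend data with true theta = 2
posterior_draws = abc_ii_rejection(observed, 2000, 0.3, rng)
print(len(posterior_draws), sum(posterior_draws) / len(posterior_draws))
```

The accepted draws approximate the posterior for theta given the auxiliary summary. The pBIL approach discussed in the abstract differs in that it uses the auxiliary likelihood itself as a replacement for the intractable likelihood, which this sketch does not attempt.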

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The complete internal transcribed spacer 1 (ITS1), 5.8S ribosomal DNA, and ITS2 region of the ribosomal DNA from 60 specimens belonging to two closely related bucephalid digeneans (Dollfustrema vaneyi and Dollfustrema hefeiensis) from different localities, hosts, and microhabitat sites were cloned to examine the level of sequence variation and the taxonomic levels at which these markers are useful for species identification and phylogeny estimation. Our data show that these molecular markers can help to discriminate the two species, which are morphologically very close and difficult to separate by classical methods. We found 21 haplotypes defined by 44 polymorphic positions in 38 individuals of D. vaneyi, and 16 haplotypes defined by 43 polymorphic positions in 22 individuals of D. hefeiensis. There are no shared haplotypes between the two species. Haplotype rather than nucleotide diversity is similar between the two species. Phylogenetic analyses reveal two robustly supported clades, one corresponding to D. vaneyi and the other corresponding to D. hefeiensis. However, the population structures of the two species seem to be incongruent and show no geographic or host-specific structure, further indicating that the two species may have had a more complex evolutionary history than expected.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Bryaceae are a large cosmopolitan family of mosses containing genera of considerable taxonomic difficulty. Phylogenetic relationships within the family were inferred using data from chloroplast DNA sequences (rps4 and trnL-trnF region). Parsimony and maximum likelihood optimality criteria, and Bayesian phylogenetic inference procedures were employed to reconstruct relationships. The genera Bryum and Brachymenium are not monophyletic groups. A clade comprising Plagiobryum, Acidodontium, Mielichhoferia macrocarpa, Bryum sects. Bryum, Apalodictyon, Limbata, Leucodontium, Caespiticia, Capillaria (in part: sect. Capillaria), and Brachymenium sect. Dicranobryum, is well supported in all analyses and represents a major lineage within the family. Section Dicranobryum of Brachymenium is more closely related to section Bryum than to the other sections of Brachymenium, as are Mielichhoferia macrocarpa and M. himalayana. Species of Acidodontium form a clade with Anomobryum julaceum. The grouping of species with a rosulate gametophytic growth form suggests the presence of a 'rosulate' clade similar in circumscription to the genus Rosulabryum. Mielichhoferia macrocarpa and M. himalayana are transferred to Bryum as B. porsildii and B. caucasicum, respectively.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

5S rDNA sequences have proven valuable as genetic markers for distinguishing closely related species and for understanding the dynamics of repetitive sequences in genomes. To contribute to knowledge of the evolutionary history of Leporinus (Anostomidae) and to the understanding of 5S rDNA sequence organization in the fish genome, analyses of 5S rDNA sequences were conducted in seven species of this genus. The 5S rRNA gene sequence was highly conserved among Leporinus species, whereas the NTS exhibited high levels of variation related to insertions, deletions, microrepeats, and base substitutions. The phylogenetic analysis of the 5S rDNA sequences clustered the species into two clades that are in agreement with cytogenetic and morphological data.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Markov chain Monte Carlo (MCMC) is a methodology that is gaining widespread use in the phylogenetics community and is central to phylogenetic software packages such as MrBayes. An important issue for users of MCMC methods is how to select appropriate values for adjustable parameters such as the length of the Markov chain or chains, the sampling density, the proposal mechanism, and, if Metropolis-coupled MCMC is being used, the number of heated chains and their temperatures. Although some parameter settings have been examined in detail in the literature, others are frequently chosen with more regard to computational time or personal experience with other data sets. Such choices may lead to inadequate sampling of tree space or an inefficient use of computational resources. We performed a detailed study of convergence and mixing for 70 randomly selected, putatively orthologous protein sets with different sizes and taxonomic compositions. Replicated runs from multiple random starting points permit a more rigorous assessment of convergence, and we developed two novel statistics, delta and epsilon, for this purpose. Although likelihood values invariably stabilized quickly, adequate sampling of the posterior distribution of tree topologies took considerably longer. Our results suggest that multimodality is common for data sets with 30 or more taxa and that this results in slow convergence and mixing. However, we also found that the pragmatic approach of combining data from several short, replicated runs into a metachain to estimate bipartition posterior probabilities provided good approximations, and that such estimates were no worse in approximating a reference posterior distribution than those obtained using a single long run of the same length as the metachain. Precision appears to be best when heated Markov chains have low temperatures, whereas chains with high temperatures appear to sample trees with high posterior probabilities only rarely. 
[Bayesian phylogenetic inference; heating parameter; Markov chain Monte Carlo; replicated chains.]
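The "metachain" estimate described in this abstract amounts to pooling the sampled trees from several replicated runs and counting how often each bipartition appears. The sketch below shows that step only. The data layout (each tree as a set of bipartitions, each bipartition as a frozenset of the taxa on one side) and the function name are assumptions, and burn-in samples are assumed to have been discarded already.

```python
from collections import Counter

def bipartition_posteriors(runs):
    """Pool sampled trees from replicated runs into one metachain and return
    the fraction of pooled samples that contain each bipartition."""
    pooled = [tree for run in runs for tree in run]
    counts = Counter(split for tree in pooled for split in tree)
    return {split: c / len(pooled) for split, c in counts.items()}

# Toy samples for four taxa: each sampled tree has one internal bipartition.
ab = frozenset({"A", "B"})
ac = frozenset({"A", "C"})
run1 = [{ab}, {ab}, {ac}]
run2 = [{ab}, {ac}, {ac}, {ab}, {ab}]

estimates = bipartition_posteriors([run1, run2])
print(estimates[ab], estimates[ac])  # 0.625 0.375
```

Because every pooled sample counts equally, longer runs contribute proportionally more to the estimate than shorter ones.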

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Acknowledgements We wish to express our gratitude to the National Geographic Society and the National Research Foundation of South Africa for funding the discovery, recovery, and analysis of the H. naledi material. The study reported here was also made possible by grants from the Social Sciences and Humanities Research Council of Canada, the Canada Foundation for Innovation, the British Columbia Knowledge Development Fund, the Canada Research Chairs Program, Simon Fraser University, the DST/NRF Centre of Excellence in Palaeosciences (COE-Pal), as well as by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada, a Young Scientist Development Grant from the Paleontological Scientific Trust (PAST), a Baldwin Fellowship from the L.S.B. Leakey Foundation, and a Seed Grant and a Cornerstone Faculty Fellowship from the Texas A&M University College of Liberal Arts. We would like to thank the South African Heritage Resource Agency for the permits necessary to work on the Rising Star site; the Jacobs family for granting access; Wilma Lawrence, Bonita De Klerk, Merrill Van der Walt, and Justin Mukanku for their assistance during all phases of the project; Lucas Delezene for valuable discussion on the dental characters of H. naledi. We would also like to thank Peter Schmid for the preparation of the Dinaledi fossil material; Yoel Rak for explaining in detail some of the characters used in previous studies; William Kimbel for drawing our attention to the possibility that there might be a problem with Dembo et al.’s (2015) codes for the two characters related to the articular eminence; Will Stein for helpful discussion about the Bayesian analyses; Mike Lee for his comments on this manuscript; John Hawks for his support in organizing the Rising Star workshop; and the associate editor and three anonymous reviewers for their valuable comments. We are grateful to S. Potze and the Ditsong Museum, B. 
Billings and the School of Anatomical Sciences at the University of the Witwatersrand, and B. Zipfel and the Evolutionary Studies Institute at the University of the Witwatersrand for providing access to the specimens in their care; the University of the Witwatersrand, the Evolutionary Studies Institute, and the South African National Centre of Excellence in PalaeoSciences for hosting a number of the authors while studying the material; and the Western Canada Research Grid for providing access to the high-performance computing facilities for the Bayesian analyses. Last but definitely not least, we thank the head of the Rising Star project, Lee Berger, for his leadership and support, and for encouraging us to pursue the study reported here.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Phylogenetic inference is the search for an evolutionary tree that best explains the genealogical relationships among a set of species. Phylogenetic analysis has many applications in areas such as biology, ecology, and paleontology. Several criteria have been defined for inferring phylogenies, among them maximum parsimony and maximum likelihood. The first seeks the phylogenetic tree that minimizes the number of evolutionary steps needed to describe the evolutionary history of the species, while the second seeks the tree with the highest probability of producing the observed data under an evolutionary model. The search for a phylogenetic tree can be formulated as a multi-objective optimization problem that aims to find trees satisfying both the parsimony and the likelihood criteria simultaneously, as far as possible. Because these criteria differ, there is generally no single optimal solution (a single tree) but a set of compromise solutions, called "Pareto optimal" solutions. Evolutionary algorithms are now used successfully to find these solutions. These algorithms are a family of inexact techniques inspired by the process of natural selection, and they usually find high-quality solutions to complex optimization problems. They work by manipulating a set of trial solutions (trees, in the phylogenetic case) with operators: some exchange information between solutions, simulating DNA crossover, and others apply random modifications, simulating mutation. The result is an approximation to the Pareto-optimal set, which can be displayed in a graph so that the domain expert (the biologist, in the case of inference) can choose the compromise solution of greatest interest.
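The notion of a Pareto-optimal set for the two criteria can be sketched as a simple dominance filter. This is a minimal illustration, not the MO-Phylogenetics implementation: each candidate tree is reduced to a hypothetical (parsimony score, log-likelihood) pair, the scores are invented, and the function names are assumptions.

```python
def dominates(a, b):
    """Return True if tree a dominates tree b, where each tree is given as
    (parsimony_score, log_likelihood): lower parsimony and higher
    log-likelihood are better."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

def pareto_front(trees):
    """Keep the trees that no other tree dominates."""
    return [t for t in trees if not any(dominates(o, t) for o in trees)]

# Invented scores for five candidate trees.
candidates = [
    (120, -3400.0),
    (118, -3420.0),
    (125, -3390.0),
    (121, -3425.0),  # worse than (120, -3400.0) on both criteria
    (119, -3410.0),
]

front = pareto_front(candidates)
print(front)
```

An evolutionary algorithm would repeatedly generate new candidate trees by crossover and mutation and apply a filter of this kind (usually with additional diversity measures) to decide which trees survive.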
For multi-objective optimization applied to phylogenetic inference, there is an open-source software tool, MO-Phylogenetics, designed to solve inference problems with both classic and state-of-the-art evolutionary algorithms.
REFERENCES
[1] C.A. Coello Coello, G.B. Lamont, D.A. van Veldhuizen. Evolutionary Algorithms for Solving Multi-Objective Problems. Springer. August 2007.
[2] C. Zambrano-Vega, A.J. Nebro, J.F. Aldana-Montes. MO-Phylogenetics: a phylogenetic inference software tool with multi-objective evolutionary metaheuristics. Methods in Ecology and Evolution. In press. February 2016.