55 results for PHYLOGENETIC INFERENCE
in CentAUR: Central Archive University of Reading - UK
Abstract:
The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon, known as heterotachy, can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and that it serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
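The per-site summation over branch-length sets can be illustrated with a minimal sketch. It assumes the simplest possible case: two taxa joined by a single branch under the Jukes-Cantor (JC69) model, so that each "branch-length set" is one number. The function names, weights and example values are hypothetical and are not taken from the authors' software, which also uses reversible-jump MCMC to decide where extra branch lengths are needed.

```python
import math

def jc69_site_likelihood(same, t):
    """Likelihood of one aligned site for two taxa separated by branch length t.

    same: True if both taxa show the same base at this site.
    Uses JC69 transition probabilities and equal base frequencies (0.25).
    """
    e = math.exp(-4.0 * t / 3.0)
    p = 0.25 + 0.75 * e if same else 0.25 - 0.25 * e
    return 0.25 * p

def mixture_log_likelihood(sites, branch_sets, weights):
    """Sum each site's likelihood over all branch-length sets, then take logs.

    branch_sets: one branch length per mixture component (toy two-taxon case).
    weights: mixture weights, assumed to sum to one.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to one")
    total = 0.0
    for same in sites:
        site_lik = sum(w * jc69_site_likelihood(same, t)
                       for w, t in zip(weights, branch_sets))
        total += math.log(site_lik)
    return total

# Toy alignment: 8 identical sites and 2 differing sites (illustrative only).
sites = [True] * 8 + [False] * 2
single = mixture_log_likelihood(sites, [0.3], [1.0])
mixed = mixture_log_likelihood(sites, [0.05, 1.5], [0.6, 0.4])
print(f"one branch-length set: {single:.4f}")
print(f"two branch-length sets: {mixed:.4f}")
```

Because the sum is taken inside the logarithm for every site, no site has to be assigned to a branch-length set in advance; each site is weighted across all sets.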
Abstract:
We present the first assessment of phylogenetic utility of a potential novel low-copy nuclear gene region in flowering plants. A fragment of the MORE AXILLARY GROWTH 4 gene (MAX4, also known as RAMOSUS1 and DECREASED APICAL DOMINANCE1), predicted to span two introns, was isolated from members of Digitalis/Isoplexis. Phylogenetic analyses, under both maximum parsimony and Bayesian inference, were performed and revealed evidence of putative MAX4-like paralogues. The MAX4-like trees were compared with those obtained for Digitalis/Isoplexis using ITS and trnL-F, revealing a high degree of incongruence between these different DNA regions. Network analyses indicate complex patterns of evolution between the MAX4 sequences, which cannot be adequately represented on bifurcating trees. The incidence of paralogy restricts the use of MAX4 in phylogenetic inference within the study group, although MAX4 could potentially be used in combination with other DNA regions for resolving species relationships in cases where paralogues can be clearly identified.
Abstract:
We investigate the performance of phylogenetic mixture models in reducing a well-known and pervasive artifact of phylogenetic inference known as the node-density effect, comparing them to partitioned analyses of the same data. The node-density effect refers to the tendency for the amount of evolutionary change in longer branches of phylogenies to be underestimated compared to that in regions of the tree where there are more nodes and thus branches are typically shorter. Mixture models allow more than one model of sequence evolution to describe the sites in an alignment without prior knowledge of the evolutionary processes that characterize the data or how they correspond to different sites. If multiple evolutionary patterns are common in sequence evolution, mixture models may be capable of reducing node-density effects by characterizing the evolutionary processes more accurately. In gene-sequence alignments simulated to have heterogeneous patterns of evolution, we find that mixture models can reduce node-density effects to negligible levels or remove them altogether, performing as well as partitioned analyses based on the known simulated patterns. The mixture models achieve this without knowledge of the patterns that generated the data and even in some cases without specifying the full or true model of sequence evolution known to underlie the data. The latter result is especially important in real applications, as the true model of evolution is seldom known. We find the same patterns of results for two real data sets with evidence of complex patterns of sequence evolution: mixture models substantially reduced node-density effects and returned better likelihoods compared to partitioning models specifically fitted to these data. 
We suggest that the presence of more than one pattern of evolution in the data is a common source of error in phylogenetic inference and that mixture models can often detect these patterns even without prior knowledge of their presence in the data. Routine use of mixture models alongside other approaches to phylogenetic inference may often reveal hidden or unexpected patterns of sequence evolution and can improve phylogenetic inference.
Abstract:
We describe a general likelihood-based 'mixture model' for inferring phylogenetic trees from gene-sequence or other character-state data. The model accommodates cases in which different sites in the alignment evolve in qualitatively distinct ways, but does not require prior knowledge of these patterns or partitioning of the data. We call this qualitative variability in the pattern of evolution across sites "pattern-heterogeneity" to distinguish it from both a homogeneous process of evolution and from one characterized principally by differences in rates of evolution. We present studies to show that the model correctly retrieves the signals of pattern-heterogeneity from simulated gene-sequence data, and we apply the method to protein-coding genes and to a ribosomal 12S data set. The mixture model outperforms conventional partitioning in both these data sets. We implement the mixture model such that it can simultaneously detect rate- and pattern-heterogeneity. The model simplifies to a homogeneous model or a rate-variability model as special cases, and therefore always performs at least as well as these two approaches, and often considerably improves upon them. We make the model available within a Bayesian Markov-chain Monte Carlo framework for phylogenetic inference, as an easy-to-use computer program.
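The structure of such a mixture likelihood can be written compactly. The notation here is an assumption of this note, not quoted from the paper: D_i is the data at site i of n sites, T is the tree, Q_j is the j-th of J candidate substitution-rate matrices, and w_j is its weight.

```latex
L(D \mid T) \;=\; \prod_{i=1}^{n} \sum_{j=1}^{J} w_j \, P(D_i \mid Q_j, T),
\qquad \sum_{j=1}^{J} w_j = 1 .
```

With J = 1 the sum has a single term and the expression reduces to the ordinary homogeneous likelihood, which matches the abstract's statement that the homogeneous model is a special case.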
Abstract:
The Bryaceae are a large cosmopolitan family of mosses containing genera of considerable taxonomic difficulty. Phylogenetic relationships within the family were inferred using data from chloroplast DNA sequences (rps4 and trnL-trnF region). Parsimony and maximum likelihood optimality criteria, and Bayesian phylogenetic inference procedures were employed to reconstruct relationships. The genera Bryum and Brachymenium are not monophyletic groups. A clade comprising Plagiobryum, Acidodontium, Mielichhoferia macrocarpa, Bryum sects. Bryum, Apalodictyon, Limbata, Leucodontium, Caespiticia, Capillaria (in part: sect. Capillaria), and Brachymenium sect. Dicranobryum, is well supported in all analyses and represents a major lineage within the family. Section Dicranobryum of Brachymenium is more closely related to section Bryum than to the other sections of Brachymenium, as are Mielichhoferia macrocarpa and M. himalayana. Species of Acidodontium form a clade with Anomobryum julaceum. The grouping of species with a rosulate gametophytic growth form suggests the presence of a 'rosulate' clade similar in circumscription to the genus Rosulabryum. Mielichhoferia macrocarpa and M. himalayana are transferred to Bryum as B. porsildii and B. caucasicum, respectively.
Abstract:
The node-density effect is an artifact of phylogeny reconstruction that can cause branch lengths to be underestimated in areas of the tree with fewer taxa. Webster, Payne, and Pagel (2003, Science 301:478) introduced a statistical procedure (the "delta" test) to detect this artifact, and here we report the results of computer simulations that examine the test's performance. In a sample of 50,000 random data sets, we find that the delta test detects the artifact in 94.4% of cases in which it is present. When the artifact is not present (n = 10,000 simulated data sets) the test showed a type I error rate of approximately 1.69%, incorrectly reporting the artifact in 169 data sets. Three measures of tree shape or "balance" failed to predict the size of the node-density effect. This may reflect the relative homogeneity of our randomly generated topologies, but emphasizes that nearly any topology can suffer from the artifact, the effect not being confined only to highly unevenly sampled or otherwise imbalanced trees. The ability to screen phylogenies for the node-density artifact is important for phylogenetic inference and for researchers using phylogenetic trees to infer evolutionary processes, including their use in molecular clock dating. [Delta test; molecular clock; molecular evolution; node-density effect; phylogenetic reconstruction; speciation; simulation.]
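The reported type I error rate follows directly from the counts in the abstract (169 false detections in 10,000 artifact-free data sets). The short sketch below reproduces that arithmetic and adds a normal-approximation 95% interval. The interval is an illustration added here, not a figure reported by the authors, and the function name is hypothetical.

```python
import math

def error_rate_with_interval(false_positives, n_trials, z=1.96):
    """Return the observed rate and a normal-approximation interval around it."""
    p = false_positives / n_trials
    se = math.sqrt(p * (1.0 - p) / n_trials)  # binomial standard error
    return p, p - z * se, p + z * se

rate, low, high = error_rate_with_interval(169, 10_000)
print(f"type I error rate: {rate:.4f}")
print(f"approx. 95% interval: {low:.4f} to {high:.4f}")
```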
Abstract:
Resolving the relationships between Metazoa and other eukaryotic groups as well as between metazoan phyla is central to the understanding of the origin and evolution of animals. The current view is based on limited data sets, either a single gene with many species (e.g., ribosomal RNA) or many genes but with only a few species. Because a reliable phylogenetic inference simultaneously requires numerous genes and numerous species, we assembled a very large data set containing 129 orthologous proteins (~30,000 aligned amino acid positions) for 36 eukaryotic species. Included in the alignments are data from the choanoflagellate Monosiga ovata, obtained through the sequencing of about 1,000 cDNAs. We provide conclusive support for choanoflagellates as the closest relative of animals and for fungi as the second closest. The monophyly of Plantae and chromalveolates was recovered but without strong statistical support. Within animals, in contrast to the monophyly of Coelomata observed in several recent large-scale analyses, we recovered a paraphyletic Coelomata, with nematodes and platyhelminths nested within. To include a diverse sample of organisms, data from EST projects were used for several species, resulting in a large amount of missing data in our alignment (about 25%). By using different approaches, we verify that the inferred phylogeny is not sensitive to these missing data. Therefore, this large data set provides a reliable phylogenetic framework for studying eukaryotic and animal evolution and will be easily extendable when large amounts of sequence information become available from a broader taxonomic range.
Abstract:
Background: Concerted evolution is normally used to describe parallel changes at different sites in a genome, but it is also observed in languages where a specific phoneme changes to the same other phoneme in many words in the lexicon—a phenomenon known as regular sound change. We develop a general statistical model that can detect concerted changes in aligned sequence data and apply it to study regular sound changes in the Turkic language family. Results: Linguistic evolution, unlike the genetic substitutional process, is dominated by events of concerted evolutionary change. Our model identified more than 70 historical events of regular sound change that occurred throughout the evolution of the Turkic language family, while simultaneously inferring a dated phylogenetic tree. Including regular sound changes yielded an approximately 4-fold improvement in the characterization of linguistic change over a simpler model of sporadic change, improved phylogenetic inference, and returned more reliable and plausible dates for events on the phylogenies. The historical timings of the concerted changes closely follow a Poisson process model, and the sound transition networks derived from our model mirror linguistic expectations. Conclusions: We demonstrate that a model with no prior knowledge of complex concerted or regular changes can nevertheless infer the historical timings and genealogical placements of events of concerted change from the signals left in contemporary data. Our model can be applied wherever discrete elements—such as genes, words, cultural trends, technologies, or morphological traits—can change in parallel within an organism or other evolving group.
Abstract:
Bayesian inference has been used to determine rigorous estimates of hydroxyl radical concentrations ([OH]) and air mass dilution rates (K) averaged following air masses between linked observations of nonmethane hydrocarbons (NMHCs) spanning the North Atlantic during the Intercontinental Transport and Chemical Transformation (ITCT)-Lagrangian-2K4 experiment. The Bayesian technique obtains a refined (posterior) distribution of a parameter given data related to the parameter through a model and prior beliefs about the parameter distribution. Here, the model describes hydrocarbon loss through OH reaction and mixing with a background concentration at rate K. The Lagrangian experiment provides direct observations of hydrocarbons at two time points, removing assumptions regarding composition or sources upstream of a single observation. The estimates are sharpened by using many hydrocarbons with different reactivities and accounting for their variability and measurement uncertainty. A novel technique is used to construct prior background distributions of many species, described by variation of a single parameter. This exploits the high correlation of species, related by the first principal component of many NMHC samples. The Bayesian method obtains posterior estimates of [OH], K and this background parameter following each air mass. Median [OH] values are typically between 0.5 and 2.0 × 10^6 molecules cm^-3, but are elevated to between 2.5 and 3.5 × 10^6 molecules cm^-3 in low-level pollution. A comparison of estimates from absolute NMHC concentrations and NMHC ratios assuming zero background (the “photochemical clock” method) shows similar distributions but reveals systematic high bias in the estimates from ratios. Estimates of K are ∼0.1 day^-1 but show more sensitivity to the prior distribution assumed.
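A rough sketch of this kind of inference is given below, under several loudly stated assumptions. The loss model is taken to be dC/dt = -k·[OH]·C - K·(C - C_bg) with constant coefficients, which is one plausible reading of "OH reaction and mixing with a background concentration at rate K" and not necessarily the authors' exact formulation. The dilution rate K is held fixed rather than inferred, the prior on [OH] is flat over a grid, errors are Gaussian, and every rate constant and concentration is an illustrative placeholder, not a measurement.

```python
import math

SECONDS_PER_DAY = 86400.0

def evolve(c0, k_oh, oh, dilution, c_bg, t):
    """Closed-form solution of dC/dt = -k_oh*oh*C - dilution*(C - c_bg)."""
    loss = k_oh * oh + dilution            # total first-order loss rate (s^-1)
    c_eq = dilution * c_bg / loss          # steady-state concentration
    return c_eq + (c0 - c_eq) * math.exp(-loss * t)

def oh_posterior(observations, oh_grid, dilution, t, rel_sigma=0.1):
    """Grid posterior for [OH] with a flat prior and Gaussian errors.

    observations: tuples (c0, c1, k_oh, c_bg), one per hydrocarbon, where
    c0 and c1 are the upstream and downstream concentrations.
    """
    log_post = []
    for oh in oh_grid:
        log_lik = 0.0
        for c0, c1, k_oh, c_bg in observations:
            predicted = evolve(c0, k_oh, oh, dilution, c_bg, t)
            sigma = rel_sigma * c1         # assumed 10% measurement uncertainty
            log_lik += -0.5 * ((c1 - predicted) / sigma) ** 2
        log_post.append(log_lik)
    peak = max(log_post)                   # subtract the peak for stability
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Synthetic, noise-free example (all values hypothetical).
oh_grid = [0.25e6 * i for i in range(1, 17)]   # molecules cm^-3
true_oh = oh_grid[3]                           # 1.0e6 molecules cm^-3
dilution = 0.1 / SECONDS_PER_DAY               # 0.1 day^-1 expressed in s^-1
t = 2.0 * SECONDS_PER_DAY                      # two days between observations
# (initial concentration, OH rate constant in cm^3 molecule^-1 s^-1, background)
species = [(1000.0, 2.5e-13, 500.0), (300.0, 1.1e-12, 50.0), (100.0, 2.4e-12, 10.0)]
observations = [(c0, evolve(c0, k, true_oh, dilution, bg, t), k, bg)
                for c0, k, bg in species]
posterior = oh_posterior(observations, oh_grid, dilution, t)
best_oh = oh_grid[posterior.index(max(posterior))]
print(f"posterior mode for [OH]: {best_oh:.2e} molecules cm^-3")
```

Using several hydrocarbons with different reactivities narrows the posterior because each species responds differently to the same [OH], which is the sharpening effect the abstract describes.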
Abstract:
An investigation into the phylogenetic variation of plant tolerance and the root and shoot uptake of organic contaminants was undertaken. The aim was to determine if particular families or genera were tolerant of, or accumulated, organic pollutants. Data were collected from sixty-nine studies. The variation between experiments was accounted for using a residual maximum likelihood analysis to approximate means for individual taxa. A nested ANOVA was subsequently used to determine differences at a number of differing phylogenetic levels. Significant differences were observed at a number of phylogenetic levels for the tolerance to TPH, the root concentration factor and the shoot concentration factor. There was no correlation between the uptake of organic pollutants and that of heavy metals. The data indicate that plant phylogeny is an important influence on both the plant tolerance and uptake of organic pollutants. If this study can be expanded, such information can be used when designing plantings for phytoremediation or risk reduction during the restoration of contaminated sites.
Abstract:
The Upper Jurassic-Lower Cretaceous dragonfly family Tarsophlebiidae is revised. The type species of the type genus Tarsophlebia Hagen, 1866, T. eximia (Hagen, 1862) from the Upper Jurassic Solnhofen Limestones, is redescribed, including important new information on its head, legs, wings, anal appendages and male secondary genital apparatus. The type specimen of Tarsophlebiopsis mayi Tillyard, 1923 is regarded as an aberrant or unusually preserved Tarsophlebia eximia. One new species of Tarsophlebia and three new species of Turanophlebia are described, i.e. Tarsophlebia minor n. sp., Turanophlebia anglicana n. sp., T. mongolica n. sp., and T. vitimensis n. sp. A new combination is proposed for Turanophlebia neckini (Martynov, 1927) n. comb. The phylogenetic relationships of the Mesozoic Tarsophlebiidae are discussed on the basis of new body and wing venation characters. The present analysis supports a rather derived position for the Tarsophlebiidae, as sister group of the Epiproctophora rather than of (Zygoptera + Epiproctophora). Also, through the present discussion, the Oligo-Miocene family Sieblosiidae seems to be more closely related to the Epiproctophora than to the Zygoptera. However, the present study and previous analyses suffer from a lack of information concerning the more inclusive groups of Odonatoptera, viz. Protozygoptera, Triadophlebiomorpha, Protanisoptera, etc. The significance of the tarsophlebiid secondary male genital apparatus for the reconstruction of the evolution of odonate copulation is discussed.
Abstract:
Systems Engineering often involves computer modelling of the behaviour of proposed systems and their components. Where a component is human, fallibility must be modelled by a stochastic agent. The identification of a model of decision-making over quantifiable options is investigated using the game-domain of Chess. Bayesian methods are used to infer the distribution of players’ skill levels from the moves they play rather than from their competitive results. The approach is used on large sets of games by players across a broad FIDE Elo range, and is in principle applicable to any scenario where high-value decisions are being made under pressure.
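One way to picture inferring skill from moves rather than results is the toy sketch below. It is not the authors' model. It assumes a hypothetical softmax-style stochastic agent whose probability of choosing an option falls off exponentially with that option's shortfall from the best available value, controlled by a single "sensitivity" parameter that stands in for skill. The option values, the grid and all function names are illustrative assumptions.

```python
import math

def move_probabilities(values, sensitivity):
    """Choice probabilities for a stochastic agent over quantified options.

    Higher sensitivity concentrates probability on the best-valued option.
    """
    best = max(values)
    weights = [math.exp(-sensitivity * (best - v)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def skill_posterior(decisions, grid):
    """Posterior over a grid of sensitivities, starting from a uniform prior.

    decisions: list of (option_values, index_of_chosen_option).
    """
    log_post = [math.log(1.0 / len(grid))] * len(grid)
    for i, sensitivity in enumerate(grid):
        for values, chosen in decisions:
            log_post[i] += math.log(move_probabilities(values, sensitivity)[chosen])
    peak = max(log_post)                     # subtract the peak for stability
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]

grid = [0.5, 1.0, 2.0, 4.0, 8.0]
position = [1.0, 0.5, -0.2]                  # hypothetical values of three moves
strong_player = [(position, 0)] * 10         # always plays the best option
weak_player = [(position, 2)] * 10           # always plays the worst option

strong_post = skill_posterior(strong_player, grid)
weak_post = skill_posterior(weak_player, grid)
print("strong player mode:", grid[strong_post.index(max(strong_post))])
print("weak player mode:", grid[weak_post.index(max(weak_post))])
```

Only the choices and the option values enter the calculation, so the estimate does not depend on whether the games were won or lost.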