900 results for Evolutionary trees
Abstract:
The reconstruction of multitaxon trees from molecular sequences is confounded by the variety of algorithms and criteria used to evaluate trees, making it difficult to compare the results of different analyses. A global method of multitaxon phylogenetic reconstruction described here, Bootstrappers Gambit, can be used with any four-taxon algorithm, including distance, maximum likelihood, and parsimony methods. It incorporates a Bayesian-Jeffreys'-bootstrap analysis to provide a uniform probability-based criterion for comparing the results from diverse algorithms. To examine the usefulness of the method, the origin of the eukaryotes has been investigated by the analysis of ribosomal small subunit RNA sequences. Three common algorithms (paralinear distances, Jukes-Cantor distances, and Kimura distances) support the eocyte topology, whereas one (maximum parsimony) supports the archaebacterial topology, suggesting that the eocyte prokaryotes are the closest prokaryotic relatives of the eukaryotes.
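Two of the distance corrections named in this abstract, Jukes-Cantor and Kimura (two-parameter), have standard closed forms. The following is a minimal sketch, assuming aligned DNA sequences and skipping any site that is not A, C, G, or T in both sequences. The function names are illustrative and are not taken from the paper; paralinear distances are not covered.

```python
import math

PURINES = {"A", "G"}


def site_proportions(seq1, seq2):
    """Return (P, Q): observed proportions of transitions and transversions."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    # Keep only sites where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    transitions = sum(1 for a, b in pairs
                      if a != b and (a in PURINES) == (b in PURINES))
    transversions = sum(1 for a, b in pairs
                        if a != b and (a in PURINES) != (b in PURINES))
    n = len(pairs)
    return transitions / n, transversions / n


def jukes_cantor(seq1, seq2):
    """Jukes-Cantor distance: d = -(3/4) ln(1 - 4p/3)."""
    P, Q = site_proportions(seq1, seq2)
    p = P + Q
    if p >= 0.75:
        raise ValueError("sequences too divergent for the correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p(seq1, seq2):
    """Kimura two-parameter distance:
    d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q)."""
    P, Q = site_proportions(seq1, seq2)
    a = 1 - 2 * P - Q
    b = 1 - 2 * Q
    if a <= 0 or b <= 0:
        raise ValueError("sequences too divergent for the correction")
    return -0.5 * math.log(a) - 0.25 * math.log(b)
```

Both functions return 0 for identical sequences and grow faster than the raw proportion of differing sites, which is the point of the correction: multiple substitutions at the same site are otherwise undercounted.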
Abstract:
Evolutionary trees are often estimated from DNA or RNA sequence data. How much confidence should we have in the estimated trees? In 1985, Felsenstein [Felsenstein, J. (1985) Evolution 39, 783–791] suggested the use of the bootstrap to answer this question. Felsenstein’s method, which in concept is a straightforward application of the bootstrap, is widely used, but has been criticized as biased in the genetics literature. This paper concerns the use of the bootstrap in the tree problem. We show that Felsenstein’s method is not biased, but that it can be corrected to better agree with standard ideas of confidence levels and hypothesis testing. These corrections can be made by using the more elaborate bootstrap method presented here, at the expense of considerably more computation.
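The basic bootstrap procedure attributed to Felsenstein above can be sketched as follows: resample alignment columns with replacement, re-estimate the tree on each replicate, and report how often each grouping recurs. This is a minimal sketch under stated assumptions, not the corrected method the abstract describes. It assumes exactly four taxa, uses uncorrected p-distances, and picks the topology whose two within-pair distances have the smallest sum (the four-point criterion). All names are hypothetical.

```python
import random
from collections import Counter


def p_distance(s1, s2):
    """Proportion of sites at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)


def best_quartet(seqs):
    """Pick one of the three unrooted four-taxon topologies.

    seqs maps four taxon names to aligned sequences. The chosen split
    ((w, x), (y, z)) minimises d(w, x) + d(y, z).
    """
    a, b, c, d = sorted(seqs)
    splits = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]

    def score(split):
        (w, x), (y, z) = split
        return p_distance(seqs[w], seqs[x]) + p_distance(seqs[y], seqs[z])

    return min(splits, key=score)


def bootstrap_support(seqs, replicates=200, seed=0):
    """Return {split: fraction of bootstrap replicates supporting it}."""
    rng = random.Random(seed)
    names = sorted(seqs)
    length = len(seqs[names[0]])
    counts = Counter()
    for _ in range(replicates):
        # Resample alignment columns with replacement.
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {n: "".join(seqs[n][i] for i in cols) for n in names}
        counts[best_quartet(resampled)] += 1
    return {split: k / replicates for split, k in counts.items()}
```

The returned fractions are the raw bootstrap proportions. The abstract's argument is that these proportions can be adjusted to agree better with conventional confidence levels, at extra computational cost; that adjustment is not attempted here.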
Abstract:
Study of the evolution of species or organisms is essential for various biological applications. Evolution is typically studied at the molecular level by analyzing the mutations of DNA sequences of organisms. Techniques have been developed for building phylogenetic or evolutionary trees for a set of sequences. Though phylogenetic trees capture the overall evolutionary relationships among the sequences, they do not reveal fine-level details of the evolution. In this work, we attempt to resolve various fine-level sequence transformation details associated with a phylogenetic tree using cellular automata. In particular, our work tries to determine the cellular automata rules for neighbor-dependent mutations of segments of DNA sequences. We also determine the number of time steps needed for evolution of a progeny from an ancestor and the unknown segments of the intermediate sequences in the phylogenetic tree. Because of the vast number of cellular automata rules, we have developed a grid system that performs parallel guided explorations of the rules on grid resources. We demonstrate our techniques by conducting experiments on a grid comprising machines in three countries and obtaining potentially useful statistics regarding evolution in three HIV sequences. In particular, our work is able to verify the phenomenon of neighbor-dependent mutations and find that certain combinations of neighbor-dependent mutations, defined by a cellular automata rule, occur with greater than 90% probability. We also find the average number of time steps for mutations for some branches of the phylogenetic tree over a large number of possible transformations, with standard deviations less than 2.
Abstract:
R. H. Whittaker's idea that plant diversity can be divided into a hierarchy of spatial components from alpha at the within-habitat scale through beta for the turnover of species between habitats to gamma along regional gradients implies the underlying existence of alpha, beta, and gamma niches. We explore the hypothesis that the evolution of alpha, beta, and gamma niches is also hierarchical, with traits that define the alpha niche being labile, while those defining beta and gamma niches are conservative. At the alpha level we find support for the hypothesis in the lack of close significant phylogenetic relationship between meadow species that have similar alpha niches. In a second test, alpha niche overlap based on a variety of traits is compared between congeners and noncongeners in several communities; here, too, there is no evidence of a correlation between alpha niche and phylogeny. To test whether beta and gamma niches evolve conservatively, we reconstructed the evolution of relevant traits on evolutionary trees for 14 different clades. Tests against null models revealed a number of instances, including some in island radiations, in which habitat (beta niche) and elevational maximum (an aspect of the gamma niche) showed evolutionary conservatism.
Abstract:
Patterns of substitution in chloroplast-encoded trnL-F regions were compared between species of Actaea (Ranunculales), Digitalis (Scrophulariales), Drosera (Caryophyllales), Panicoideae (Poales), and the small chromosome species clade of Pelargonium (Geraniales), each representing a different order of flowering plants, and Huperzia (Lycopodiales). In total, the study included 265 taxa, each with > 900-bp sequences, totaling 0.24 Mb. Both pairwise and phylogeny-based comparisons were used to assess nucleotide substitution patterns. In all six groups, we found that transition/transversion ratios, as estimated by maximum likelihood on most-parsimonious trees, ranged between 0.8 and 1.0 for ingroups. These values occurred both at low sequence divergences, where substitutional saturation (multiple substitutions having occurred at the same homologous nucleotide position) was not expected, and at higher levels of divergence. This suggests that the angiosperm trnL-F regions evolve in a pattern different from that generally observed for nuclear and animal mtDNA (transition/transversion ratio ≥ 2). Transition/transversion ratios in the intron and the spacer region differed in all alignments compared, yet base compositions between the regions were highly similar in all six groups.
Abstract:
The main scope of my PhD is the reconstruction of the large-scale bivalve phylogeny on the basis of four mitochondrial genes, with samples taken from all major groups of the class. To my knowledge, it is the first attempt of such breadth in Bivalvia. I decided to focus on both ribosomal and protein-coding DNA sequences (two ribosomal genes, 12S and 16S, and two protein-coding ones, cytochrome c oxidase I and cytochrome b), since both the literature and my preliminary results confirmed the importance of combined gene signals in resolving the evolutionary pathways of the group. Moreover, I wanted to propose a methodological pipeline that proved useful for obtaining robust results in bivalve phylogeny. To this end, best-performing taxon sampling and alignment strategies were tested, and several data partitioning and molecular evolution models were analyzed, thus demonstrating the importance of molding and implementing non-trivial evolutionary models. In line with a more rigorous approach to data analysis, I also proposed a new method to assess taxon sampling, by further developing Clarke and Warwick statistics: taxon sampling is a major concern in phylogenetic studies, and incomplete, biased, or improper taxon assemblies can lead to misleading results in reconstructing evolutionary trees. Theoretical methods are already available to optimize taxon choice in phylogenetic analyses, but most require some knowledge of the genetic relationships of the group of interest, or even a well-established phylogeny itself; these data are not always available in general phylogenetic applications. The method I proposed measures the "phylogenetic representativeness" of a given sample or set of samples, and it is based entirely on the pre-existing available taxonomy of the ingroup, which is commonly known to investigators. Moreover, it also accounts for instability and discordance in taxonomies. A Python-based script suite, called PhyRe, has been developed to implement all analyses.
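One of the Clarke and Warwick statistics, average taxonomic distinctness, illustrates how a sample can be scored from taxonomy alone: it is the mean taxonomic path length over all pairs of sampled species. The following is a minimal sketch and not PhyRe's code. It assumes each species is described by a lineage listed from highest to lowest rank, that every step between ranks has equal weight, and that path lengths are left unscaled; the function names and the placeholder taxa are hypothetical.

```python
from itertools import combinations


def path_weight(lineage_i, lineage_j):
    """Steps up the hierarchy from species level to the lowest shared rank.

    Lineages are tuples ordered from highest rank to lowest, for example
    (order, family, genus). Two species in the same genus score 1; species
    sharing only the family score 2; and so on.
    """
    depth = len(lineage_i)
    shared = 0
    for a, b in zip(lineage_i, lineage_j):
        if a != b:
            break
        shared += 1
    return depth - shared + 1


def average_taxonomic_distinctness(taxonomy):
    """Mean path weight over all pairs of species in the sample.

    taxonomy maps species name -> lineage tuple.
    """
    species = list(taxonomy)
    if len(species) < 2:
        raise ValueError("need at least two species")
    pairs = list(combinations(species, 2))
    total = sum(path_weight(taxonomy[i], taxonomy[j]) for i, j in pairs)
    return total / len(pairs)


# Placeholder taxonomy: four species in one order, two families, three genera.
sample = {
    "sp1": ("Order1", "Family1", "Genus1"),
    "sp2": ("Order1", "Family1", "Genus1"),
    "sp3": ("Order1", "Family1", "Genus2"),
    "sp4": ("Order1", "Family2", "Genus3"),
}
score = average_taxonomic_distinctness(sample)
```

A sample drawn from many families and genera scores higher than one concentrated in a single genus, which is the intuition behind using such a statistic to judge whether a taxon sample represents its group.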
Abstract:
Architecture 2000, originally published by Charles Jencks in 1971, is a book whose explicit aim is to predict the future of architecture up to the end of the 20th century. This has favored its being reissued in precisely that emblematic year 2000, with the original content largely preserved but with an added introduction and a final chapter that evaluate what was said thirty years earlier. In this new Architecture 2000, Jencks adds to the main text his comments on how accurate or mistaken his earlier predictions were, now judged in light of subsequent events, and updates his evolutionary map with a new version of his earlier Evolutionary Tree to the year 2000, now renamed Evolutionary Tree of the 20th century architecture. There is no doubt that the most influential and lasting part of Jencks's work has been the graphic representation of the evolution of architecture through his evolutionary trees. But although Architecture 2000 should be valued above all as an indispensable contribution to understanding the architecture of the second half of the 20th century, both for its identification of the tendencies along which practitioners' experiences have been channeled and for its dynamic conception of architecture, expressed graphically through its evolutionary trees, it would be a mistake to regard it as mere history and to write off the author's forecasting methods or his judgments and expectations about what has happened in architecture over recent decades.
Abstract:
In the maximum parsimony (MP) and minimum evolution (ME) methods of phylogenetic inference, evolutionary trees are constructed by searching for the topology that shows the minimum number of mutational changes required (M) and the smallest sum of branch lengths (S), respectively, whereas in the maximum likelihood (ML) method the topology showing the highest maximum likelihood (A) of observing a given data set is chosen. However, the theoretical basis of the optimization principle remains unclear. We therefore examined the relationships of M, S, and A for the MP, ME, and ML trees with those for the true tree by using computer simulation. The results show that M and S are generally greater for the true tree than for the MP and ME trees when the number of nucleotides examined (n) is relatively small, whereas A is generally lower for the true tree than for the ML tree. This finding indicates that the optimization principle tends to give incorrect topologies when n is small. To deal with this disturbing property of the optimization principle, we suggest that more attention should be given to testing the statistical reliability of an estimated tree rather than to finding the optimal tree with excessive efforts. When a reliability test is conducted, simplified MP, ME, and ML algorithms such as the neighbor-joining method generally give conclusions about phylogenetic inference very similar to those obtained by the more extensive tree search algorithms.
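The quantity M above, the minimum number of changes a given topology requires, can be computed per site with Fitch's algorithm. The following is a minimal sketch, assuming a rooted binary tree written as nested tuples of leaf names and an alignment with unordered, equally weighted character states. The names are illustrative; no tree search is included, only the scoring of a fixed topology.

```python
def fitch_site(tree, states):
    """Return (candidate state set, change count) for one alignment site.

    tree is either a leaf name (str) or a pair (left, right) of subtrees;
    states maps leaf name -> character at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_site(left, states)
    right_set, right_changes = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_changes + right_changes
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, alignment):
    """Sum the Fitch change counts over all sites (the quantity M)."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(length)
    )


alignment = {"A": "AAAG", "B": "AACG", "C": "CCAG", "D": "CCCG"}
m_ab_cd = parsimony_score((("A", "B"), ("C", "D")), alignment)
m_ac_bd = parsimony_score((("A", "C"), ("B", "D")), alignment)
```

In this toy alignment the grouping ((A,B),(C,D)) needs 4 changes and ((A,C),(B,D)) needs 5, so maximum parsimony would prefer the former. The abstract's caution applies: with so few sites, the topology with the smallest M need not be the true one.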
Abstract:
The genetic history of a group of populations is usually analyzed by reconstructing a tree of their origins. Reliability of the reconstruction depends on the validity of the hypothesis that genetic differentiation of the populations is mostly due to population fissions followed by independent evolution. If necessary, adjustment for major population admixtures can be made. Dating the fissions requires comparisons with paleoanthropological and paleontological dates, which are few and uncertain. A method of absolute genetic dating recently introduced uses mutation rates as molecular clocks; it was applied to human evolution using microsatellites, which have a sufficiently high mutation rate. Results are comparable with those of other methods and agree with a recent expansion of modern humans from Africa. An alternative method of analysis, useful when there is adequate geographic coverage of regions, is the geographic study of frequencies of alleles or haplotypes. As in the case of trees, it is necessary to summarize data from many loci for conclusions to be acceptable. Results must be independent from the loci used. Multivariate analyses like principal components or multidimensional scaling reveal a number of hidden patterns and evaluate their relative importance. Most patterns found in the analysis of human living populations are likely to be consequences of demographic expansions, determined by technological developments affecting food availability, transportation, or military power. During such expansions, both genes and languages are spread to potentially vast areas. In principle, this tends to create a correlation between the respective evolutionary trees. The correlation is usually positive and often remarkably high. It can be decreased or hidden by phenomena of language replacement and also of gene replacement, usually partial, due to gene flow.
Abstract:
In isolation and characterization studies, expression-level U1 and U2 snRNA isoforms were obtained from the silk gland (SG) of the fifth-instar larval stage. The DNA content of the SG cells is approximately 200,000-fold higher than that of the usual (2N) somatic cells of B. mori because of endoreduplication. In this study, the existence of U1 and U2 snRNA isoforms in the SG of the organism is investigated. Bombyx mori U1- and U2-specific RT-PCR libraries from the silk gland were generated. Five U1 and eight U2 isoforms were isolated and characterized. Nucleotide differences, structural alterations, and protein and RNA interaction sites were analyzed in these variants. The U1 snRNA variants were compared with the previously reported BmN isoforms. In all these U-snRNA variants, polymorphic sites do not predominate at the core of known functional sequences, which are conserved across species. Variant sites and inter-species differences are located in moderately conserved regions. Free energy (ΔG) values for the entire U1 and U2 snRNA secondary structures and for the individual stem/loop domains of the isoforms were generated and compared to determine their structural stability. This is the first time that U1 and U2 variants are shown to be specific to a developmental stage (larval) other than embryonic or adult. Using phylogenetic analysis, evolutionary trees were generated for the U1 and U2 snRNAs using animal, plant, protist, and fungal species. The resulting trees were bootstrapped for robustness and rooted with the self-splicing RNA group II intron sequence from the cyanobacterium Calothrix. Using phylogenetic analyses, possible structural and functional evolutionary interdependence between the U1 and U2 snRNAs was investigated.
Abstract:
Model trees are a particular case of decision trees employed to solve regression problems. They have the advantage of presenting an interpretable output, helping the end-user to get more confidence in the prediction and providing the basis for the end-user to have new insight about the data, confirming or rejecting hypotheses previously formed. Moreover, model trees present an acceptable level of predictive performance in comparison to most techniques used for solving regression problems. Since generating the optimal model tree is an NP-Complete problem, traditional model tree induction algorithms make use of a greedy top-down divide-and-conquer strategy, which may not converge to the global optimal solution. In this paper, we propose a novel algorithm based on the use of the evolutionary algorithms paradigm as an alternate heuristic to generate model trees in order to improve the convergence to globally near-optimal solutions. We call our new approach evolutionary model tree induction (E-Motion). We test its predictive performance using public UCI data sets, and we compare the results to traditional greedy regression/model trees induction algorithms, as well as to other evolutionary approaches. Results show that our method presents a good trade-off between predictive performance and model comprehensibility, which may be crucial in many machine learning applications. (C) 2010 Elsevier Inc. All rights reserved.
Abstract:
Finding the degree-constrained minimum spanning tree (DCMST) of a graph is a widely studied NP-hard problem. One of its most important applications is network design. Here we deal with a new variant of the DCMST problem, which consists of finding not only the degree- but also the role-constrained minimum spanning tree (DRCMST), i.e., we add constraints to restrict the role of the nodes in the tree to root, intermediate or leaf node. Furthermore, we do not limit the number of root nodes to one, thereby, generally, building a forest of DRCMSTs. The modeling of network design problems can benefit from the possibility of generating more than one tree and determining the role of the nodes in the network. We propose a novel permutation-based representation to encode these forests. In this new representation, one permutation simultaneously encodes all the trees to be built. We simulate a wide variety of DRCMST problems which we optimize using eight different evolutionary computation algorithms encoding individuals of the population using the proposed representation. The algorithms we use are: estimation of distribution algorithm, generational genetic algorithm, steady-state genetic algorithm, covariance matrix adaptation evolution strategy, differential evolution, elitist evolution strategy, non-elitist evolution strategy and particle swarm optimization. The best results are for the estimation of distribution algorithms and both types of genetic algorithms, although the genetic algorithms are significantly faster.