3 resultados para PHYLOGENETIC INFERENCE

em DRUM (Digital Repository at the University of Maryland)


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Causal inference with a continuous treatment is a relatively under-explored problem. In this dissertation, we adopt the potential outcomes framework. Potential outcomes are responses that would be seen for a unit under all possible treatments. In an observational study where the treatment is continuous, the potential outcomes are an uncountably infinite set indexed by treatment dose. We parameterize this unobservable set as a linear combination of a finite number of basis functions whose coefficients vary across units. This leads to new techniques for estimating the population average dose-response function (ADRF). Some techniques require a model for the treatment assignment given covariates, some require a model for predicting the potential outcomes from covariates, and some require both. We develop these techniques using a framework of estimating functions, compare them to existing methods for continuous treatments, and simulate their performance in a population where the ADRF is linear and the models for the treatment and/or outcomes may be misspecified. We also extend the comparisons to a data set of lottery winners in Massachusetts. Next, we describe the methods and functions in the R package causaldrf using data from the National Medical Expenditure Survey (NMES) and Infant Health and Development Program (IHDP) as examples. Additionally, we analyze the National Growth and Health Study (NGHS) data set and deal with the issue of missing data. Lastly, we discuss future research goals and possible extensions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dinoflagellates possess large genomes in which most genes are present in many copies. This has made studies of their genomic organization and phylogenetics challenging. Recent advances in sequencing technology have made deep sequencing of dinoflagellate transcriptomes feasible. This dissertation investigates the genomic organization of dinoflagellates to better understand the challenges of assembling dinoflagellate transcriptomic and genomic data from short read sequencing methods, and develops new techniques that utilize deep sequencing data to identify orthologous genes across a diverse set of taxa. To better understand the genomic organization of dinoflagellates, a genomic cosmid clone of the tandemly repeated gene Alchohol Dehydrogenase (AHD) was sequenced and analyzed. The organization of this clone was found to be counter to prevailing hypotheses of genomic organization in dinoflagellates. Further, a new non-canonical splicing motif was described that could greatly improve the automated modeling and annotation of genomic data. A custom phylogenetic marker discovery pipeline, incorporating methods that leverage the statistical power of large data sets was written. A case study on Stramenopiles was undertaken to test the utility in resolving relationships between known groups as well as the phylogenetic affinity of seven unknown taxa. The pipeline generated a set of 373 genes useful as phylogenetic markers that successfully resolved relationships among the major groups of Stramenopiles, and placed all unknown taxa on the tree with strong bootstrap support. This pipeline was then used to discover 668 genes useful as phylogenetic markers in dinoflagellates. Phylogenetic analysis of 58 dinoflagellates, using this set of markers, produced a phylogeny with good support of all branches. The Suessiales were found to be sister to the Peridinales. The Prorocentrales formed a monophyletic group with the Dinophysiales that was sister to the Gonyaulacales. The Gymnodinales was found to be paraphyletic, forming three monophyletic groups. While this pipeline was used to find phylogenetic markers, it will likely also be useful for finding orthologs of interest for other purposes, for the discovery of horizontally transferred genes, and for the separation of sequences in metagenomic data sets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

An increasing focus in evolutionary biology is on the interplay between mesoscale ecological and evolutionary processes such as population demographics, habitat tolerance, and especially geographic distribution, as potential drivers responsible for patterns of diversification and extinction over geologic time. However, few studies to date connect organismal processes such as survival and reproduction through mesoscale patterns to long-term macroevolutionary trends. In my dissertation, I investigate how mechanism of seed dispersal, mediated through geographic range size, influences diversification rates in the Rosales (Plantae: Anthophyta). In my first chapter, I validate the phylogenetic comparative methods that I use in my second and third chapters. Available state speciation and extinction (SSE) models assumptions about evolution known to be false through fossil data. I show, however, that as long as net diversification rates remain positive – a condition likely true for the Rosales – these violations of SSE’s assumptions do not cause significantly biased results. With SSE methods validated, my second chapter reconstructs three associations that appear to increase diversification rate for Rosalean genera: (1) herbaceous habit; (2) a three-way interaction combining animal dispersal, high within-genus species richness, and geographic range on multiple continents; (3) a four-way interaction combining woody habit with the other three characteristics of (2). I suggest that the three- and four-way interactions represent colonization ability and resulting extinction resistance in the face of late Cenozoic climate change; however, there are other possibilities as well that I hope to investigate in future research. My third chapter reconstructs the phylogeographic history of the Rosales using both non-fossil-assisted SSE methods as well as fossil-informed traditional phylogeographic analysis. Ancestral state reconstructions indicate that the Rosaceae diversified in North America while the other Rosalean families diversified elsewhere, possibly in Eurasia. SSE is able to successfully identify groups of genera that were likely to have been ancestrally widespread, but has poorer taxonomic resolution than methods that use fossil data. In conclusion, these chapters together suggest several potential causal links between organismal, mesoscale, and geologic scale processes, but further work will be needed to test the hypotheses that I raise here.