2 resultados para Molecular alignment
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
The main scope of my PhD is the reconstruction of the large-scale bivalve phylogeny on the basis of four mitochondrial genes, with samples taken from all major groups of the class. To my knowledge, it is the first attempt of such a breadth in Bivalvia. I decided to focus on both ribosomal and protein coding DNA sequences (two ribosomal encoding genes -12s and 16s -, and two protein coding ones - cytochrome c oxidase I and cytochrome b), since either bibliography and my preliminary results confirmed the importance of combined gene signals in improving evolutionary pathways of the group. Moreover, I wanted to propose a methodological pipeline that proved to be useful to obtain robust results in bivalves phylogeny. Actually, best-performing taxon sampling and alignment strategies were tested, and several data partitioning and molecular evolution models were analyzed, thus demonstrating the importance of molding and implementing non-trivial evolutionary models. In the line of a more rigorous approach to data analysis, I also proposed a new method to assess taxon sampling, by developing Clarke and Warwick statistics: taxon sampling is a major concern in phylogenetic studies, and incomplete, biased, or improper taxon assemblies can lead to misleading results in reconstructing evolutionary trees. Theoretical methods are already available to optimize taxon choice in phylogenetic analyses, but most involve some knowledge about genetic relationships of the group of interest, or even a well-established phylogeny itself; these data are not always available in general phylogenetic applications. The method I proposed measures the "phylogenetic representativeness" of a given sample or set of samples and it is based entirely on the pre-existing available taxonomy of the ingroup, which is commonly known to investigators. Moreover, it also accounts for instability and discordance in taxonomies. A Python-based script suite, called PhyRe, has been developed to implement all analyses.
Resumo:
Background: WGS is increasingly used as a first-line diagnostic test for patients with rare genetic diseases such as neurodevelopmental disorders (NDD). Clinical applications require a robust infrastructure to support processing, storage and analysis of WGS data. The identification and interpretation of SVs from WGS data also needs to be improved. Finally, there is a need for a prioritization system that enables downstream clinical analysis and facilitates data interpretation. Here, we present the results of a clinical application of WGS in a cohort of patients with NDD. Methods: We developed highly portable workflows for processing WGS data, including alignment, quality control, and variant calling of SNVs and SVs. A benchmark analysis of state-of-the-art SV detection tools was performed to select the most accurate combination for SV calling. A gene-based prioritization system was also implemented to support variant interpretation. Results: Using a benchmark analysis, we selected the most accurate combination of tools to improve SV detection from WGS data and build a dedicated pipeline. Our workflows were used to process WGS data from 77 NDD patient-parent families. The prioritization system supported downstream analysis and enabled molecular diagnosis in 32% of patients, 25% of which were SVs and suggested a potential diagnosis in 20% of patients, requiring further investigation to achieve diagnostic certainty. Conclusion: Our data suggest that the integration of SNVs and SVs is a main factor that increases diagnostic yield by WGS and show that the adoption of a dedicated pipeline improves the process of variant detection and interpretation.