994 resultados para statistical genetics


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Understanding the complexities that are involved in the genetics of multifactorial diseases is still a monumental task. In addition to environmental factors that can influence the risk of disease, there is also a number of other complicating factors. Genetic variants associated with age of disease onset may be different from those variants associated with overall risk of disease, and variants may be located in positions that are not consistent with the traditional protein coding genetic paradigm. Latent Variable Models are well suited for the analysis of genetic data. A latent variable is one that we do not directly observe, but which is believed to exist or is included for computational or analytic convenience in a model. This thesis presents a mixture of methodological developments utilising latent variables, and results from case studies in genetic epidemiology and comparative genomics. Epidemiological studies have identified a number of environmental risk factors for appendicitis, but the disease aetiology of this oft thought useless vestige remains largely a mystery. The effects of smoking on other gastrointestinal disorders are well documented, and in light of this, the thesis investigates the association between smoking and appendicitis through the use of latent variables. By utilising data from a large Australian twin study questionnaire as both cohort and case-control, evidence is found for the association between tobacco smoking and appendicitis. Twin and family studies have also found evidence for the role of heredity in the risk of appendicitis. Results from previous studies are extended here to estimate the heritability of age-at-onset and account for the eect of smoking. This thesis presents a novel approach for performing a genome-wide variance components linkage analysis on transformed residuals from a Cox regression. This method finds evidence for a dierent subset of genes responsible for variation in age at onset than those associated with overall risk of appendicitis. Motivated by increasing evidence of functional activity in regions of the genome once thought of as evolutionary graveyards, this thesis develops a generalisation to the Bayesian multiple changepoint model on aligned DNA sequences for more than two species. This sensitive technique is applied to evaluating the distributions of evolutionary rates, with the finding that they are much more complex than previously apparent. We show strong evidence for at least 9 well-resolved evolutionary rate classes in an alignment of four Drosophila species and at least 7 classes in an alignment of four mammals, including human. A pattern of enrichment and depletion of genic regions in the profiled segments suggests they are functionally significant, and most likely consist of various functional classes. Furthermore, a method of incorporating alignment characteristics representative of function such as GC content and type of mutation into the segmentation model is developed within this thesis. Evidence of fine-structured segmental variation is presented.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This thesis develops and evaluates statistical methods for different types of genetic analyses, including quantitative trait loci (QTL) analysis, genome-wide association study (GWAS), and genomic evaluation. The main contribution of the thesis is to provide novel insights in modeling genetic variance, especially via random effects models. In variance component QTL analysis, a full likelihood model accounting for uncertainty in the identity-by-descent (IBD) matrix was developed. It was found to be able to correctly adjust the bias in genetic variance component estimation and gain power in QTL mapping in terms of precision.  Double hierarchical generalized linear models, and a non-iterative simplified version, were implemented and applied to fit data of an entire genome. These whole genome models were shown to have good performance in both QTL mapping and genomic prediction. A re-analysis of a publicly available GWAS data set identified significant loci in Arabidopsis that control phenotypic variance instead of mean, which validated the idea of variance-controlling genes.  The works in the thesis are accompanied by R packages available online, including a general statistical tool for fitting random effects models (hglm), an efficient generalized ridge regression for high-dimensional data (bigRR), a double-layer mixed model for genomic data analysis (iQTL), a stochastic IBD matrix calculator (MCIBD), a computational interface for QTL mapping (qtl.outbred), and a GWAS analysis tool for mapping variance-controlling loci (vGWAS).

Relevância:

70.00% 70.00%

Publicador:

Relevância:

70.00% 70.00%

Publicador:

Relevância:

70.00% 70.00%

Publicador:

Resumo:

As the development of genotyping and next-generation sequencing technologies, multi-marker testing in genome-wide association study and rare variant association study became active research areas in statistical genetics. This dissertation contains three methodologies for association study by exploring different genetic data features and demonstrates how to use those methods to test genetic association hypothesis. The methods can be categorized into in three scenarios: 1) multi-marker testing for strong Linkage Disequilibrium regions, 2) multi-marker testing for family-based association studies, 3) multi-marker testing for rare variant association study. I also discussed the advantage of using these methods and demonstrated its power by simulation studies and applications to real genetic data.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

2000 Mathematics Subject Classi cation: 62N01, 62N05, 62P10, 92D10, 92D30.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A new deterministic method for predicting simultaneous inbreeding coefficients at three and four loci is presented. The method involves calculating the conditional probability of IBD (identical by descent) at one locus given IBD at other loci, and multiplying this probability by the prior probability of the latter loci being simultaneously IBD. The conditional probability is obtained applying a novel regression model, and the prior probability from the theory of digenic measures of Weir and Cockerham. The model was validated for a finite monoecious population mating at random, with a constant effective population size, and with or without selfing, and also for an infinite population with a constant intermediate proportion of selfing. We assumed discrete generations. Deterministic predictions were very accurate when compared with simulation results, and robust to alternative forms of implementation. These simultaneous inbreeding coefficients were more sensitive to changes in effective population size than in marker spacing. Extensions to predict simultaneous inbreeding coefficients at more than four loci are now possible.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Ce mémoire analyse l’espérance du temps de fixation conditionnellement à ce qu’elle se produise et la probabilité de fixation d’un nouvel allèle mutant dans des populations soumises à différents phénomènes biologiques en uti- lisant l’approche des processus ancestraux. Tout d’abord, l’article de Tajima (1990) est analysé et les différentes preuves y étant manquantes ou incomplètes sont détaillées, dans le but de se familiariser avec les calculs du temps de fixa- tion. L’étude de cet article permet aussi de démontrer l’importance du temps de fixation sur certains phénomènes biologiques. Par la suite, l’effet de la sé- lection naturelle est introduit au modèle. L’article de Mano (2009) cite un ré- sultat intéressant quant à l’espérance du temps de fixation conditionnellement à ce que celle-ci survienne qui utilise une approximation par un processus de diffusion. Une nouvelle méthode utilisant le processus ancestral est présentée afin d’arriver à une bonne approximation de ce résultat. Des simulations sont faites afin de vérifier l’exactitude de la nouvelle approche. Finalement, un mo- dèle soumis à la conversion génique est analysé, puisque ce phénomène, en présence de biais, a un effet similaire à celui de la sélection. Nous obtenons finalement un résultat analytique pour la probabilité de fixation d’un nouveau mutant dans la population. Enfin, des simulations sont faites afin de détermi- nerlaprobabilitédefixationainsiqueletempsdefixationconditionnellorsque les taux sont trop grands pour pouvoir les calculer analytiquement.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

La butirilcolinesterasa humana (BChE; EC 3.1.1.8) es una enzima polimórfica sintetizada en el hígado y en el tejido adiposo, ampliamente distribuida en el organismo y encargada de hidrolizar algunos ésteres de colina como la procaína, ésteres alifáticos como el ácido acetilsalicílico, fármacos como la metilprednisolona, el mivacurium y la succinilcolina y drogas de uso y/o abuso como la heroína y la cocaína. Es codificada por el gen BCHE (OMIM 177400), habiéndose identificado más de 100 variantes, algunas no estudiadas plenamente, además de la forma más frecuente, llamada usual o silvestre. Diferentes polimorfismos del gen BCHE se han relacionado con la síntesis de enzimas con niveles variados de actividad catalítica. Las bases moleculares de algunas de esas variantes genéticas han sido reportadas, entre las que se encuentra las variantes Atípica (A), fluoruro-resistente del tipo 1 y 2 (F-1 y F-2), silente (S), Kalow (K), James (J) y Hammersmith (H). En este estudio, en un grupo de pacientes se aplicó el instrumento validado Lifetime Severity Index for Cocaine Use Disorder (LSI-C) para evaluar la gravedad del consumo de “cocaína” a lo largo de la vida. Además, se determinaron Polimorfismos de Nucleótido Simple (SNPs) en el gen BCHE conocidos como responsables de reacciones adversas en pacientes consumidores de “cocaína” mediante secuenciación del gen y se predijo el efecto delos SNPs sobre la función y la estructura de la proteína, mediante el uso de herramientas bio-informáticas. El instrumento LSI-C ofreció resultados en cuatro dimensiones: consumo a lo largo de la vida, consumo reciente, dependencia psicológica e intento de abandono del consumo. Los estudios de análisis molecular permitieron observar dos SNPs codificantes (cSNPs) no sinónimos en el 27.3% de la muestra, c.293A>G (p.Asp98Gly) y c.1699G>A (p.Ala567Thr), localizados en los exones 2 y 4, que corresponden, desde el punto de vista funcional, a la variante Atípica (A) [dbSNP: rs1799807] y a la variante Kalow (K) [dbSNP: rs1803274] de la enzima BChE, respectivamente. Los estudios de predicción In silico establecieron para el SNP p.Asp98Gly un carácter patogénico, mientras que para el SNP p.Ala567Thr, mostraron un comportamiento neutro. El análisis de los resultados permite proponer la existencia de una relación entre polimorfismos o variantes genéticas responsables de una baja actividad catalítica y/o baja concentración plasmática de la enzima BChE y algunas de las reacciones adversas ocurridas en pacientes consumidores de cocaína.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The study of the association between two random variables that have a joint normal distribution is of interest in applied statistics; for example, in statistical genetics. This article, targeted to applied statisticians, addresses inferences about the coefficient of correlation (ρ) in the bivariate normal and standard bivariate normal distributions using likelihood, frequentist, and Baycsian perspectives. Some results are surprising. For instance, the maximum likelihood estimator and the posterior distribution of ρ in the standard bivariate normal distribution do not follow directly from results for a general bivariate normal distribution. An example employing bootstrap and rejection sampling procedures is used to illustrate some of the peculiarities.

Relevância:

60.00% 60.00%

Publicador:

Relevância:

30.00% 30.00%

Publicador:

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Genetic research of complex diseases is a challenging, but exciting, area of research. The early development of the research was limited, however, until the completion of the Human Genome and HapMap projects, along with the reduction in the cost of genotyping, which paves the way for understanding the genetic composition of complex diseases. In this thesis, we focus on the statistical methods for two aspects of genetic research: phenotype definition for diseases with complex etiology and methods for identifying potentially associated Single Nucleotide Polymorphisms (SNPs) and SNP-SNP interactions. With regard to phenotype definition for diseases with complex etiology, we firstly investigated the effects of different statistical phenotyping approaches on the subsequent analysis. In light of the findings, and the difficulties in validating the estimated phenotype, we proposed two different methods for reconciling phenotypes of different models using Bayesian model averaging as a coherent mechanism for accounting for model uncertainty. In the second part of the thesis, the focus is turned to the methods for identifying associated SNPs and SNP interactions. We review the use of Bayesian logistic regression with variable selection for SNP identification and extended the model for detecting the interaction effects for population based case-control studies. In this part of study, we also develop a machine learning algorithm to cope with the large scale data analysis, namely modified Logic Regression with Genetic Program (MLR-GEP), which is then compared with the Bayesian model, Random Forests and other variants of logic regression.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Statistical methodology was applied to a survey of time-course incidence of four viruses (alfalfa mosaic virus, clover yellow vein virus, subterranean clover mottle virus and subterranean clover red leaf virus) in improved pastures in southern regions of Australia. -from Authors

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Posttraumatic stress disorder (PTSD) is a complex syndrome that occurs following exposure to a potentially life threatening traumatic event. This review summarises the literature on the genetics of PTSD including gene–environment interactions (GxE), epigenetics and genetics of treatment response. Numerous genes have been shown to be associated with PTSD using candidate gene approaches. Genome-wide association studies have been limited due to the large sample size required to reach statistical power. Studies have shown that GxE interactions are important for PTSD susceptibility. Epigenetics plays an important role in PTSD susceptibility and some of the most promising studies show stress and child abuse trigger epigenetic changes. Much of the molecular genetics of PTSD remains to be elucidated. However, it is clear that identifying genetic markers and environmental triggers has the potential to advance early PTSD diagnosis and therapeutic interventions and ultimately ease the personal and financial burden of this debilitating disorder.