947 results for genetics, statistical genetics, variable models


Relevance: 70.00%

Publisher:

Abstract:

This thesis develops and evaluates statistical methods for different types of genetic analyses, including quantitative trait loci (QTL) analysis, genome-wide association studies (GWAS), and genomic evaluation. The main contribution of the thesis is to provide novel insights into modeling genetic variance, especially via random effects models. In variance component QTL analysis, a full likelihood model accounting for uncertainty in the identity-by-descent (IBD) matrix was developed. It was found to correct the bias in genetic variance component estimation and to gain power in QTL mapping in terms of precision. Double hierarchical generalized linear models, and a non-iterative simplified version, were implemented and applied to fit data from an entire genome. These whole-genome models were shown to perform well in both QTL mapping and genomic prediction. A re-analysis of a publicly available GWAS data set identified significant loci in Arabidopsis that control phenotypic variance instead of the mean, which validated the idea of variance-controlling genes. The work in the thesis is accompanied by R packages available online, including a general statistical tool for fitting random effects models (hglm), an efficient generalized ridge regression for high-dimensional data (bigRR), a double-layer mixed model for genomic data analysis (iQTL), a stochastic IBD matrix calculator (MCIBD), a computational interface for QTL mapping (qtl.outbred), and a GWAS analysis tool for mapping variance-controlling loci (vGWAS).

Relevance: 70.00%

Publisher:

Abstract:

Variable number of tandem repeats (VNTR) are genetic loci at which short sequence motifs are found repeated different numbers of times among chromosomes. To explore the potential utility of VNTR loci in evolutionary studies, I have conducted a series of studies to address the following questions: (1) What are the population genetic properties of these loci? (2) What are the mutational mechanisms of repeat number change at these loci? (3) Can DNA profiles be used to measure the relatedness between a pair of individuals? (4) Can DNA fingerprints be used to measure the relatedness between populations in evolutionary studies? (5) Can microsatellite and short tandem repeat (STR) loci, which mutate in a stepwise fashion, be used in evolutionary analyses? A large number of VNTR loci typed in many populations were studied by means of recently developed statistical methods. The results of this work indicate that there is no significant departure from Hardy-Weinberg expectation (HWE) at VNTR loci in most of the human populations examined, and that the departures from HWE at some VNTR loci are not solely caused by the presence of population sub-structure. A statistical procedure is developed to investigate the mutational mechanisms of VNTR loci by studying the allele frequency distributions of these loci. Comparisons of frequency distribution data on several hundred VNTR loci with the predictions of two mutation models demonstrated that there are differences among VNTR loci grouped by repeat unit size. By extending the ITO method, I derived the distribution of the number of shared bands between individuals with any kinship relationship. A maximum likelihood estimation procedure is proposed to estimate the relatedness between individuals from the observed number of shared bands between them. It was believed that classical measures of genetic distance are not applicable to the analysis of DNA fingerprints, which reveal many minisatellite loci simultaneously in the genome, because the information regarding underlying alleles and loci is not available. I proposed a new measure of genetic distance based on band sharing between individuals that is applicable to DNA fingerprint data. To address the concern that microsatellite and STR loci may not be useful for evolutionary studies because of the convergent nature of their mutation mechanisms, I show, by a theoretical study as well as by computer simulation, that the possible bias caused by convergent mutations can be corrected, and I suggest a novel measure of genetic distance that makes the correction. In summary, I conclude that hypervariable VNTR loci are useful in evolutionary studies of closely related populations or species, especially in the study of human evolution and the history of geographic dispersal of Homo sapiens. (Abstract shortened by UMI.)
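
To illustrate what a check against Hardy-Weinberg expectation at a multiallelic locus involves, the following Python sketch computes a Pearson chi-square statistic from observed genotypes. The abstract does not say which test the thesis used, so this is only one common choice; the function name hwe_chi_square and the encoding of alleles as repeat counts are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hwe_chi_square(genotypes):
    """Pearson chi-square statistic for departure from Hardy-Weinberg expectation.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Alleles are hypothetical repeat counts here, but any sortable label works.
    """
    n = len(genotypes)
    # Allele frequencies estimated from the sample (2 alleles per individual).
    allele_counts = Counter(allele for g in genotypes for allele in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    # Observed genotype counts, ignoring allele order within a genotype.
    observed = Counter(tuple(sorted(g)) for g in genotypes)

    chi2 = 0.0
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        # HWE expectation: p^2 for homozygotes, 2pq for heterozygotes.
        prob = freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
        expected = n * prob
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected
    return chi2


# Four individuals, all heterozygous for alleles with 10 and 12 repeats.
print(hwe_chi_square([(10, 12)] * 4))
```

For k alleles the statistic has k(k-1)/2 degrees of freedom. With the many rare alleles typical of VNTR loci, expected counts are often small, so exact or permutation tests are usually preferred over the chi-square approximation.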

Relevance: 60.00%

Publisher:

Abstract:

Red cell number and size increase during puberty, particularly in males. The aim of the present study was to determine whether expression of genes affecting red cell indices varied with age and sex. Haemoglobin, red cell count, and mean cellular volume were measured longitudinally on 578 pairs of twins at twelve, fourteen and sixteen years of age. Data were analysed using a structural equation modeling approach, in which a variety of univariate and longitudinal simplex models were fitted to the data. Significant heritability was demonstrated for all variables across all ages. The genes involved did not differ between the sexes, although there was evidence for sex limitation in the case of haemoglobin at age twelve. Longitudinal analyses indicated that new genes affecting red cell indices were expressed at different stages of puberty. Some of these genes affected the different red cell indices pleiotropically, while others had effects specific to one variable only.

Relevance: 60.00%

Publisher:

Abstract:

A workshop recently held at the École Polytechnique Fédérale de Lausanne (EPFL, Switzerland) was dedicated to understanding the genetic basis of adaptive change, taking stock of the different approaches developed in theoretical population genetics and landscape genomics and bringing together knowledge accumulated in both research fields. An important challenge in theoretical population genetics is to incorporate the effects of demographic history and population structure. But important design problems (e.g. focus on populations as units, focus on hard selective sweeps, no hypothesis-based framework in the design of the statistical tests) reduce the capability of these approaches to detect adaptive genetic variation. In parallel, landscape genomics offers a solution to several of these problems and provides a number of advantages (e.g. fast computation, landscape heterogeneity integration). But the approach makes several implicit assumptions that should be carefully considered (e.g. selection has had enough time to create a functional relationship between the allele distribution and the environmental variable, or this functional relationship is assumed to be constant). To address the respective strengths and weaknesses mentioned above, the workshop brought together a panel of experts from both disciplines to present their work and discuss the relevance of combining these approaches, possibly resulting in a joint software solution in the future.

Relevance: 60.00%

Publisher:

Abstract:

Complex genetic models and segregation analysis were applied to family data obtained in a hyperendemic goiter area in Brazil. The single locus and Falconer's models did not fit the data. Edwards' model converged, but statistical concordance was not obtained. Although the genetic load model explains the family data statistically, it would be hard to imagine that endemic goiter could be explained by a model in which synergism among genetic and environmental factors is not assumed.

Relevance: 60.00%

Publisher:

Abstract:

The westward spread of raccoon rabies in Alabama has been slow and even appears to regress eastward periodically. While the disease has been present in the state for over 30 years, areas in northwest Alabama are devoid of raccoon rabies. This variation, which results in an enzootic area of raccoon rabies primarily in southeastern Alabama, may be due to landscape features that hinder the movement of raccoons (i.e., gene flow) among different locations. We used 11 raccoon-specific microsatellite markers to obtain individual genotypes to examine gene flow among areas that were rabies free, enzootic with rabies, or had only sporadic reports of the disease. Samples from 70 individuals were collected from 5 sampling localities in 3 counties. The landscape feature data were collected from geographic information system (GIS) data. We inferred gene flow by estimating FST and by using Bayesian tests to identify genetic clusters. Estimates of pairwise FST indicated genetic differentiation and restricted gene flow between some sites, and an uneven distribution of genetic clusters was observed. Of the landscape features examined (i.e., land cover, elevation, slope, roads, and hydrology), only land cover had an association with genetic differentiation, suggesting this landscape variable may affect gene flow among raccoon populations and thus the spread of the raccoon variant of rabies in Alabama.
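
As a rough illustration of the quantity being estimated, the following Python sketch computes FST for a single locus from subpopulation allele frequencies, using the textbook definition FST = (HT - HS) / HT with equal weighting of subpopulations. The abstract does not state which estimator the study used, and the sketch applies no sample-size correction; the function names and the frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def fst(subpop_freqs):
    """FST = (H_T - H_S) / H_T for one locus.

    subpop_freqs: one list of allele frequencies per subpopulation,
    all in the same allele order. Subpopulations are weighted equally,
    which is a simplifying assumption of this sketch.
    """
    k = len(subpop_freqs)
    # H_S: mean within-subpopulation expected heterozygosity.
    h_s = sum(expected_heterozygosity(f) for f in subpop_freqs) / k
    # H_T: expected heterozygosity of the pooled (mean) allele frequencies.
    mean_freqs = [sum(column) / k for column in zip(*subpop_freqs)]
    h_t = expected_heterozygosity(mean_freqs)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


# Two hypothetical sampling sites typed at one three-allele microsatellite.
site_a = [0.6, 0.3, 0.1]
site_b = [0.2, 0.3, 0.5]
print(fst([site_a, site_b]))
```

Values near 0 indicate little differentiation between sites and values near 1 indicate nearly fixed differences. In practice, pairwise estimates are averaged over loci and corrected for sample size.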

Relevance: 60.00%

Publisher:

Abstract:

Bacillus anthracis, the etiological agent of anthrax, manifests a particular bimodal lifestyle. This bacterial species alternates between short replication phases of 20-40 generations that strictly require infection of the host, normally causing death, interrupted by relatively long, mostly dormant phases as spores in the environment. Hence, the B. anthracis genome is highly homogeneous. This feature and the fact that strains from nearly all parts of the world have been analysed for canonical single nucleotide polymorphisms (canSNPs) and variable number tandem repeats (VNTRs) have allowed the development of molecular epidemiological and molecular clock models to estimate the age of major diversifications in the evolution of B. anthracis and to trace the global spread of this pathogen, which was mostly promoted by movement of domestic cattle with settlers and by international trade of contaminated animal products. From a taxonomic and phylogenetic point of view, B. anthracis is a member of the Bacillus cereus group. The differentiation of B. anthracis from B. cereus sensu stricto, solely based on chromosomal markers, is difficult. However, differences in pathogenicity clearly differentiate B. anthracis from B. cereus and are marked by the strict presence of virulence genes located on the two virulence plasmids pXO1 and pXO2, both of which are required by the bacterium to cause anthrax. Conversely, anthrax-like symptoms can also be caused by organisms with chromosomal features that are more closely related to B. cereus, but which carry these virulence genes on two plasmids that largely resemble the B. anthracis virulence plasmids. (C) 2011 Elsevier B.V. All rights reserved.

Relevance: 60.00%

Publisher:

Abstract:

Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to assess the causal effects of many genetic and environmental factors more accurately. Genome-wide association studies have been able to localize many causal genetic variants predisposing to certain diseases. However, these studies explain only a small portion of the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in genetics/genomics research and have demonstrated superiority over some standard approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully exert their advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) deriving the Bayesian variable selection framework for hierarchical gene-environment and gene-gene interactions; (2) developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods, which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing of historical data.
We propose a Bayesian hierarchical mixture model framework that allows us to investigate genetic and environmental effects, gene-by-gene interactions (epistasis) and gene-by-environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both the 'strong hierarchical' and 'weak hierarchical' models, which specify that both or one of the main effects of the interacting factors must be present for the interaction to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identifying the predisposing main effects and interactions in studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model, which does not impose this hierarchical constraint, and observe their superior performance in most of the situations considered. The proposed models are applied to real data on gene and environment interactions from lung cancer and cutaneous melanoma case-control studies. The Bayesian statistical models have the advantage of being able to incorporate useful prior information in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in terms of parameter estimation and variable selection in most cases.
Our proposed models impose hierarchical constraints that further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while successfully identifying the reported associations. This is practically appealing for studies that investigate causal factors among a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects have previously been developed to provide an analysis framework in which the estimates of effects for a quantitative trait are statistically orthogonal regardless of the existence of Hardy-Weinberg Equilibrium (HWE) within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and have shown the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects, with higher marginal posterior probabilities. We also review two Bayesian statistical models (the Bayesian empirical shrinkage-type estimator and Bayesian model averaging), which were developed for gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that are able to handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for gene-environment interactions in that they balance statistical efficiency and bias in a unified model.
Through extensive simulation studies, we compare the operating characteristics of the proposed models with those of existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
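
The strong and weak hierarchy rules described in this abstract can be sketched as a simple inclusion check. The Python code below is a minimal illustration of the constraint only, not the dissertation's Bayesian mixture model; the function name interaction_allowed and the factor labels are hypothetical.

```python
def interaction_allowed(main_a_included, main_b_included, rule):
    """Decide whether an A x B interaction may enter the model.

    rule is one of:
      "strong"      - both main effects must be in the model
      "weak"        - at least one main effect must be in the model
      "independent" - no hierarchical constraint
    """
    if rule == "strong":
        return main_a_included and main_b_included
    if rule == "weak":
        return main_a_included or main_b_included
    if rule == "independent":
        return True
    raise ValueError(f"unknown rule: {rule!r}")


# Hypothetical example: a gene's main effect is selected, smoking's is not.
included = {"gene": True, "smoking": False}
for rule in ("strong", "weak", "independent"):
    print(rule, interaction_allowed(included["gene"], included["smoking"], rule))
```

In a variable selection setting, a check of this kind restricts which interaction indicators can be switched on given the current main-effect indicators, which is how the constraint shrinks the space of candidate models.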

Relevance: 50.00%

Publisher:

Abstract:

There is overwhelming evidence for the existence of substantial genetic influences on individual differences in general and specific cognitive abilities, especially in adults. The actual localization and identification of genes underlying variation in cognitive abilities and intelligence has only just started, however. Successes are currently limited to neurological mutations with rather severe cognitive effects. The current approaches to trace genes responsible for variation in the normal ranges of cognitive ability consist of large-scale linkage and association studies. These are hampered by the usual problems of low statistical power to detect quantitative trait loci (QTLs) of small effect. One strategy to boost the power of genomic searches is to employ endophenotypes of cognition derived from the booming field of cognitive neuroscience. This special issue of Behavior Genetics reports on one of the first genome-wide association studies for general IQ. A second paper summarizes candidate genes for cognition, based on animal studies. A series of papers then introduces two additional levels of analysis in the "black box" between genes and cognitive ability: (1) behavioral measures of information-processing speed (inspection time, reaction time, rapid naming) and working memory capacity (performance on single or dual tasks of verbal and spatio-visual working memory), and (2) electrophysiologically derived measures of brain function (e.g., event-related potentials). The obvious way to assess the reliability and validity of these endophenotypes and their usefulness in the search for cognitive ability genes is through the examination of their genetic architecture in twin family studies. Papers in this special issue show that much of the association between intelligence and speed-of-information processing/brain function is due to a common gene or set of genes, and thereby demonstrate the usefulness of considering these measures in gene-hunting studies for IQ.

Relevance: 50.00%

Publisher:

Abstract:

1. Schizophrenia is a chronic, disabling brain disease that affects approximately 1% of the world's population. It is characterized by delusions, hallucinations and formal thought disorder, together with a decline in socio-occupational functioning. While the causes of schizophrenia remain unknown, evidence from family, twin and adoption studies clearly demonstrates that it aggregates in families, with this clustering largely attributable to genetic rather than cultural or environmental factors. Identifying the genes involved, however, has proven to be a difficult task because schizophrenia is a complex trait characterized by an imprecise phenotype, the existence of phenocopies and the presence of low disease penetrance.
2. The current working hypothesis for schizophrenia causation is that multiple genes of small to moderate effect confer compounding risk through interactions with each other and with non-genetic risk factors. The same genes may be commonly involved in conferring risk across populations or they may vary in number and strength between different populations. To search for evidence of such genetic loci, both candidate gene and genome-wide linkage studies have been used in clinical cohorts collected from a variety of populations. Collectively, these works provide some evidence for the involvement of a number of specific genes (e.g. the 5-hydroxytryptamine (5-HT) type 2a receptor (5-HT2a) gene and the dopamine D-3 receptor gene) and as yet unidentified factors localized to specific chromosomal regions, including 6p, 6q, 8p, 13q and 22q. These data provide suggestive, but no conclusive, evidence for causative genes.
3. To enable further progress there is a need to: (i) collect fine-grained clinical datasets while searching the schizophrenia phenotype for subgroups or dimensions that may provide a more direct route to causative genes; and (ii) integrate recent refinements in molecular genetic technology, including modern composite marker maps, DNA expression assays and relevant animal models, while using the latest analytical techniques to extract maximum information in order to help distinguish a true result from a false-positive finding.

Relevance: 50.00%

Publisher:

Abstract:

Biometrical genetics is the science concerned with the inheritance of quantitative traits. In this review we discuss how the analytical methods of biometrical genetics are based upon simple Mendelian principles. We demonstrate how the phenotypic covariance between related individuals provides information on the relative importance of genetic and environmental factors influencing that trait, and how factors such as assortative mating, gene-environment correlation and genotype-environment interaction complicate such interpretations. Twin and adoption studies are discussed as well as their assumptions and limitations. Structural equation modeling (SEM) is introduced and we illustrate how this approach may be applied to genetic problems. In particular, we show how SEM can be used to address complicated issues such as analyzing the causes of correlation between traits or determining the direction of causation (DOC) between variables. (C) 2002 Elsevier Science B.V. All rights reserved.
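
To make concrete how the covariance between relatives carries information about genetic and environmental influences, the sketch below applies Falconer's classical formulas to twin correlations. This is a simplified stand-in for the structural equation models the review describes, not the review's own method. It assumes additive genetic effects, equal shared environments for identical (MZ) and fraternal (DZ) twins, and no assortative mating; the function name and the example correlations are hypothetical.

```python
def falconer_ace(r_mz, r_dz):
    """Rough variance decomposition from twin correlations (Falconer's formulas).

    r_mz: trait correlation between identical (MZ) twins
    r_dz: trait correlation between fraternal (DZ) twins
    Returns (a2, c2, e2): additive genetic, shared environmental and
    non-shared environmental proportions of variance.
    """
    a2 = 2 * (r_mz - r_dz)  # MZ twins share twice the additive genetic variance of DZ twins
    c2 = 2 * r_dz - r_mz    # the part of the MZ resemblance not explained by genes
    e2 = 1 - r_mz           # whatever makes identical twins differ
    return a2, c2, e2


# Hypothetical correlations for a quantitative trait.
a2, c2, e2 = falconer_ace(r_mz=0.8, r_dz=0.5)
print(f"a2={a2:.2f} c2={c2:.2f} e2={e2:.2f}")
```

A negative c2 from these formulas signals that the assumptions are violated, for example by genetic dominance. Fitting explicit models to the full covariance matrices, as SEM does, handles such cases more gracefully.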

Relevance: 50.00%

Publisher:

Abstract:

Genetic research on risk of alcohol, tobacco or drug dependence must make allowance for the partial overlap of risk factors for initiation of use and risk factors for dependence or other outcomes in users. Except in the extreme cases where genetic and environmental risk factors for initiation and dependence overlap completely or are uncorrelated, there is no consensus about how best to estimate the magnitude of genetic or environmental correlations between Initiation and Dependence in twin and family data. We explore by computer simulation the biases in estimates of genetic and environmental parameters caused by model misspecification when Initiation can only be defined as a binary variable. For plausible simulated parameter values, the two-stage genetic models that we consider yield estimates of genetic and environmental variances for Dependence that, although biased, are not very discrepant from the true values. However, estimates of genetic (or environmental) correlations between Initiation and Dependence may be seriously biased, and may differ markedly under different two-stage models. Such estimates may have little credibility unless external data favor selection of one particular model. These problems can be avoided if Initiation can be assessed as a multiple-category variable (e.g. never versus early-onset versus later-onset user), with at least two categories measurable in users at risk for dependence. Under these conditions, and under certain distributional assumptions, recovery of simulated genetic and environmental correlations becomes possible. Illustrative application of the model to Australian twin data on smoking confirmed substantial heritability of smoking persistence (42%) with minimal overlap with genetic influences on initiation.

Relevance: 50.00%

Publisher:

Abstract:

Previous studies have shown a significant effect of insulin administration on serum dehydroepiandrosterone sulfate (DHEA-S) concentration and its metabolic rate, with evidence for the effect in men, but not in women. This could lead to differences in the sources of variation in serum DHEA-S between men and women and in its covariation with insulin concentration. This study aimed to test whether these hypotheses were supported in a sample of healthy adult twins. Serum DHEA-S (n=2287) and plasma insulin (n=2436) were measured in samples from adult male and female twins recruited through the Australian Twin Registry. Models of genetic and environmental sources of variation and covariation were tested against the data. DHEA-S showed substantial genetic effects in both men and women after adjustment for covariates, including sex, age, body mass index, and time since the last meal. There was no significant phenotypic or genetic correlation between DHEA-S and insulin in either men or women. Despite the experimental evidence for insulin infusion producing a reduction in serum DHEA-S and some effect of meals on the observed DHEA-S concentration, there were no associations between insulin and DHEA-S at the population level. Variations in DHEA-S are due to age, sex, obesity, and substantial polygenic genetic influences.

Relevance: 50.00%

Publisher:

Abstract:

Dissertation presented in fulfillment of the requirements for the Degree of Doctor of Philosophy in Biology (Molecular Genetics) at the Instituto de Tecnologia Química e Biológica da Universidade Nova de Lisboa