943 results for association study
Abstract:
With the development of genotyping and next-generation sequencing technologies, multi-marker testing in genome-wide association studies and rare variant association studies has become an active research area in statistical genetics. This dissertation presents three methodologies for association studies that exploit different features of genetic data and demonstrates how to use them to test genetic association hypotheses. The methods cover three scenarios: 1) multi-marker testing for regions of strong linkage disequilibrium, 2) multi-marker testing for family-based association studies, and 3) multi-marker testing for rare variant association studies. I also discuss the advantages of these methods and demonstrate their power through simulation studies and applications to real genetic data.
Abstract:
Making sense of rapidly evolving evidence on genetic associations is crucial to making genuine advances in human genomics and the eventual integration of this information in the practice of medicine and public health. Assessment of the strengths and weaknesses of this evidence, and hence the ability to synthesize it, has been limited by inadequate reporting of results. The STrengthening the REporting of Genetic Association studies (STREGA) initiative builds on the STrengthening the Reporting of OBservational Studies in Epidemiology (STROBE) Statement and provides additions to 12 of the 22 items on the STROBE checklist. The additions concern population stratification, genotyping errors, modelling haplotype variation, Hardy-Weinberg equilibrium, replication, selection of participants, rationale for choice of genes and variants, treatment effects in studying quantitative traits, statistical methods, relatedness, reporting of descriptive and outcome data and the volume of data issues that are important to consider in genetic association studies. The STREGA recommendations do not prescribe or dictate how a genetic association study should be designed, but seek to enhance the transparency of its reporting, regardless of choices made during design, conduct or analysis.
Abstract:
Making sense of rapidly evolving evidence on genetic associations is crucial to making genuine advances in human genomics and the eventual integration of this information in the practice of medicine and public health. Assessment of the strengths and weaknesses of this evidence, and hence the ability to synthesize it, has been limited by inadequate reporting of results. The STrengthening the REporting of Genetic Association studies (STREGA) initiative builds on the STrengthening the Reporting of OBservational Studies in Epidemiology (STROBE) Statement and provides additions to 12 of the 22 items on the STROBE checklist. The additions concern population stratification, genotyping errors, modelling haplotype variation, Hardy-Weinberg equilibrium, replication, selection of participants, rationale for choice of genes and variants, treatment effects in studying quantitative traits, statistical methods, relatedness, reporting of descriptive and outcome data, and the volume of data issues that are important to consider in genetic association studies. The STREGA recommendations do not prescribe or dictate how a genetic association study should be designed but seek to enhance the transparency of its reporting, regardless of choices made during design, conduct, or analysis.
Abstract:
Making sense of rapidly evolving evidence on genetic associations is crucial to making genuine advances in human genomics and the eventual integration of this information in the practice of medicine and public health. Assessment of the strengths and weaknesses of this evidence, and hence, the ability to synthesize it, has been limited by inadequate reporting of results. The STrengthening the REporting of Genetic Association (STREGA) studies initiative builds on the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement and provides additions to 12 of the 22 items on the STROBE checklist. The additions concern population stratification, genotyping errors, modeling haplotype variation, Hardy-Weinberg equilibrium, replication, selection of participants, rationale for choice of genes and variants, treatment effects in studying quantitative traits, statistical methods, relatedness, reporting of descriptive and outcome data, and the volume of data issues that are important to consider in genetic association studies. The STREGA recommendations do not prescribe or dictate how a genetic association study should be designed, but seek to enhance the transparency of its reporting, regardless of choices made during design, conduct, or analysis.
Abstract:
Making sense of rapidly evolving evidence on genetic associations is crucial to making genuine advances in human genomics and the eventual integration of this information in the practice of medicine and public health. Assessment of the strengths and weaknesses of this evidence, and hence the ability to synthesize it, has been limited by inadequate reporting of results. The STrengthening the REporting of Genetic Association studies (STREGA) initiative builds on the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement and provides additions to 12 of the 22 items on the STROBE checklist. The additions concern population stratification, genotyping errors, modelling haplotype variation, Hardy-Weinberg equilibrium, replication, selection of participants, rationale for choice of genes and variants, treatment effects in studying quantitative traits, statistical methods, relatedness, reporting of descriptive and outcome data, and the volume of data issues that are important to consider in genetic association studies. The STREGA recommendations do not prescribe or dictate how a genetic association study should be designed but seek to enhance the transparency of its reporting, regardless of choices made during design, conduct, or analysis.
Abstract:
Making sense of rapidly evolving evidence on genetic associations is crucial to making genuine advances in human genomics and the eventual integration of this information into the practice of medicine and public health. Assessment of the strengths and weaknesses of this evidence, and hence the ability to synthesize it, has been limited by inadequate reporting of results. The STrengthening the REporting of Genetic Association studies (STREGA) initiative builds on the STrengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement and provides additions to 12 of the 22 items on the STROBE checklist. The additions concern population stratification, genotyping errors, modeling haplotype variation, Hardy-Weinberg equilibrium, replication, selection of participants, rationale for choice of genes and variants, treatment effects in studying quantitative traits, statistical methods, relatedness, reporting of descriptive and outcome data, and issues of data volume that are important to consider in genetic association studies. The STREGA recommendations do not prescribe or dictate how a genetic association study should be designed but seek to enhance the transparency of its reporting, regardless of choices made during design, conduct, or analysis.
Abstract:
Making sense of rapidly evolving evidence on genetic associations is crucial to making genuine advances in human genomics and the eventual integration of this information in the practice of medicine and public health. Assessment of the strengths and weaknesses of this evidence, and hence the ability to synthesize it, has been limited by inadequate reporting of results. The STrengthening the REporting of Genetic Association studies (STREGA) initiative builds on the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement and provides additions to 12 of the 22 items on the STROBE checklist. The additions concern population stratification, genotyping errors, modeling haplotype variation, Hardy-Weinberg equilibrium, replication, selection of participants, rationale for choice of genes and variants, treatment effects in studying quantitative traits, statistical methods, relatedness, reporting of descriptive and outcome data, and the volume of data issues that are important to consider in genetic association studies. The STREGA recommendations do not prescribe or dictate how a genetic association study should be designed but seek to enhance the transparency of its reporting, regardless of choices made during design, conduct, or analysis.
Abstract:
BACKGROUND: HIV-infected individuals have an increased risk of myocardial infarction. Antiretroviral therapy (ART) is regarded as a major determinant of dyslipidemia in HIV-infected individuals. Previous genetic studies have been limited by the validity of the single-nucleotide polymorphisms (SNPs) interrogated and by cross-sectional design. Recent genome-wide association studies have reliably associated common SNPs to dyslipidemia in the general population. METHODS AND RESULTS: We validated the contribution of 42 SNPs (33 identified in genome-wide association studies and 9 previously reported SNPs not included in genome-wide association study chips) and of longitudinally measured key nongenetic variables (ART, underlying conditions, sex, age, ethnicity, and HIV disease parameters) to dyslipidemia in 745 HIV-infected study participants (n=34 565 lipid measurements; median follow-up, 7.6 years). The relative impact of SNPs and ART to lipid variation in the study population and their cumulative influence on sustained dyslipidemia at the level of the individual were calculated. SNPs were associated with lipid changes consistent with genome-wide association study estimates. SNPs explained up to 7.6% (non-high-density lipoprotein cholesterol), 6.2% (high-density lipoprotein cholesterol), and 6.8% (triglycerides) of lipid variation; ART explained 3.9% (non-high-density lipoprotein cholesterol), 1.5% (high-density lipoprotein cholesterol), and 6.2% (triglycerides). An individual with the most dyslipidemic antiretroviral and genetic background had an approximately 3- to 5-fold increased risk of sustained dyslipidemia compared with an individual with the least dyslipidemic therapy and genetic background. CONCLUSIONS: In the HIV-infected population treated with ART, the weight of the contribution of common SNPs and ART to dyslipidemia was similar. When selecting an ART regimen, genetic information should be considered in addition to the dyslipidemic effects of ART agents.
Abstract:
A wealth of genetic associations for cardiovascular and metabolic phenotypes in humans has been accumulating over the last decade, in particular a large number of loci derived from recent genome wide association studies (GWAS). True complex disease-associated loci often exert modest effects, so their delineation currently requires integration of diverse phenotypic data from large studies to ensure robust meta-analyses. We have designed a gene-centric 50 K single nucleotide polymorphism (SNP) array to assess potentially relevant loci across a range of cardiovascular, metabolic and inflammatory syndromes. The array utilizes a "cosmopolitan" tagging approach to capture the genetic diversity across approximately 2,000 loci in populations represented in the HapMap and SeattleSNPs projects. The array content is informed by GWAS of vascular and inflammatory disease, expression quantitative trait loci implicated in atherosclerosis, pathway based approaches and comprehensive literature searching. The custom flexibility of the array platform facilitated interrogation of loci at differing stringencies, according to a gene prioritization strategy that allows saturation of high priority loci with a greater density of markers than the existing GWAS tools, particularly in African HapMap samples. We also demonstrate that the IBC array can be used to complement GWAS, increasing coverage in high priority CVD-related loci across all major HapMap populations. DNA from over 200,000 extensively phenotyped individuals will be genotyped with this array with a significant portion of the generated data being released into the academic domain facilitating in silico replication attempts, analyses of rare variants and cross-cohort meta-analyses in diverse populations. These datasets will also facilitate more robust secondary analyses, such as explorations with alternative genetic models, epistasis and gene-environment interactions.
Abstract:
In population studies, most current methods focus on identifying one outcome-related SNP at a time by testing for differences in genotype frequencies between disease and healthy groups or among different population groups. However, testing a great number of SNPs simultaneously raises a multiple-testing problem and produces false-positive results. Although this problem can be dealt with effectively through approaches such as Bonferroni correction, permutation testing and false discovery rates, the joint effects of several genes, each with a weak effect, might not be detected. With the availability of high-throughput genotyping technology, searching for multiple SNPs scattered over the whole genome and modeling their joint effect on the target variable has become possible. Exhaustive search of all SNP subsets is computationally infeasible for the millions of SNPs in a genome-wide study. Several effective feature selection methods combined with classification functions have been proposed to search for an optimal SNP subset in large data sets where the number of feature SNPs far exceeds the number of observations.
In this study, we took two steps to achieve this goal. First, we selected 1000 SNPs through an effective filter method; then we performed feature selection wrapped around a classifier to identify an optimal SNP subset for predicting disease. We also developed a novel classification method, the sequential information bottleneck method, wrapped inside different search algorithms to identify an optimal subset of SNPs for classifying the outcome variable. This new method was compared with classical linear discriminant analysis in terms of classification performance. Finally, we performed chi-square tests to examine the relationship between each SNP and disease from another point of view.
In general, our results show that filtering features using the harmonic mean of sensitivity and specificity (HMSS) through linear discriminant analysis (LDA) is better than using LDA training accuracy or mutual information in our study. Our results also demonstrate that an exhaustive search of small subsets of one, two or three SNPs based on the best 100 composite 2-SNPs can find an optimal subset, and that further inclusion of more SNPs through a heuristic algorithm does not always increase the performance of SNP subsets. Although sequential forward floating selection can be applied to prevent the nesting effect of forward selection, it does not always outperform the latter because of overfitting from observing more complex subset states.
Our results also indicate that HMSS, as a criterion for evaluating the classification ability of a function, can be used on imbalanced data without modifying the original dataset, unlike classification accuracy. Our four studies suggest that the sequential information bottleneck (sIB), a new unsupervised technique, can be adopted to predict the outcome, and that its ability to detect the target status is superior to that of traditional LDA in the study.
Our results show that the best test probability-HMSS for predicting CVD, stroke, CAD and psoriasis through sIB is 0.59406, 0.641815, 0.645315 and 0.678658, respectively. In terms of group prediction accuracy, the highest test accuracy of sIB for diagnosing a normal status among controls can reach 0.708999, 0.863216, 0.639918 and 0.850275, respectively, in the four studies if the test accuracy among cases is required to be at least 0.4. On the other hand, the highest test accuracy of sIB for diagnosing a disease among cases can reach 0.748644, 0.789916, 0.705701 and 0.749436, respectively, in the four studies if the test accuracy among controls is required to be at least 0.4.
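To illustrate why HMSS behaves differently from plain accuracy on imbalanced data, here is a minimal Python sketch. The function name `hmss`, the label coding (1 = case, 0 = control) and the toy labels are assumptions made for illustration; they are not taken from the dissertation.

```python
def hmss(y_true, y_pred, positive=1):
    """Harmonic mean of sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if sensitivity + specificity == 0:
        return 0.0
    return 2 * sensitivity * specificity / (sensitivity + specificity)


# Hypothetical imbalanced sample: 10 cases, 90 controls.
y_true = [1] * 10 + [0] * 90

# A classifier that labels everyone as a control.
all_controls = [0] * 100
accuracy_all_controls = sum(t == p for t, p in zip(y_true, all_controls)) / 100
hmss_all_controls = hmss(y_true, all_controls)

# A classifier that finds 6 of 10 cases and 72 of 90 controls.
balanced = [1] * 6 + [0] * 4 + [0] * 72 + [1] * 18
hmss_balanced = hmss(y_true, balanced)
```

In this toy sample the all-controls classifier reaches an accuracy of 0.9 but an HMSS of 0, because its sensitivity is 0, whereas the second classifier is rewarded for performing reasonably on both groups.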
A further genome-wide association study using the chi-square test shows that no significant SNPs are detected at the cut-off level of 9.09451E-08 in the Framingham Heart Study of CVD. The WTCCC study results detect only two significant SNPs associated with CAD. In the genome-wide study of psoriasis, most of the top 20 SNP markers with impressive classification accuracy are also significantly associated with the disease through the chi-square test at the cut-off value of 1.11E-07.
Although our classification methods can achieve high accuracy in the study, complete descriptions of those classification results (95% confidence intervals or statistical tests of differences) require more cost-effective methods or a more efficient computing system, neither of which is currently available for our genome-wide study. We should also note that the purpose of this study is to identify subsets of SNPs with high predictive ability, and SNPs with good discriminant power are not necessarily causal markers for the disease.
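A single-SNP chi-square test against a genome-wide cut-off can be sketched as follows. This is a minimal illustration, not the dissertation's pipeline: the genotype counts, the number of tests and the use of a Bonferroni threshold are all assumptions (the abstract names Bonferroni correction as one option but does not state how its cut-offs were derived). For a 2 x 3 table the test has 2 degrees of freedom, so the p-value has the closed form exp(-x/2).

```python
import math


def chi_square_2x3(cases, controls):
    """Pearson chi-square test for a 2 x 3 genotype table.

    cases, controls: counts of the three genotypes (e.g. AA, Aa, aa).
    Returns (statistic, p_value). Every genotype column must be non-empty.
    """
    table = [cases, controls]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Chi-square survival function with 2 degrees of freedom.
    p_value = math.exp(-stat / 2)
    return stat, p_value


# Hypothetical genotype counts for one SNP.
stat, p_value = chi_square_2x3(cases=[30, 50, 20], controls=[50, 40, 10])

# Hypothetical Bonferroni threshold: alpha divided by the number of SNPs tested.
alpha = 0.05
n_tests = 500_000  # assumed number of SNPs, for illustration only
threshold = alpha / n_tests
is_significant = p_value < threshold
```

With these made-up counts the SNP is nominally associated with the disease but falls far short of the genome-wide threshold, which is the situation the abstract describes for most markers.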
Abstract:
Pathway-based genome-wide association study evolved from pathway analysis of microarray gene expression data and is under rapid development as a complement to single-SNP-based genome-wide association study. However, it faces new challenges, such as the summarization of SNP statistics into pathway statistics. The current study applies ridge-regularized Kernel Sliced Inverse Regression (KSIR) to achieve dimension reduction and compares this method with two other widely used methods: the minimal-p-value (minP) approach, which assigns the best test statistic of all SNPs in each pathway as the statistic of the pathway, and the principal component analysis (PCA) method, which uses PCA to calculate the principal components of each pathway. Comparison of the three methods using simulated datasets consisting of 500 cases, 500 controls and 100 SNPs demonstrated that the KSIR method outperformed the other two methods in terms of causal pathway ranking and statistical power. The PCA method showed performance similar to that of the minP method. The KSIR method also performed better than the other two methods in analyzing a real dataset, the WTCCC Ulcerative Colitis dataset, consisting of 1762 cases and 3773 controls as the discovery cohort and 591 cases and 1639 controls as the replication cohort. Several immune and non-immune pathways relevant to ulcerative colitis were identified by these methods. Results from the current study provide a reference for further methodology development and identify novel pathways that may be of importance to the development of ulcerative colitis.
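The minP summarization described above can be sketched in a few lines of Python. The SNP identifiers, pathway names and p-values below are hypothetical, and the sketch covers only the summarization step; it does not implement KSIR or the PCA method, nor any permutation step a full analysis might use to account for pathway size.

```python
def min_p_pathway_scores(snp_pvalues, pathways):
    """Summarize each pathway by the smallest p-value among its SNPs.

    snp_pvalues: dict mapping SNP id -> single-SNP association p-value.
    pathways: dict mapping pathway name -> list of SNP ids.
    Pathways with no tested SNP are left out of the result.
    """
    scores = {}
    for name, snps in pathways.items():
        pvals = [snp_pvalues[s] for s in snps if s in snp_pvalues]
        if pvals:
            scores[name] = min(pvals)
    return scores


def rank_pathways(scores):
    """Order pathway names from most to least significant minP score."""
    return sorted(scores, key=scores.get)


# Hypothetical single-SNP p-values and pathway memberships.
snp_pvalues = {"snp_a": 0.20, "snp_b": 0.003, "snp_c": 0.04, "snp_d": 0.60}
pathways = {
    "pathway_1": ["snp_a", "snp_b"],
    "pathway_2": ["snp_c", "snp_d"],
    "pathway_3": ["snp_untested"],
}

scores = min_p_pathway_scores(snp_pvalues, pathways)
ranking = rank_pathways(scores)
```

Because a pathway with many SNPs has more chances to contain a small p-value, raw minP scores are not directly comparable across pathways of different sizes; this sketch leaves that adjustment out.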
Abstract:
Common bean is a major dietary component in several countries, but its productivity is negatively affected by abiotic stresses. Dissecting candidate genes involved in abiotic stress tolerance is a paramount step toward the improvement of common bean performance under such constraints. Accordingly, this thesis presents a systematic analysis of the DEHYDRATION RESPONSIVE ELEMENT-BINDING (DREB) gene subfamily, which encompasses genes that regulate several processes during stress responses but for which information in common bean is limited. First, a series of in silico analyses with sequences retrieved from the P. vulgaris genome on Phytozome supported the categorization of 54 putative PvDREB genes distributed within six phylogenetic subgroups (A-1 to A-6) along the 11 chromosomes. Second, we cloned four novel PvDREB genes and determined their inducibility factors, including the dehydration-, salinity- and cold-inducible genes PvDREB1F and PvDREB5A, and the dehydration- and cold-inducible genes PvDREB2A and PvDREB6B. Afterwards, nucleotide polymorphisms were searched for through Sanger sequencing along those genes, revealing a high number of single nucleotide polymorphisms within PvDREB6B in the comparison of Mesoamerican and Andean genotypes. The nomenclature of PvDREB6B is discussed in detail. Furthermore, we used the BARCBean6K_3 SNP platform to identify and genotype the closest SNP to each of the 54 PvDREB genes. We selected PvDREB6B for a broader study encompassing a collection of wild common bean accessions of Mesoamerican origin. The population structure of the wild beans was assessed using sequence polymorphisms of PvDREB6B. The genetic clusters were partially associated with variation in latitude, altitude, precipitation and temperature across the areas where such beans are distributed. With an emphasis on drought stress, an adapted tube-screening method in greenhouse conditions enabled the phenotyping of several drought-related traits in the wild collection.
Interestingly, our data revealed a correlation between root depth, plant height and biomass and the environmental data of the location of the accessions. Correlation was also observed between the population structure determined through PvDREB6B and the environmental data. An association study combining data from the SNP array and DREB polymorphisms enabled the detection of SNPs associated with drought-related traits through a compressed mixed linear model (CMLM) analysis. This thesis highlights important features of DREB genes in common bean, revealing candidates for further strategies aimed at improving abiotic stress tolerance, with emphasis on drought tolerance.
Abstract:
Recent studies suggest an association between the Interferon Inducible Transmembrane 3 (IFITM3) rs12252 variant and the course of influenza infection. However, it is not clear whether the reported association relates to the severity of influenza infection. The aim of this study was to estimate the hospitalization risk associated with this variant in Influenza Like Illness (ILI) patients during the H1N1 influenza pandemic. A case-control genetic association study was performed using nasopharyngeal/oropharyngeal swabs collected during the pandemic. Laboratory diagnosis of influenza infection was performed by RT-PCR; IFITM3 rs12252 was genotyped by RFLP and tested for association with hospitalization. Conditional logistic regression was performed to calculate the confounder-adjusted odds ratio of hospitalization associated with IFITM3 rs12252. We selected 312 ILI cases and 624 matched non-hospitalized controls. Among ILI patients positive for Influenza A(H1N1)pdm09, no statistically significant association was found between the variant and hospitalization risk (adjusted OR: 0.73 (95% CI: 0.33–1.50)). Among ILI patients negative for Influenza A(H1N1)pdm09, CT/CC genotype carriers had a higher risk of being hospitalized than patients with the TT genotype (adjusted OR: 2.54 (95% CI: 1.54–4.19)). The IFITM3 rs12252 variant was associated with hospitalization for respiratory infection, but not specifically in patients infected with Influenza A(H1N1)pdm09.
Abstract:
Background. The impact of human genetic background on low-trauma fracture (LTF) risk has not been evaluated in the context of human immunodeficiency virus (HIV) and clinical LTF risk factors. Methods. In the general population, 6 common single-nucleotide polymorphisms (SNPs) associate with LTF through genome-wide association study. Using genome-wide SNP arrays and imputation, we genotyped these SNPs in HIV-positive, white Swiss HIV Cohort Study participants. We included 103 individuals with a first, physician-validated LTF and 206 controls matched on gender, whose duration of observation and whose antiretroviral therapy start dates were similar using incidence density sampling. Analyses of nongenetic LTF risk factors were based on 158 cases and 788 controls. Results. A genetic risk score built from the 6 LTF-associated SNPs did not associate with LTF risk, in both models including and not including parental hip fracture history. The contribution of clinical LTF risk factors was limited in our dataset. Conclusions. Genetic LTF markers with a modest effect size in the general population do not improve fracture prediction in persons with HIV, in whom clinical LTF risk factors are prevalent in both cases and controls.
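A genetic risk score of the kind mentioned above is commonly computed as a (possibly weighted) sum of risk-allele counts. The sketch below assumes that form; the abstract does not say how the authors built their score, and the SNP identifiers, dosages and weights are hypothetical placeholders.

```python
def genetic_risk_score(dosages, weights=None):
    """Sum of risk-allele dosages, optionally weighted per SNP.

    dosages: dict mapping SNP id -> risk-allele count (0, 1 or 2, or a
             fractional dosage when the genotype is imputed).
    weights: dict mapping SNP id -> weight (e.g. a per-allele effect size);
             when omitted, every SNP counts equally.
    """
    if weights is None:
        weights = {snp: 1.0 for snp in dosages}
    return sum(weights[snp] * dosage for snp, dosage in dosages.items())


# Hypothetical participant with six SNPs (placeholder identifiers).
participant = {
    "snp1": 0, "snp2": 1, "snp3": 2,
    "snp4": 1, "snp5": 0, "snp6": 1,
}
unweighted_score = genetic_risk_score(participant)

# Hypothetical equal weights of 0.5 per risk allele.
weights = {snp: 0.5 for snp in participant}
weighted_score = genetic_risk_score(participant, weights)
```

The resulting score would then enter a regression model as a single covariate alongside clinical risk factors, which is how its added predictive value can be assessed.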