937 resultados para Single-nucleotide Polymorphism
Resumo:
BACKGROUND: There is considerable interest in the development of methods to efficiently identify all coding variants present in large sample sets of humans. There are three approaches possible: whole-genome sequencing, whole-exome sequencing using exon capture methods, and RNA-Seq. While whole-genome sequencing is the most complete, it remains sufficiently expensive that cost effective alternatives are important. RESULTS: Here we provide a systematic exploration of how well RNA-Seq can identify human coding variants by comparing variants identified through high coverage whole-genome sequencing to those identified by high coverage RNA-Seq in the same individual. This comparison allowed us to directly evaluate the sensitivity and specificity of RNA-Seq in identifying coding variants, and to evaluate how key parameters such as the degree of coverage and the expression levels of genes interact to influence performance. We find that although only 40% of exonic variants identified by whole genome sequencing were captured using RNA-Seq; this number rose to 81% when concentrating on genes known to be well-expressed in the source tissue. We also find that a high false positive rate can be problematic when working with RNA-Seq data, especially at higher levels of coverage. CONCLUSIONS: We conclude that as long as a tissue relevant to the trait under study is available and suitable quality control screens are implemented, RNA-Seq is a fast and inexpensive alternative approach for finding coding variants in genes with sufficiently high expression levels.
Resumo:
Genome-wide association studies (GWAS) have now identified at least 2,000 common variants that appear associated with common diseases or related traits (http://www.genome.gov/gwastudies), hundreds of which have been convincingly replicated. It is generally thought that the associated markers reflect the effect of a nearby common (minor allele frequency >0.05) causal site, which is associated with the marker, leading to extensive resequencing efforts to find causal sites. We propose as an alternative explanation that variants much less common than the associated one may create "synthetic associations" by occurring, stochastically, more often in association with one of the alleles at the common site versus the other allele. Although synthetic associations are an obvious theoretical possibility, they have never been systematically explored as a possible explanation for GWAS findings. Here, we use simple computer simulations to show the conditions under which such synthetic associations will arise and how they may be recognized. We show that they are not only possible, but inevitable, and that under simple but reasonable genetic models, they are likely to account for or contribute to many of the recently identified signals reported in genome-wide association studies. We also illustrate the behavior of synthetic associations in real datasets by showing that rare causal mutations responsible for both hearing loss and sickle cell anemia create genome-wide significant synthetic associations, in the latter case extending over a 2.5-Mb interval encompassing scores of "blocks" of associated variants. In conclusion, uncommon or rare genetic variants can easily create synthetic associations that are credited to common variants, and this possibility requires careful consideration in the interpretation and follow up of GWAS signals.
Resumo:
To extend the understanding of host genetic determinants of HIV-1 control, we performed a genome-wide association study in a cohort of 2,554 infected Caucasian subjects. The study was powered to detect common genetic variants explaining down to 1.3% of the variability in viral load at set point. We provide overwhelming confirmation of three associations previously reported in a genome-wide study and show further independent effects of both common and rare variants in the Major Histocompatibility Complex region (MHC). We also examined the polymorphisms reported in previous candidate gene studies and fail to support a role for any variant outside of the MHC or the chemokine receptor cluster on chromosome 3. In addition, we evaluated functional variants, copy-number polymorphisms, epistatic interactions, and biological pathways. This study thus represents a comprehensive assessment of common human genetic variation in HIV-1 control in Caucasians.
Resumo:
To investigate the underlying mechanisms of T2D pathogenesis, we looked for diabetes susceptibility genes that increase the risk of type 2 diabetes (T2D) in a Han Chinese population. A two-stage genome-wide association (GWA) study was conducted, in which 995 patients and 894 controls were genotyped using the Illumina HumanHap550-Duo BeadChip for the first genome scan stage. This was further replicated in 1,803 patients and 1,473 controls in stage 2. We found two loci not previously associated with diabetes susceptibility in and around the genes protein tyrosine phosphatase receptor type D (PTPRD) (P = 8.54x10(-10); odds ratio [OR] = 1.57; 95% confidence interval [CI] = 1.36-1.82), and serine racemase (SRR) (P = 3.06x10(-9); OR = 1.28; 95% CI = 1.18-1.39). We also confirmed that variants in KCNQ1 were associated with T2D risk, with the strongest signal at rs2237895 (P = 9.65x10(-10); OR = 1.29, 95% CI = 1.19-1.40). By identifying two novel genetic susceptibility loci in a Han Chinese population and confirming the involvement of KCNQ1, which was previously reported to be associated with T2D in Japanese and European descent populations, our results may lead to a better understanding of differences in the molecular pathogenesis of T2D among various populations.
Resumo:
Lipoprotein-associated phospholipase A(2) (Lp-PLA(2)) is an emerging risk factor and therapeutic target for cardiovascular disease. The activity and mass of this enzyme are heritable traits, but major genetic determinants have not been explored in a systematic, genome-wide fashion. We carried out a genome-wide association study of Lp-PLA(2) activity and mass in 6,668 Caucasian subjects from the population-based Framingham Heart Study. Clinical data and genotypes from the Affymetrix 550K SNP array were obtained from the open-access Framingham SHARe project. Each polymorphism that passed quality control was tested for associations with Lp-PLA(2) activity and mass using linear mixed models implemented in the R statistical package, accounting for familial correlations, and controlling for age, sex, smoking, lipid-lowering-medication use, and cohort. For Lp-PLA(2) activity, polymorphisms at four independent loci reached genome-wide significance, including the APOE/APOC1 region on chromosome 19 (p = 6 x 10(-24)); CELSR2/PSRC1 on chromosome 1 (p = 3 x 10(-15)); SCARB1 on chromosome 12 (p = 1x10(-8)) and ZNF259/BUD13 in the APOA5/APOA1 gene region on chromosome 11 (p = 4 x 10(-8)). All of these remained significant after accounting for associations with LDL cholesterol, HDL cholesterol, or triglycerides. For Lp-PLA(2) mass, 12 SNPs achieved genome-wide significance, all clustering in a region on chromosome 6p12.3 near the PLA2G7 gene. Our analyses demonstrate that genetic polymorphisms may contribute to inter-individual variation in Lp-PLA(2) activity and mass.
Resumo:
We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten "case" genomes from individuals with severe hemophilia A and ten "control" genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly-penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 new variants per genome, after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop loss variants in genes representing a diverse set of pathways.
Resumo:
BACKGROUND: We previously identified a panel of genes associated with outcome of ovarian cancer. The purpose of the current study was to assess whether variants in these genes correlated with ovarian cancer risk. METHODS AND FINDINGS: Women with and without invasive ovarian cancer (749 cases, 1,041 controls) were genotyped at 136 single nucleotide polymorphisms (SNPs) within 13 candidate genes. Risk was estimated for each SNP and for overall variation within each gene. At the gene-level, variation within MSL1 (male-specific lethal-1 homolog) was associated with risk of serous cancer (p = 0.03); haplotypes within PRPF31 (PRP31 pre-mRNA processing factor 31 homolog) were associated with risk of invasive disease (p = 0.03). MSL1 rs7211770 was associated with decreased risk of serous disease (OR 0.81, 95% CI 0.66-0.98; p = 0.03). SNPs in MFSD7, BTN3A3, ZNF200, PTPRS, and CCND1A were inversely associated with risk (p<0.05), and there was increased risk at HEXIM1 rs1053578 (p = 0.04, OR 1.40, 95% CI 1.02-1.91). CONCLUSIONS: Tumor studies can reveal novel genes worthy of follow-up for cancer susceptibility. Here, we found that inherited markers in the gene encoding MSL1, part of a complex that modifies the histone H4, may decrease risk of invasive serous ovarian cancer.
Resumo:
There is great interindividual variability in HIV-1 viral setpoint after seroconversion, some of which is known to be due to genetic differences among infected individuals. Here, our focus is on determining, genome-wide, the contribution of variable gene expression to viral control, and to relate it to genomic DNA polymorphism. RNA was extracted from purified CD4+ T-cells from 137 HIV-1 seroconverters, 16 elite controllers, and 3 healthy blood donors. Expression levels of more than 48,000 mRNA transcripts were assessed by the Human-6 v3 Expression BeadChips (Illumina). Genome-wide SNP data was generated from genomic DNA using the HumanHap550 Genotyping BeadChip (Illumina). We observed two distinct profiles with 260 genes differentially expressed depending on HIV-1 viral load. There was significant upregulation of expression of interferon stimulated genes with increasing viral load, including genes of the intrinsic antiretroviral defense. Upon successful antiretroviral treatment, the transcriptome profile of previously viremic individuals reverted to a pattern comparable to that of elite controllers and of uninfected individuals. Genome-wide evaluation of cis-acting SNPs identified genetic variants modulating expression of 190 genes. Those were compared to the genes whose expression was found associated with viral load: expression of one interferon stimulated gene, OAS1, was found to be regulated by a SNP (rs3177979, p = 4.9E-12); however, we could not detect an independent association of the SNP with viral setpoint. Thus, this study represents an attempt to integrate genome-wide SNP signals with genome-wide expression profiles in the search for biological correlates of HIV-1 control. It underscores the paradox of the association between increasing levels of viral load and greater expression of antiviral defense pathways. It also shows that elite controllers do not have a fully distinctive mRNA expression pattern in CD4+ T cells. Overall, changes in global RNA expression reflect responses to viral replication rather than a mechanism that might explain viral control.
Resumo:
Although it has recently been shown that A/J mice are highly susceptible to Staphylococcus aureus sepsis as compared to C57BL/6J, the specific genes responsible for this differential phenotype are unknown. Using chromosome substitution strains (CSS), we found that loci on chromosomes 8, 11, and 18 influence susceptibility to S. aureus sepsis in A/J mice. We then used two candidate gene selection strategies to identify genes on these three chromosomes associated with S. aureus susceptibility, and targeted genes identified by both gene selection strategies. First, we used whole genome transcription profiling to identify 191 (56 on chr. 8, 100 on chr. 11, and 35 on chr. 18) genes on our three chromosomes of interest that are differentially expressed between S. aureus-infected A/J and C57BL/6J. Second, we identified two significant quantitative trait loci (QTL) for survival post-infection on chr. 18 using N(2) backcross mice (F(1) [C18A]xC57BL/6J). Ten genes on chr. 18 (March3, Cep120, Chmp1b, Dcp2, Dtwd2, Isoc1, Lman1, Spire1, Tnfaip8, and Seh1l) mapped to the two significant QTL regions and were also identified by the expression array selection strategy. Using real-time PCR, 6 of these 10 genes (Chmp1b, Dtwd2, Isoc1, Lman1, Tnfaip8, and Seh1l) showed significantly different expression levels between S. aureus-infected A/J and C57BL/6J. For two (Tnfaip8 and Seh1l) of these 6 genes, siRNA-mediated knockdown of gene expression in S. aureus-challenged RAW264.7 macrophages induced significant changes in the cytokine response (IL-1 beta and GM-CSF) compared to negative controls. These cytokine response changes were consistent with those seen in S. aureus-challenged peritoneal macrophages from CSS 18 mice (which contain A/J chromosome 18 but are otherwise C57BL/6J), but not C57BL/6J mice. These findings suggest that two genes, Tnfaip8 and Seh1l, may contribute to susceptibility to S. aureus in A/J mice, and represent promising candidates for human genetic susceptibility studies.
Resumo:
Alzheimer's disease is a complex and progressive neurodegenerative disease leading to loss of memory, cognitive impairment, and ultimately death. To date, six large-scale genome-wide association studies have been conducted to identify SNPs that influence disease predisposition. These studies have confirmed the well-known APOE epsilon4 risk allele, identified a novel variant that influences disease risk within the APOE epsilon4 population, found a SNP that modifies the age of disease onset, as well as reported the first sex-linked susceptibility variant. Here we report a genome-wide scan of Alzheimer's disease in a set of 331 cases and 368 controls, extending analyses for the first time to include assessments of copy number variation. In this analysis, no new SNPs show genome-wide significance. We also screened for effects of copy number variation, and while nothing was significant, a duplication in CHRNA7 appears interesting enough to warrant further investigation.
Resumo:
Humans have ~400 intact odorant receptors, but each individual has a unique set of genetic variations that lead to variation in olfactory perception. We used a heterologous assay to determine how often genetic polymorphisms in odorant receptors alter receptor function. We identified agonists for 18 odorant receptors and found that 63% of the odorant receptors we examined had polymorphisms that altered in vitro function. On average, two individuals have functional differences at over 30% of their odorant receptor alleles. To show that these in vitro results are relevant to olfactory perception, we verified that variations in OR10G4 genotype explain over 15% of the observed variation in perceived intensity and over 10% of the observed variation in perceived valence for the high-affinity in vitro agonist guaiacol but do not explain phenotype variation for the lower-affinity agonists vanillin and ethyl vanillin.
Resumo:
BACKGROUND: We have previously shown that a functional polymorphism of the UGT2B15 gene (rs1902023) was associated with increased risk of prostate cancer (PC). Novel functional polymorphisms of the UGT2B17 and UGT2B15 genes have been recently characterized by in vitro assays but have not been evaluated in epidemiologic studies. METHODS: Fifteen functional SNPs of the UGT2B17 and UGT2B15 genes, including cis-acting UGT2B gene SNPs, were genotyped in African American and Caucasian men (233 PC cases and 342 controls). Regression models were used to analyze the association between SNPs and PC risk. RESULTS: After adjusting for race, age and BMI, we found that six UGT2B15 SNPs (rs4148269, rs3100, rs9994887, rs13112099, rs7686914 and rs7696472) were associated with an increased risk of PC in log-additive models (p < 0.05). A SNP cis-acting on UGT2B17 and UGT2B15 expression (rs17147338) was also associated with increased risk of prostate cancer (OR = 1.65, 95% CI = 1.00-2.70); while a stronger association among men with high Gleason sum was observed for SNPs rs4148269 and rs3100. CONCLUSIONS: Although small sample size limits inference, we report novel associations between UGT2B15 and UGT2B17 variants and PC risk. These associations with PC risk in men with high Gleason sum, more frequently found in African American men, support the relevance of genetic differences in the androgen metabolism pathway, which could explain, in part, the high incidence of PC among African American men. Larger studies are required.
Resumo:
BACKGROUND: Genetic association studies are conducted to discover genetic loci that contribute to an inherited trait, identify the variants behind these associations and ascertain their functional role in determining the phenotype. To date, functional annotations of the genetic variants have rarely played more than an indirect role in assessing evidence for association. Here, we demonstrate how these data can be systematically integrated into an association study's analysis plan. RESULTS: We developed a Bayesian statistical model for the prior probability of phenotype-genotype association that incorporates data from past association studies and publicly available functional annotation data regarding the susceptibility variants under study. The model takes the form of a binary regression of association status on a set of annotation variables whose coefficients were estimated through an analysis of associated SNPs in the GWAS Catalog (GC). The functional predictors examined included measures that have been demonstrated to correlate with the association status of SNPs in the GC and some whose utility in this regard is speculative: summaries of the UCSC Human Genome Browser ENCODE super-track data, dbSNP function class, sequence conservation summaries, proximity to genomic variants in the Database of Genomic Variants and known regulatory elements in the Open Regulatory Annotation database, PolyPhen-2 probabilities and RegulomeDB categories. Because we expected that only a fraction of the annotations would contribute to predicting association, we employed a penalized likelihood method to reduce the impact of non-informative predictors and evaluated the model's ability to predict GC SNPs not used to construct the model. We show that the functional data alone are predictive of a SNP's presence in the GC. Further, using data from a genome-wide study of ovarian cancer, we demonstrate that their use as prior data when testing for association is practical at the genome-wide scale and improves power to detect associations. CONCLUSIONS: We show how diverse functional annotations can be efficiently combined to create 'functional signatures' that predict the a priori odds of a variant's association to a trait and how these signatures can be integrated into a standard genome-wide-scale association analysis, resulting in improved power to detect truly associated variants.
Association between DNA damage response and repair genes and risk of invasive serous ovarian cancer.
Resumo:
BACKGROUND: We analyzed the association between 53 genes related to DNA repair and p53-mediated damage response and serous ovarian cancer risk using case-control data from the North Carolina Ovarian Cancer Study (NCOCS), a population-based, case-control study. METHODS/PRINCIPAL FINDINGS: The analysis was restricted to 364 invasive serous ovarian cancer cases and 761 controls of white, non-Hispanic race. Statistical analysis was two staged: a screen using marginal Bayes factors (BFs) for 484 SNPs and a modeling stage in which we calculated multivariate adjusted posterior probabilities of association for 77 SNPs that passed the screen. These probabilities were conditional on subject age at diagnosis/interview, batch, a DNA quality metric and genotypes of other SNPs and allowed for uncertainty in the genetic parameterizations of the SNPs and number of associated SNPs. Six SNPs had Bayes factors greater than 10 in favor of an association with invasive serous ovarian cancer. These included rs5762746 (median OR(odds ratio)(per allele) = 0.66; 95% credible interval (CI) = 0.44-1.00) and rs6005835 (median OR(per allele) = 0.69; 95% CI = 0.53-0.91) in CHEK2, rs2078486 (median OR(per allele) = 1.65; 95% CI = 1.21-2.25) and rs12951053 (median OR(per allele) = 1.65; 95% CI = 1.20-2.26) in TP53, rs411697 (median OR (rare homozygote) = 0.53; 95% CI = 0.35 - 0.79) in BACH1 and rs10131 (median OR( rare homozygote) = not estimable) in LIG4. The six most highly associated SNPs are either predicted to be functionally significant or are in LD with such a variant. The variants in TP53 were confirmed to be associated in a large follow-up study. CONCLUSIONS/SIGNIFICANCE: Based on our findings, further follow-up of the DNA repair and response pathways in a larger dataset is warranted to confirm these results.
Resumo:
Pharmacogenomics (PGx) offers the promise of utilizing genetic fingerprints to predict individual responses to drugs in terms of safety, efficacy and pharmacokinetics. Early-phase clinical trial PGx applications can identify human genome variations that are meaningful to study design, selection of participants, allocation of resources and clinical research ethics. Results can inform later-phase study design and pipeline developmental decisions. Nevertheless, our review of the clinicaltrials.gov database demonstrates that PGx is rarely used by drug developers. Of the total 323 trials that included PGx as an outcome, 80% have been conducted by academic institutions after initial regulatory approval. Barriers for the application of PGx are discussed. We propose a framework for the role of PGx in early-phase drug development and recommend PGx be universally considered in study design, result interpretation and hypothesis generation for later-phase studies, but PGx results from underpowered studies should not be used by themselves to terminate drug-development programs.