18 results for SNP genotyping
at Duke University
Abstract:
Technological advances in genotyping have given rise to hypothesis-based association studies of increasing scope. As a result, the scientific hypotheses addressed by these studies have become more complex and more difficult to address using existing analytic methodologies. Obstacles to analysis include inference in the face of multiple comparisons, complications arising from correlations among the SNPs (single nucleotide polymorphisms), choice of their genetic parametrization and missing data. In this paper we present an efficient Bayesian model search strategy that searches over the space of genetic markers and their genetic parametrization. The resulting method for Multilevel Inference of SNP Associations, MISA, allows computation of multilevel posterior probabilities and Bayes factors at the global, gene and SNP level, with the prior distribution on SNP inclusion in the model providing an intrinsic multiplicity correction. We use simulated data sets to characterize MISA's statistical power, and show that MISA has higher power to detect association than standard procedures. Using data from the North Carolina Ovarian Cancer Study (NCOCS), MISA identifies variants that were not identified by standard methods and have been externally "validated" in independent studies. We examine sensitivity of the NCOCS results to prior choice and method for imputing missing data. MISA is available in an R package on CRAN.
Abstract:
There is great interindividual variability in HIV-1 viral setpoint after seroconversion, some of which is known to be due to genetic differences among infected individuals. Here, our focus is on determining, genome-wide, the contribution of variable gene expression to viral control, and relating it to genomic DNA polymorphism. RNA was extracted from purified CD4+ T-cells from 137 HIV-1 seroconverters, 16 elite controllers, and 3 healthy blood donors. Expression levels of more than 48,000 mRNA transcripts were assessed by the Human-6 v3 Expression BeadChips (Illumina). Genome-wide SNP data were generated from genomic DNA using the HumanHap550 Genotyping BeadChip (Illumina). We observed two distinct profiles with 260 genes differentially expressed depending on HIV-1 viral load. There was significant upregulation of expression of interferon stimulated genes with increasing viral load, including genes of the intrinsic antiretroviral defense. Upon successful antiretroviral treatment, the transcriptome profile of previously viremic individuals reverted to a pattern comparable to that of elite controllers and of uninfected individuals. Genome-wide evaluation of cis-acting SNPs identified genetic variants modulating expression of 190 genes. Those were compared to the genes whose expression was found associated with viral load: expression of one interferon stimulated gene, OAS1, was found to be regulated by a SNP (rs3177979, p = 4.9E-12); however, we could not detect an independent association of the SNP with viral setpoint. Thus, this study represents an attempt to integrate genome-wide SNP signals with genome-wide expression profiles in the search for biological correlates of HIV-1 control. It underscores the paradox of the association between increasing levels of viral load and greater expression of antiviral defense pathways. It also shows that elite controllers do not have a fully distinctive mRNA expression pattern in CD4+ T cells.
Overall, changes in global RNA expression reflect responses to viral replication rather than a mechanism that might explain viral control.
Abstract:
BACKGROUND: Genetic association studies are conducted to discover genetic loci that contribute to an inherited trait, identify the variants behind these associations and ascertain their functional role in determining the phenotype. To date, functional annotations of the genetic variants have rarely played more than an indirect role in assessing evidence for association. Here, we demonstrate how these data can be systematically integrated into an association study's analysis plan. RESULTS: We developed a Bayesian statistical model for the prior probability of phenotype-genotype association that incorporates data from past association studies and publicly available functional annotation data regarding the susceptibility variants under study. The model takes the form of a binary regression of association status on a set of annotation variables whose coefficients were estimated through an analysis of associated SNPs in the GWAS Catalog (GC). The functional predictors examined included measures that have been demonstrated to correlate with the association status of SNPs in the GC and some whose utility in this regard is speculative: summaries of the UCSC Human Genome Browser ENCODE super-track data, dbSNP function class, sequence conservation summaries, proximity to genomic variants in the Database of Genomic Variants and known regulatory elements in the Open Regulatory Annotation database, PolyPhen-2 probabilities and RegulomeDB categories. Because we expected that only a fraction of the annotations would contribute to predicting association, we employed a penalized likelihood method to reduce the impact of non-informative predictors and evaluated the model's ability to predict GC SNPs not used to construct the model. We show that the functional data alone are predictive of a SNP's presence in the GC. 
Further, using data from a genome-wide study of ovarian cancer, we demonstrate that their use as prior data when testing for association is practical at the genome-wide scale and improves power to detect associations. CONCLUSIONS: We show how diverse functional annotations can be efficiently combined to create 'functional signatures' that predict the a priori odds of a variant's association to a trait and how these signatures can be integrated into a standard genome-wide-scale association analysis, resulting in improved power to detect truly associated variants.
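To illustrate how a prior built from functional annotations could enter an association test, the following is a minimal sketch, not the authors' implementation. It assumes a logistic form for the binary regression and combines the resulting prior odds with a Bayes factor. The function names, annotation names, coefficients, and Bayes factor are all hypothetical placeholders.

```python
import math


def prior_probability(intercept, coefficients, annotations):
    """Prior P(SNP is associated) from a logistic regression on annotations.

    `coefficients` and `annotations` are dicts keyed by annotation name.
    All names and values used with this function here are hypothetical.
    """
    linear = intercept + sum(
        coefficients[name] * value for name, value in annotations.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


def posterior_probability(prior_prob, bayes_factor):
    """Combine a prior probability with a Bayes factor.

    posterior odds = prior odds * Bayes factor
    """
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Hypothetical coefficients and annotation values, for illustration only.
    coefs = {"conserved": 0.8, "regulatory_element": 1.2}
    snp = {"conserved": 1.0, "regulatory_element": 0.0}
    prior = prior_probability(-9.0, coefs, snp)
    print(prior, posterior_probability(prior, bayes_factor=50.0))
```

Under this sketch, two SNPs with the same Bayes factor can end up with different posterior probabilities when their annotations differ, which is the sense in which the annotations act as prior data.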
Abstract:
We performed a whole-genome association study of human immunodeficiency virus type 1 (HIV-1) set point among a cohort of African Americans (n = 515), and an intronic single-nucleotide polymorphism (SNP) in the HLA-B gene showed one of the strongest associations. We use a subset of patients to demonstrate that this SNP reflects the effect of the HLA-B*5703 allele, which shows a genome-wide statistically significant association with viral load set point (P = 5.6 x 10(-10)). These analyses therefore confirm a member of the HLA-B*57 group of alleles as the most important common variant that influences viral load variation in African Americans, which is consistent with what has been observed for individuals of European ancestry, among whom the most important common variant is HLA-B*5701.
Abstract:
Lipoprotein-associated phospholipase A(2) (Lp-PLA(2)) is an emerging risk factor and therapeutic target for cardiovascular disease. The activity and mass of this enzyme are heritable traits, but major genetic determinants have not been explored in a systematic, genome-wide fashion. We carried out a genome-wide association study of Lp-PLA(2) activity and mass in 6,668 Caucasian subjects from the population-based Framingham Heart Study. Clinical data and genotypes from the Affymetrix 550K SNP array were obtained from the open-access Framingham SHARe project. Each polymorphism that passed quality control was tested for associations with Lp-PLA(2) activity and mass using linear mixed models implemented in the R statistical package, accounting for familial correlations, and controlling for age, sex, smoking, lipid-lowering-medication use, and cohort. For Lp-PLA(2) activity, polymorphisms at four independent loci reached genome-wide significance, including the APOE/APOC1 region on chromosome 19 (p = 6 x 10(-24)); CELSR2/PSRC1 on chromosome 1 (p = 3 x 10(-15)); SCARB1 on chromosome 12 (p = 1 x 10(-8)); and ZNF259/BUD13 in the APOA5/APOA1 gene region on chromosome 11 (p = 4 x 10(-8)). All of these remained significant after accounting for associations with LDL cholesterol, HDL cholesterol, or triglycerides. For Lp-PLA(2) mass, 12 SNPs achieved genome-wide significance, all clustering in a region on chromosome 6p12.3 near the PLA2G7 gene. Our analyses demonstrate that genetic polymorphisms may contribute to inter-individual variation in Lp-PLA(2) activity and mass.
Abstract:
We present a novel strategy that uses high-throughput methods of isolating and mapping C. elegans mutants susceptible to pathogen infection. We show that C. elegans mutants that exhibit an enhanced pathogen accumulation (epa) phenotype can be rapidly identified and isolated using a sorting system that allows automation of the analysis, sorting, and dispensing of C. elegans by measuring fluorescent bacteria inside the animals. Furthermore, we validate the use of Amplifluor as a new single nucleotide polymorphism (SNP) mapping technique in C. elegans. We show that a set of 9 SNPs allows the linkage of C. elegans mutants to a 5-8 megabase sub-chromosomal region.
Abstract:
BACKGROUND: We previously identified a panel of genes associated with outcome of ovarian cancer. The purpose of the current study was to assess whether variants in these genes correlated with ovarian cancer risk. METHODS AND FINDINGS: Women with and without invasive ovarian cancer (749 cases, 1,041 controls) were genotyped at 136 single nucleotide polymorphisms (SNPs) within 13 candidate genes. Risk was estimated for each SNP and for overall variation within each gene. At the gene-level, variation within MSL1 (male-specific lethal-1 homolog) was associated with risk of serous cancer (p = 0.03); haplotypes within PRPF31 (PRP31 pre-mRNA processing factor 31 homolog) were associated with risk of invasive disease (p = 0.03). MSL1 rs7211770 was associated with decreased risk of serous disease (OR 0.81, 95% CI 0.66-0.98; p = 0.03). SNPs in MFSD7, BTN3A3, ZNF200, PTPRS, and CCND1A were inversely associated with risk (p<0.05), and there was increased risk at HEXIM1 rs1053578 (p = 0.04, OR 1.40, 95% CI 1.02-1.91). CONCLUSIONS: Tumor studies can reveal novel genes worthy of follow-up for cancer susceptibility. Here, we found that inherited markers in the gene encoding MSL1, part of a complex that modifies the histone H4, may decrease risk of invasive serous ovarian cancer.
Abstract:
The fungal species Cryptococcus neoformans and Cryptococcus gattii cause respiratory and neurological disease in animals and humans following inhalation of basidiospores or desiccated yeast cells from the environment. Sexual reproduction in C. neoformans and C. gattii is controlled by a bipolar system in which a single mating type locus (MAT) specifies compatibility. These two species are dimorphic, growing as yeast in the asexual stage, and producing hyphae, basidia, and basidiospores during the sexual stage. In contrast, Filobasidiella depauperata, one of the closest related species, grows exclusively as hyphae and is found in association with decaying insects. Examination of two available strains of F. depauperata showed that the life cycle of this fungal species shares features associated with the unisexual or same-sex mating cycle in C. neoformans. Therefore, F. depauperata may represent a homothallic and possibly an obligately sexual fungal species. RAPD genotyping of 39 randomly isolated progeny from isolate CBS7855 revealed a new genotype pattern in one of the isolated basidiospore progeny, therefore suggesting that the homothallic cycle in F. depauperata could lead to the emergence of new genotypes. Phylogenetic analyses of genes linked to MAT in C. neoformans indicated that two of these genes in F. depauperata, MYO2 and STE20, appear to form a monophyletic clade with the MATa alleles of C. neoformans and C. gattii, and thus these genes may have been recruited to the MAT locus before F. depauperata diverged. Furthermore, the ancestral MATa locus may have undergone accelerated evolution prior to the divergence of the pathogenic Cryptococcus species since several of the genes linked to the MATa locus appear to have a higher number of changes and substitutions than their MATalpha counterparts. Synteny analyses between C. neoformans and F. depauperata showed that genomic regions on other chromosomes displayed conserved gene order.
In contrast, the genes linked to the MAT locus of C. neoformans showed a higher number of chromosomal translocations in the genome of F. depauperata. We therefore propose that chromosomal rearrangements appear to be a major force driving speciation and sexual divergence in these closely related pathogenic and saprobic species.
Abstract:
Like human immunodeficiency virus type 1 (HIV-1), simian immunodeficiency virus of chimpanzees (SIVcpz) can cause CD4+ T cell loss and premature death. Here, we used molecular surveillance tools and mathematical modeling to estimate the impact of SIVcpz infection on chimpanzee population dynamics. Habituated (Mitumba and Kasekela) and non-habituated (Kalande) chimpanzees were studied in Gombe National Park, Tanzania. Ape population sizes were determined from demographic records (Mitumba and Kasekela) or individual sightings and genotyping (Kalande), while SIVcpz prevalence rates were monitored using non-invasive methods. Between 2002-2009, the Mitumba and Kasekela communities experienced mean annual growth rates of 1.9% and 2.4%, respectively, while Kalande chimpanzees suffered a significant decline, with a mean growth rate of -6.5% to -7.4%, depending on population estimates. A rapid decline in Kalande was first noted in the 1990s and originally attributed to poaching and reduced food sources. However, between 2002-2009, we found a mean SIVcpz prevalence in Kalande of 46.1%, which was almost four times higher than the prevalence in Mitumba (12.7%) and Kasekela (12.1%). To explore whether SIVcpz contributed to the Kalande decline, we used empirically determined SIVcpz transmission probabilities as well as chimpanzee mortality, mating and migration data to model the effect of viral pathogenicity on chimpanzee population growth. Deterministic calculations indicated that a prevalence of greater than 3.4% would result in negative growth and eventual population extinction, even using conservative mortality estimates. However, stochastic models revealed that in representative populations, SIVcpz, and not its host species, frequently went extinct. High SIVcpz transmission probability and excess mortality reduced population persistence, while intercommunity migration often rescued infected communities, even when immigrating females had a chance of being SIVcpz infected. 
Together, these results suggest that the decline of the Kalande community was caused, at least in part, by high levels of SIVcpz infection. However, population extinction is not an inevitable consequence of SIVcpz infection, but depends on additional variables, such as migration, that promote survival. These findings are consistent with the uneven distribution of SIVcpz throughout central Africa and explain how chimpanzees in Gombe and elsewhere can be at equipoise with this pathogen.
Abstract:
Alzheimer's disease is a complex and progressive neurodegenerative disease leading to loss of memory, cognitive impairment, and ultimately death. To date, six large-scale genome-wide association studies have been conducted to identify SNPs that influence disease predisposition. These studies have confirmed the well-known APOE epsilon4 risk allele, identified a novel variant that influences disease risk within the APOE epsilon4 population, found a SNP that modifies the age of disease onset, as well as reported the first sex-linked susceptibility variant. Here we report a genome-wide scan of Alzheimer's disease in a set of 331 cases and 368 controls, extending analyses for the first time to include assessments of copy number variation. In this analysis, no new SNPs show genome-wide significance. We also screened for effects of copy number variation, and while nothing was significant, a duplication in CHRNA7 appears interesting enough to warrant further investigation.
Abstract:
PURPOSE: Evaluating genetic susceptibility may clarify effects of known environmental factors and also identify individuals at high risk. We evaluated the association of four insulin-related pathway gene polymorphisms in insulin-like growth factor-1 (IGF-I) (CA)(n) repeat, insulin-like growth factor-2 (IGF-II) (rs680), insulin-like growth factor-binding protein-3 (IGFBP-3) (rs2854744), and adiponectin (APM1 rs1501299) with colon cancer risk, as well as relationships with circulating IGF-I, IGF-II, IGFBP-3, and C-peptide in a population-based study. METHODS: Participants were African Americans (231 cases and 306 controls) and Whites (297 cases, 530 controls). Consenting subjects provided blood specimens and lifestyle/diet information. Genotyping for all genes except IGF-I was performed by the 5'-exonuclease (Taqman) assay. The IGF-I (CA)(n) repeat was assayed by PCR and fragment analysis. Circulating proteins were measured by enzyme immunoassays. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated by logistic regression. RESULTS: The IGF-I (CA)(19) repeat was more frequent in White controls (50%) than in African American controls (31%). Whites homozygous for the IGF-I (CA)(19) repeat had a nearly twofold increase in risk of colon cancer (OR = 1.77; 95% CI = 1.15-2.73), whereas African Americans did not (OR = 0.73, 95% CI 0.50-1.51). We observed an inverse association between the IGF-II Apa1 A-variant and colon cancer risk (OR = 0.49, 95% CI 0.28-0.88) in Whites only. Carrying the IGFBP-3 variant alleles was associated with lower IGFBP-3 protein levels, a difference most pronounced in Whites (p-trend <0.05). CONCLUSIONS: These results support an association between insulin pathway-related genes and elevated colon cancer risk in Whites but not in African Americans.
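For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the sketch below computes an unadjusted OR with a Wald interval from a 2x2 table of genotype carriers versus non-carriers. This is only an illustration: the study's estimates came from logistic regression, which may adjust for covariates, and the counts in the example are hypothetical, not taken from the study.

```python
import math


def odds_ratio_wald_ci(carrier_cases, noncarrier_cases,
                       carrier_controls, noncarrier_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    z=1.96 corresponds to an approximate 95% interval.
    All four cell counts must be positive.
    """
    cells = (carrier_cases, noncarrier_cases,
             carrier_controls, noncarrier_controls)
    if any(count <= 0 for count in cells):
        raise ValueError("all cell counts must be positive")

    odds_ratio = (carrier_cases * noncarrier_controls) / (
        noncarrier_cases * carrier_controls
    )
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / count for count in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(odds_ratio_wald_ci(20, 10, 10, 20))
```

An interval that excludes 1 corresponds to a nominally significant association at roughly the 5% level; an interval that spans 1, as reported for African Americans above, does not.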
Abstract:
BACKGROUND: We have previously shown that a functional polymorphism of the UGT2B15 gene (rs1902023) was associated with increased risk of prostate cancer (PC). Novel functional polymorphisms of the UGT2B17 and UGT2B15 genes have been recently characterized by in vitro assays but have not been evaluated in epidemiologic studies. METHODS: Fifteen functional SNPs of the UGT2B17 and UGT2B15 genes, including cis-acting UGT2B gene SNPs, were genotyped in African American and Caucasian men (233 PC cases and 342 controls). Regression models were used to analyze the association between SNPs and PC risk. RESULTS: After adjusting for race, age and BMI, we found that six UGT2B15 SNPs (rs4148269, rs3100, rs9994887, rs13112099, rs7686914 and rs7696472) were associated with an increased risk of PC in log-additive models (p < 0.05). A SNP cis-acting on UGT2B17 and UGT2B15 expression (rs17147338) was also associated with increased risk of prostate cancer (OR = 1.65, 95% CI = 1.00-2.70); while a stronger association among men with high Gleason sum was observed for SNPs rs4148269 and rs3100. CONCLUSIONS: Although small sample size limits inference, we report novel associations between UGT2B15 and UGT2B17 variants and PC risk. These associations with PC risk in men with high Gleason sum, more frequently found in African American men, support the relevance of genetic differences in the androgen metabolism pathway, which could explain, in part, the high incidence of PC among African American men. Larger studies are required.
Abstract:
During mitotic cell cycles, DNA is exposed to many types of endogenous and exogenous damaging agents that can cause double-strand breaks (DSBs). In S. cerevisiae, DSBs are primarily repaired by mitotic recombination, which can lead to loss of heterozygosity (LOH). Genetic recombination can happen in both meiosis and mitosis. While the genome-wide distribution of meiotic recombination events has been intensively studied, mitotic recombination events had not been mapped in an unbiased way throughout the genome until recently. Methods for selecting mitotic crossovers and mapping the positions of crossovers have recently been developed in our lab. Our current approach uses a diploid yeast strain that is heterozygous for about 55,000 SNPs, and employs SNP microarrays to map LOH events throughout the genome. These methods allow us to examine selected crossovers and unselected mitotic recombination events (crossover, noncrossover and BIR) at about 1 kb resolution across the genome. Using this method, we generated maps of spontaneous and UV-induced LOH events. In this study, we explore machine learning and variable selection techniques to build a predictive model for where the LOH events occur in the genome.
We simulated control tracts drawn at random from the yeast genome that resemble the LOH tracts in terms of tract lengths and locations with respect to single-nucleotide-polymorphism positions. We then extracted roughly 1,100 features, such as base compositions, histone modifications, and presence of tandem repeats, and trained classifiers to distinguish control tracts from LOH tracts. We found several features with good predictive value. We also found that, with the current repertoire of features, prediction is generally better for spontaneous LOH events than for UV-induced LOH events.
Abstract:
CD133 is one of the most common stem cell markers, and functional single nucleotide polymorphisms (SNPs) of CD133 may modulate its gene functions and thus cancer risk and patient survival. We hypothesized that potentially functional CD133 SNPs are associated with gastric cancer (GC) risk and survival. To test this hypothesis, we conducted a case-control study of 371 GC patients and 313 cancer-free controls frequency-matched by age, sex, and ethnicity. We genotyped four selected, potentially functional CD133 SNPs (rs2240688A>C, rs7686732C>G, rs10022537T>A, and rs3130C>T) and used logistic regression analysis for associations of these SNPs with GC risk and Cox hazards regression analysis for survival. We found that compared with the miRNA binding site rs2240688 AA genotype, AC + CC genotypes were associated with significantly increased GC risk (adjusted OR = 1.52, 95% CI = 1.09-2.13); for another miRNA binding site rs3130C>T SNP, the TT genotype was associated with significantly reduced GC risk (adjusted OR = 0.68, 95% CI = 0.48-0.97), compared with CC + CT genotypes. In all patients, the risk rs3130 TT variant genotype was significantly associated with overall survival (OS) (adjusted P(trend) = 0.016 and 0.007 under additive and recessive models, respectively). These findings suggest that these two CD133 miRNA binding site variants, rs2240688 and rs3130, may be potential biomarkers for genetic susceptibility to GC and possible predictors for survival in GC patients but require further validation by larger studies.
Abstract:
Genome-wide association studies (GWASs) have characterized 13 loci associated with melanoma, which only account for a small part of melanoma risk. To identify new genes with too small an effect to be detected individually but which collectively influence melanoma risk and/or show interactive effects, we used a two-step analysis strategy including pathway analysis of genome-wide SNP data, in a first step, and epistasis analysis within significant pathways, in a second step. Pathway analysis, using the gene-set enrichment analysis (GSEA) approach and the gene ontology (GO) database, was applied to the outcomes of MELARISK (3,976 subjects) and MDACC (2,827 subjects) GWASs. Cross-gene SNP-SNP interaction analysis within melanoma-associated GOs was performed using the INTERSNP software. Five GO categories were significantly enriched in genes associated with melanoma (false discovery rate ≤ 5% in both studies): response to light stimulus, regulation of mitotic cell cycle, induction of programmed cell death, cytokine activity and oxidative phosphorylation. Epistasis analysis, within each of the five significant GOs, showed significant evidence for interaction for one SNP pair at TERF1 and AFAP1L2 loci (pmeta-int = 2.0 × 10(-7), which met both the pathway and overall multiple-testing corrected thresholds that are equal to 9.8 × 10(-7) and 2.0 × 10(-7), respectively) and suggestive evidence for another pair involving correlated SNPs at the same loci (pmeta-int = 3.6 × 10(-6)). This interaction has important biological relevance given the key role of TERF1 in telomere biology and the reported physical interaction between TERF1 and AFAP1L2 proteins. This finding brings a novel piece of evidence for the emerging role of telomere dysfunction in melanoma development.
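The abstract does not say how its multiple-testing thresholds were derived. As a minimal sketch, assuming a Bonferroni-style correction at a 5% family-wise level (an assumption, not something the abstract states), a threshold is the significance level divided by the number of tests, and a reported threshold can be inverted to see how many tests it would correspond to. The function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_number_of_tests(alpha, threshold):
    """Number of tests a Bonferroni threshold would correspond to.

    Only meaningful if the threshold really was derived as alpha / n_tests,
    which is an assumption here.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


if __name__ == "__main__":
    # Thresholds quoted in the abstract; alpha = 0.05 is assumed.
    for reported in (9.8e-7, 2.0e-7):
        print(reported, round(implied_number_of_tests(0.05, reported)))
```

Under that assumption, the stricter overall threshold reflects a larger number of SNP pairs tested across all five GO categories than within any single pathway.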