884 results for Genome Wide Association Study


Abstract:

We have derived a versatile gene-based test for genome-wide association studies (GWAS). Our approach, called VEGAS (versatile gene-based association study), is applicable to all GWAS designs, including family-based GWAS, meta-analyses of GWAS on the basis of summary data, and DNA-pooling-based GWAS, where existing approaches based on permutation are not possible, as well as singleton data, where they are. The test incorporates information from a full set of markers (or a defined subset) within a gene and accounts for linkage disequilibrium between markers by using simulations from the multivariate normal distribution. We show that for an association study using singletons, our approach produces results equivalent to those obtained via permutation in a fraction of the computation time. We demonstrate proof-of-principle by using the gene-based test to replicate several genes known to be associated on the basis of results from a family-based GWAS for height in 11,536 individuals and a DNA-pooling-based GWAS for melanoma in approximately 1300 cases and controls. Our method has the potential to identify novel associated genes; provide a basis for selecting SNPs for replication; and be directly used in network (pathway) approaches that require per-gene association test statistics. We have implemented the approach in both an easy-to-use web interface, which only requires the uploading of markers with their association p-values, and a separate downloadable application.
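The simulation idea in this abstract can be sketched in a few lines. The sketch below is not the VEGAS implementation: it assumes the gene-level statistic is the sum of per-SNP 1-df chi-square values recovered from two-sided p-values, and that the LD matrix of pairwise correlations is positive definite. The function names `cholesky` and `gene_based_pvalue` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def gene_based_pvalue(snp_pvalues, ld_matrix, n_sims=10000, seed=0):
    """Empirical gene-level p-value from per-SNP p-values and an LD matrix.

    Assumption (not taken from the abstract): the gene statistic is the sum
    of 1-df chi-square values implied by the two-sided SNP p-values.
    """
    rng = random.Random(seed)
    norm = NormalDist()
    # Two-sided p-value -> |z| -> chi-square with 1 df. Using inv_cdf(p / 2)
    # keeps precision for very small p-values.
    observed = sum(norm.inv_cdf(p / 2.0) ** 2 for p in snp_pvalues)
    lower = cholesky(ld_matrix)
    n = len(snp_pvalues)
    exceed = 0
    for _ in range(n_sims):
        indep = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Multiplying by the Cholesky factor gives z-scores whose
        # correlation matches the LD matrix.
        correlated = [
            sum(lower[i][k] * indep[k] for k in range(i + 1)) for i in range(n)
        ]
        if sum(z * z for z in correlated) >= observed:
            exceed += 1
    # Add-one correction so the empirical p-value is never exactly zero.
    return (exceed + 1) / (n_sims + 1)
```

Because only SNP p-values and an LD matrix are needed, a routine of this shape works from summary data alone, which is the property the abstract highlights for meta-analysis and pooling designs.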

Abstract:

The impact of erroneous genotypes that have passed standard quality control (QC) can be severe in genome-wide association studies, genotype imputation, and estimation of heritability and prediction of genetic risk based on single nucleotide polymorphisms (SNPs). To detect such genotyping errors, a simple two-locus QC method, based on the difference in test statistic of association between single SNPs and pairs of SNPs, was developed and applied. The proposed approach could detect many problematic SNPs with statistical significance even when standard single-SNP QC analyses failed to detect them in real data. Depending on the data set used, the number of erroneous SNPs that were not filtered out by standard single-SNP QC but were detected by the proposed approach varied from a few hundred to thousands. Using simulated data, it was shown that the proposed method was powerful and performed better than the other existing methods tested. The power of the proposed approach to detect erroneous genotypes was approximately 80% for a 3% error rate per SNP. This novel QC approach is easy to implement and computationally efficient, and can lead to a better quality of genotypes for subsequent genotype-phenotype investigations.

Abstract:

We conducted a genome-wide association study testing single nucleotide polymorphisms (SNPs) and copy number variants (CNVs) for association with early-onset myocardial infarction in 2,967 cases and 3,075 controls. We carried out replication in an independent sample with an effective sample size of up to 19,492. SNPs at nine loci reached genome-wide significance: three are newly identified (21q22 near MRPS6-SLC5A3-KCNE2, 6p24 in PHACTR1 and 2q33 in WDR12) and six replicated prior observations (refs. 1-4) (9p21, 1p13 near CELSR2-PSRC1-SORT1, 10q11 near CXCL12, 1q41 in MIA3, 19p13 near LDLR and 1p32 near PCSK9). We tested 554 common copy number polymorphisms (>1% allele frequency) and none met the pre-specified threshold for replication (P < 10^-3). We identified 8,065 rare CNVs but did not detect a greater CNV burden in cases compared to controls, in genes compared to the genome as a whole, or at any individual locus. SNPs at nine loci were reproducibly associated with myocardial infarction, but tests of common and rare CNVs failed to identify additional associations with myocardial infarction risk.

Abstract:

Schizophrenia is an idiopathic mental disorder with a heritable component and a substantial public health impact. We conducted a multi-stage genome-wide association study (GWAS) for schizophrenia beginning with a Swedish national sample (5,001 cases and 6,243 controls) followed by meta-analysis with previous schizophrenia GWAS (8,832 cases and 12,067 controls) and finally by replication of SNPs in 168 genomic regions in independent samples (7,413 cases, 19,762 controls and 581 parent-offspring trios). We identified 22 loci associated at genome-wide significance; 13 of these are new, and 1 was previously implicated in bipolar disorder. Examination of candidate genes at these loci suggests the involvement of neuronal calcium signaling. We estimate that 8,300 independent, mostly common SNPs (95% credible interval of 6,300-10,200 SNPs) contribute to risk for schizophrenia and that these collectively account for at least 32% of the variance in liability. Common genetic variation has an important role in the etiology of schizophrenia, and larger studies will allow more detailed understanding of this disorder.

Abstract:

Runs of homozygosity (ROH), regions of the genome containing many consecutive homozygous SNPs, may represent two copies of a haplotype inherited from a common ancestor. A rare variant on this haplotype could thus be present in a homozygous and potentially recessive state. To detect rare risk variants for schizophrenia, we performed an ROH analysis in a homogeneous Irish genome wide association study (GWAS) dataset consisting of 1606 cases and 1794 controls. There was no genome-wide excess of ROH in cases compared to controls (p=0.7986). No consensus ROH at individual loci showed association with schizophrenia after genome-wide correction.
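The definition of an ROH given above can be made concrete with a minimal sketch. It assumes genotypes are coded as alternate-allele counts (0, 1 or 2) in chromosomal order and treats any heterozygous call as ending a run. Dedicated ROH tools typically use sliding windows and tolerate occasional heterozygous or missing calls, so this is an illustration of the concept only, and `find_roh` is a hypothetical name.

```python
def find_roh(genotypes, min_snps=3):
    """Return (start, end) index pairs of runs of consecutive homozygous SNPs.

    genotypes: alternate-allele counts per SNP (0 or 2 = homozygous,
    1 = heterozygous), ordered along the chromosome. Indices are inclusive.
    """
    runs = []
    start = None
    for i, genotype in enumerate(genotypes):
        if genotype in (0, 2):
            if start is None:
                start = i  # a new homozygous run begins here
        else:
            # A heterozygous call closes the current run, if any.
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs
```

For example, `find_roh([0, 2, 2, 1, 0, 0, 1, 2, 2, 2, 2])` reports the runs at indices 0 to 2 and 7 to 10, while the two-SNP run at indices 4 to 5 falls below the default minimum length.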

Abstract:

Both polygenicity (many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from a true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of the inflation in test statistics in many GWAS of large sample size.
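The relationship between test statistics and LD that the abstract refers to can be illustrated with a simplified regression. The sketch below assumes the commonly stated model in which the expected chi-square statistic of SNP j equals (N * h2 / M) * l_j + N * a + 1, where l_j is the LD Score, N the sample size, M the number of SNPs, h2 the SNP heritability and a the contribution of confounding. It uses plain unweighted least squares, whereas the published method uses a weighted regression with resampling-based standard errors, so treat it as a toy version. The function name is hypothetical.

```python
def ld_score_regression(chi2_stats, ld_scores, n_samples, n_snps):
    """Unweighted least-squares fit of chi-square statistics on LD Scores.

    Returns (intercept, h2_estimate). Under the assumed model the intercept
    estimates 1 + N * a (inflation from confounding) and the slope equals
    N * h2 / M, so h2 = slope * M / N.
    """
    n = len(chi2_stats)
    mean_x = sum(ld_scores) / n
    mean_y = sum(chi2_stats) / n
    sxx = sum((x - mean_x) ** 2 for x in ld_scores)
    sxy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(ld_scores, chi2_stats)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    h2_estimate = slope * n_snps / n_samples
    return intercept, h2_estimate
```

An intercept close to 1 combined with a clearly positive slope is the pattern expected when inflation comes mostly from polygenicity; an intercept well above 1 points to confounding.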

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Abstract:

In population studies, most current methods identify one outcome-related SNP at a time by testing for differences in genotype frequencies between disease and healthy groups or among different population groups. However, testing a large number of SNPs simultaneously raises a multiple-testing problem and produces false-positive results. Although this problem can be handled effectively through approaches such as Bonferroni correction, permutation testing and false discovery rates, these approaches may fail to reveal the joint effects of several genes that each have a weak effect. With the availability of high-throughput genotyping technology, searching for multiple scattered SNPs over the whole genome and modeling their joint effect on the target variable has become possible. Exhaustive search of all SNP subsets is computationally infeasible for millions of SNPs in a genome-wide study. Several effective feature selection methods combined with classification functions have been proposed to search for an optimal SNP subset in large data sets where the number of feature SNPs far exceeds the number of observations.
In this study, we took two steps to achieve this goal. First, we selected 1000 SNPs through an effective filter method, and then we performed feature selection wrapped around a classifier to identify an optimal SNP subset for predicting disease. We also developed a novel classification method, the sequential information bottleneck method, wrapped inside different search algorithms to identify an optimal subset of SNPs for classifying the outcome variable. This new method was compared with classical linear discriminant analysis in terms of classification performance. Finally, we performed chi-square tests to examine the relationship between each SNP and disease from another point of view.
In general, our results show that filtering features using the harmonic mean of sensitivity and specificity (HMSS) through linear discriminant analysis (LDA) is better than using LDA training accuracy or mutual information in our study. Our results also demonstrate that exhaustive search of small subsets of one, two or three SNPs, based on the best 100 composite 2-SNPs, can find an optimal subset, and that further inclusion of more SNPs through a heuristic algorithm does not always increase the performance of SNP subsets. Although sequential forward floating selection can be applied to prevent the nesting effect of forward selection, it does not always outperform the latter because of overfitting from observing more complex subset states.
Our results also indicate that HMSS, as a criterion for evaluating the classification ability of a function, can be used on imbalanced data without modifying the original dataset, unlike classification accuracy. Our four studies suggest that Sequential Information Bottleneck (sIB), a new unsupervised technique, can be adopted to predict the outcome, and that its ability to detect the target status is superior to that of traditional LDA in this study.
Our results show that the best test probability-HMSS for predicting CVD, stroke, CAD and psoriasis through sIB is 0.59406, 0.641815, 0.645315 and 0.678658, respectively. In terms of group prediction accuracy, the highest test accuracy of sIB for diagnosing a normal status among controls reaches 0.708999, 0.863216, 0.639918 and 0.850275, respectively, in the four studies if the test accuracy among cases is required to be at least 0.4. Conversely, the highest test accuracy of sIB for diagnosing disease among cases reaches 0.748644, 0.789916, 0.705701 and 0.749436, respectively, in the four studies if the test accuracy among controls is required to be at least 0.4.
A further genome-wide association study using the chi-square test detected no significant SNPs at the cut-off level of 9.09451E-08 in the Framingham Heart Study of CVD. The WTCCC study results detected only two significant SNPs associated with CAD. In the genome-wide study of psoriasis, most of the top 20 SNP markers with impressive classification accuracy are also significantly associated with the disease by the chi-square test at the cut-off value of 1.11E-07.
Although our classification methods can achieve high accuracy in the study, complete descriptions of those classification results (95% confidence intervals or statistical tests of differences) require more cost-effective methods or a more efficient computing system, neither of which is currently available for our genome-wide study. We should also note that the purpose of this study is to identify subsets of SNPs with high prediction ability, and SNPs with good discriminant power are not necessarily causal markers for the disease.
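The claim that HMSS suits imbalanced data better than plain accuracy can be checked with a small calculation. The sketch below assumes HMSS is computed from confusion-matrix counts as 2 * sensitivity * specificity / (sensitivity + specificity). The function names and the example counts are illustrative and are not taken from the thesis.

```python
def hmss(true_pos, false_neg, true_neg, false_pos):
    """Harmonic mean of sensitivity and specificity from confusion counts."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    if sensitivity + specificity == 0:
        return 0.0  # both rates are zero, so the harmonic mean is zero
    return 2 * sensitivity * specificity / (sensitivity + specificity)


def accuracy(true_pos, false_neg, true_neg, false_pos):
    """Fraction of all samples that were classified correctly."""
    total = true_pos + false_neg + true_neg + false_pos
    return (true_pos + true_neg) / total
```

Consider a hypothetical data set with 10 cases and 990 controls and a classifier that labels everyone a control: `accuracy(0, 10, 990, 0)` is 0.99, yet `hmss(0, 10, 990, 0)` is 0.0, because the harmonic mean collapses when either sensitivity or specificity is zero.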

Abstract:

Pathway-based genome-wide association study evolved from pathway analysis for microarray gene expression and is under rapid development as a complement to single-SNP-based genome-wide association study. However, it faces new challenges, such as the summarization of SNP statistics into pathway statistics. The current study applied ridge-regularized Kernel Sliced Inverse Regression (KSIR) to achieve dimension reduction and compared this method with two other widely used methods: the minimal-p-value (minP) approach, which assigns the best test statistic of all SNPs in each pathway as the statistic of the pathway, and the principal component analysis (PCA) method, which uses PCA to calculate the principal components of each pathway. Comparison of the three methods using simulated datasets consisting of 500 cases, 500 controls and 100 SNPs demonstrated that the KSIR method outperformed the other two methods in terms of causal pathway ranking and statistical power. The PCA method performed similarly to the minP method. The KSIR method also performed better than the other two methods in analyzing a real dataset, the WTCCC Ulcerative Colitis dataset, consisting of 1762 cases and 3773 controls as the discovery cohort and 591 cases and 1639 controls as the replication cohort. Several immune and non-immune pathways relevant to ulcerative colitis were identified by these methods. Results from the current study provide a reference for further methodology development and identify novel pathways that may be of importance to the development of ulcerative colitis.
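The two baseline summarization methods named in this abstract can be sketched briefly; KSIR itself is not attempted here. The sketch assumes per-SNP p-values are already available for minP, and that genotypes for one pathway are given as rows of individuals by columns of SNPs for PCA. The first principal component is found by power iteration on the covariance matrix, which can fail to converge to the leading component if the start vector happens to be orthogonal to it. Both function names are hypothetical.

```python
import math


def min_p_statistic(snp_pvalues):
    """minP: the smallest SNP p-value in the pathway stands for the pathway."""
    return min(snp_pvalues)


def first_pc_scores(genotypes, n_iter=200):
    """Scores of each individual on the first principal component.

    genotypes: list of rows (individuals), each a list of SNP genotypes.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    # Sample covariance matrix between SNPs (m x m).
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(m)]
        for a in range(m)
    ]
    # Power iteration from an asymmetric start vector.
    vec = [float(j + 1) for j in range(m)]
    for _ in range(n_iter):
        new = [sum(cov[a][b] * vec[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0:
            break  # no variance in the data
        vec = [v / norm for v in new]
    # Project each centered individual onto the leading direction.
    return [sum(r[j] * vec[j] for j in range(m)) for r in centered]
```

The PC scores would then be tested against case-control status as a pathway-level predictor, whereas minP needs a correction for the number of SNPs in the pathway, for example by permutation.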

Abstract:

Our sleep timing preference, or chronotype, is a manifestation of our internal biological clock. Variation in chronotype has been linked to sleep disorders, cognitive and physical performance, and chronic disease. Here we perform a genome-wide association study of self-reported chronotype within the UK Biobank cohort (n=100,420). We identify 12 new genetic loci that implicate known components of the circadian clock machinery and point to previously unstudied genetic variants and candidate genes that might modulate core circadian rhythms or light-sensing pathways. Pathway analyses highlight central nervous and ocular systems and fear-response-related processes. Genetic correlation analysis suggests chronotype shares underlying genetic pathways with schizophrenia, educational attainment and possibly BMI. Further, Mendelian randomization suggests that evening chronotype relates to higher educational attainment. These results not only expand our knowledge of the circadian system in humans but also expose the influence of circadian characteristics over human health and life-history variables such as educational attainment.

Abstract:

Cauliflower (Brassica oleracea var. botrytis) is a vernalization-responsive crop. High ambient temperatures delay harvest time. Elucidating the genetic regulation of floral transition is of great interest for precise harvest scheduling and for ensuring a stable market supply. This study aims at the genetic dissection of temperature-dependent curd induction in cauliflower by genome-wide association studies and gene expression analysis. To assess temperature-dependent curd induction, two greenhouse trials under distinct temperature regimes were conducted on a diversity panel consisting of 111 cauliflower commercial parent lines, genotyped with 14,385 SNPs. Broad phenotypic variation and high heritability (0.93) were observed for temperature-related curd induction within the cauliflower population. GWA mapping identified a total of 18 QTL localized on chromosomes O1, O2, O3, O4, O6, O8, and O9 for curding time under two distinct temperature regimes. Among those, several QTL are localized within regions of promising candidate flowering genes. Inference of population structure and genetic relatedness among the diversity set identified three main genetic clusters. Linkage disequilibrium (LD) patterns indicated a global LD extent of r^2 = 0.06 and a maximum physical distance of 400 kb for genetic linkage. Transcriptional profiling of the flowering genes FLOWERING LOCUS C (BoFLC) and VERNALIZATION 2 (BoVRN2) was performed, showing increased expression levels of BoVRN2 in genotypes with faster curding. However, the functional relevance of BoVRN2 and BoFLC2 could not consistently be supported, which suggests that these genes may act facultatively and/or points to BoVRN2/BoFLC-independent mechanisms in temperature-regulated floral transition in cauliflower. Genetic insights into temperature-regulated curd induction can underpin genetically informed phenology models and benefit molecular breeding strategies toward the development of thermo-tolerant cultivars.

Abstract:

BACKGROUND: Osteoporosis is characterized by low bone mass and compromised bone structure, heritable traits that contribute to fracture risk. There have been no genome-wide association and linkage studies for these traits using high-density genotyping platforms.
METHODS: We used the Affymetrix 100K SNP GeneChip marker set in the Framingham Heart Study (FHS) to examine genetic associations with ten primary quantitative traits: bone mineral density (BMD), calcaneal ultrasound, and geometric indices of the hip. To test associations with multivariable-adjusted residual trait values, we used additive generalized estimating equation (GEE) and family-based association tests (FBAT) models within each sex as well as sexes combined. We evaluated 70,987 autosomal SNPs with genotypic call rates ≥80%, HWE p ≥ 0.001, and MAF ≥10% in up to 1141 phenotyped individuals (495 men and 646 women, mean age 62.5 yrs). Variance component linkage analysis was performed using 11,200 markers.
RESULTS: Heritability estimates for all bone phenotypes were 30-66%. LOD scores ≥3.0 were found on chromosomes 15 (1.5 LOD confidence interval: 51,336,679-58,934,236 bp) and 22 (35,890,398-48,603,847 bp) for femoral shaft section modulus. The ten primary phenotypes had 12 associations with 100K SNPs in GEE models at p < 0.000001 and 2 associations in FBAT models at p < 0.000001. The 25 most significant p-values for GEE and FBAT were all less than 3.5 x 10^-6 and 2.5 x 10^-5, respectively. Of the 40 top SNPs with the greatest numbers of significantly associated BMD traits (including femoral neck, trochanter, and lumbar spine), one half to two-thirds were in or near genes that have not previously been studied for osteoporosis. Notably, pleiotropic associations between BMD and bone geometric traits were uncommon. Evidence for association (FBAT or GEE p < 0.05) was observed for several SNPs in candidate genes for osteoporosis, such as rs1801133 in MTHFR; rs1884052 and rs3778099 in ESR1; rs4988300 in LRP5; rs2189480 in VDR; rs2075555 in COLIA1; rs10519297 and rs2008691 in CYP19, as well as SNPs in PPARG (rs10510418 and rs2938392) and ANKH (rs2454873 and rs379016). All GEE, FBAT and linkage results are provided as an open-access results resource at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?id=phs000007.
CONCLUSION: The FHS 100K SNP project offers an unbiased genome-wide strategy to identify new candidate loci and to replicate previously suggested candidate genes for osteoporosis.
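The SNP inclusion criteria stated in the methods (call rate, HWE p-value and MAF thresholds) can be expressed as a small filter. The sketch below takes genotype counts for one biallelic SNP and applies the three thresholds from the abstract. The abstract does not say which HWE test was used, so a 1-df chi-square goodness-of-fit test is assumed here, and `snp_passes_qc` is a hypothetical name.

```python
import math


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.80, min_hwe_p=0.001, min_maf=0.10):
    """Apply call-rate, HWE and MAF thresholds to one biallelic SNP.

    n_aa, n_ab, n_bb: genotype counts among called samples.
    n_missing: number of samples with no genotype call.
    """
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    call_rate = n_called / (n_called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1 - freq_a)
    if maf == 0:
        return False  # monomorphic SNP; HWE test is undefined
    # Expected genotype counts under Hardy-Weinberg equilibrium.
    expected = [
        n_called * freq_a ** 2,
        2 * n_called * freq_a * (1 - freq_a),
        n_called * (1 - freq_a) ** 2,
    ]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square with 1 df.
    hwe_p = math.erfc(math.sqrt(chi2 / 2))
    return call_rate >= min_call_rate and hwe_p >= min_hwe_p and maf >= min_maf
```

With small samples or rare alleles an exact HWE test is usually preferred over the chi-square approximation; the approximation is used here only to keep the example dependency-free.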

Abstract:

Adiponectin has a variety of metabolic effects on obesity, insulin sensitivity, and atherosclerosis. To identify genes influencing variation in plasma adiponectin levels, we performed genome-wide linkage and association scans of adiponectin in two cohorts of subjects recruited in the Genetic Epidemiology of Metabolic Syndrome Study. The genome-wide linkage scan was conducted in families of Turkish and southern European (TSE, n = 789) and Northern and Western European (NWE, n = 2,280) origin. A whole genome association (WGA) analysis (500K Affymetrix platform) was carried out in a set of unrelated NWE subjects consisting of approximately 1,000 subjects with dyslipidemia and 1,000 overweight subjects with normal lipids. Peak evidence for linkage occurred at chromosome 8p23 in NWE subjects (lod = 3.10) and at chromosome 3q28 near ADIPOQ, the adiponectin structural gene, in TSE subjects (lod = 1.70). In the WGA analysis, the single-nucleotide polymorphisms (SNPs) most strongly associated with adiponectin were rs3774261 and rs6773957 (P < 10^-7). These two SNPs were in high linkage disequilibrium (r^2 = 0.98) and located within ADIPOQ. Interestingly, our fourth strongest region of association (P < 2 x 10^-5) was to an SNP within CDH13, whose protein product is a newly identified receptor for high-molecular-weight species of adiponectin. Through WGA analysis, we confirmed previous studies showing SNPs within ADIPOQ to be strongly associated with variation in adiponectin levels and further observed these to have the strongest effects on adiponectin levels throughout the genome. We additionally identified a second gene (CDH13) possibly influencing variation in adiponectin levels. The impact of these SNPs on health and disease has yet to be determined.

Abstract:

The aim of this study was to describe the clinical and polysomnographic (PSG) characteristics of narcolepsy with cataplexy and their genetic predisposition by using the retrospective patient database of the European Narcolepsy Network (EU-NN). We analysed retrospective data from 1099 patients with narcolepsy diagnosed according to the International Classification of Sleep Disorders-2. Demographic and clinical characteristics, polysomnography and multiple sleep latency test data, hypocretin-1 levels, and genome-wide genotypes were available. In women, we found a significantly lower age at sleepiness onset (men versus women: 23.74 ± 12.43 versus 21.49 ± 11.83, P = 0.003) and a longer diagnostic delay (men versus women: 13.82 ± 13.79 versus 15.62 ± 14.94, P = 0.044). The mean diagnostic delay was 14.63 ± 14.31 years, and longer delay was associated with higher body mass index. The best predictors of short diagnostic delay were young age at diagnosis, cataplexy as the first symptom and higher frequency of cataplexy attacks. The mean multiple sleep latency correlated negatively with the Epworth Sleepiness Scale (ESS) score and with the number of sleep-onset rapid eye movement periods (SOREMPs), but none of the polysomnographic variables was associated with subjective or objective measures of sleepiness. Variant rs2859998 in the UBXN2B gene showed a strong association (P = 1.28E-07) with the age at onset of excessive daytime sleepiness, and rs12425451 near the transcription factor TEAD4 (P = 1.97E-07) with the age at onset of cataplexy. Altogether, our results indicate that the diagnostic delay remains extremely long, that age and gender substantially affect symptoms, and that a genetic predisposition affects the age at onset of symptoms.