939 results for genome wide complex trait analysis
Abstract:
BACKGROUND: Analysis of DNA sequence polymorphisms can provide valuable information on the evolutionary forces shaping nucleotide variation, and provides insight into the functional significance of genomic regions. Ongoing genome projects will radically improve our ability to detect specific genomic regions shaped by natural selection. Currently available methods and software, however, are unsatisfactory for such genome-wide analysis. RESULTS: We have developed methods for the analysis of DNA sequence polymorphisms at the genome-wide scale. These methods, which have been tested on coalescent-simulated and actual data files from mouse and human, have been implemented in the VariScan software package version 2.0. Additionally, we have incorporated a graphical user interface. The main features of this software are: i) exhaustive population-genetic analyses including those based on the coalescent theory; ii) analysis adapted to the shallow data generated by the high-throughput genome projects; iii) use of genome annotations to conduct comprehensive analyses separately for different functional regions; iv) identification of relevant genomic regions by the sliding-window and wavelet-multiresolution approaches; v) visualization of the results integrated with current genome annotations in commonly available genome browsers. CONCLUSION: VariScan is a powerful and flexible suite of software for the analysis of DNA polymorphisms. The current version implements new algorithms, methods, and capabilities, providing an important tool for an exhaustive exploratory analysis of genome-wide DNA polymorphism data.
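The sliding-window approach mentioned in feature iv) can be illustrated with a minimal Python sketch. This is not VariScan's implementation: the function names, the choice of nucleotide diversity (average pairwise differences per site) as the statistic, and the toy alignment are all assumptions made for illustration.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site for aligned sequences.

    Assumes all sequences have the same length and contain no gaps.
    """
    n = len(seqs)
    length = len(seqs[0]) if seqs else 0
    if n < 2 or length == 0:
        return 0.0
    # Count mismatching sites over every unordered pair of sequences.
    diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    pairs = n * (n - 1) / 2
    return diffs / pairs / length


def sliding_pi(seqs, window, step):
    """Return (window_start, diversity) for each full window along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, nucleotide_diversity(chunk)))
    return results


# Hypothetical toy alignment of three sequences.
toy = ["AAAA", "AAAT", "AATT"]
windows = sliding_pi(toy, window=2, step=2)
```

Windows with unusually low or high diversity relative to the rest of the alignment are the kind of regions such a scan would flag for closer inspection.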
Abstract:
Interactions of cell-autonomous circadian oscillators with diurnal cycles govern the temporal compartmentalization of cell physiology in mammals. To understand the transcriptional and epigenetic basis of diurnal rhythms in mouse liver genome-wide, we generated temporal DNA occupancy profiles by RNA polymerase II (Pol II) as well as profiles of the histone modifications H3K4me3 and H3K36me3. We used these data to quantify the relationships of phases and amplitudes between different marks. We found that rhythmic Pol II recruitment at promoters rather than rhythmic transition from paused to productive elongation underlies diurnal gene transcription, a conclusion further supported by modeling. Moreover, Pol II occupancy preceded mRNA accumulation by 3 hours, consistent with mRNA half-lives. Both methylation marks showed that the epigenetic landscape is highly dynamic and globally remodeled during the 24-hour cycle. While promoters of transcribed genes had tri-methylated H3K4 even at their trough activity times, tri-methylation levels reached their peak, on average, 1 hour after Pol II. Meanwhile, rhythms in tri-methylation of H3K36 lagged transcription by 3 hours. Finally, modeling profiles of Pol II occupancy and mRNA accumulation identified three classes of genes: one showing rhythmicity both in transcriptional and mRNA accumulation, a second class with rhythmic transcription but flat mRNA levels, and a third with constant transcription but rhythmic mRNAs. The latter class emphasizes widespread temporally gated posttranscriptional regulation in the mouse liver.
Abstract:
Genome-wide association studies (GWAS) are conducted with the promise to discover novel genetic variants associated with diverse traits. For most traits, associated markers individually explain just a modest fraction of the phenotypic variation, but their number can well be in the hundreds. We developed a maximum likelihood method that allows us to infer the distribution of associated variants even when many of them were missed by chance. Compared to previous approaches, the novelty of our method is that it (a) does not require having an independent (unbiased) estimate of the effect sizes; (b) makes use of the complete distribution of P-values while allowing for the false discovery rate; (c) takes into account allelic heterogeneity and the SNP pruning strategy. We applied our method to the latest GWAS meta-analysis results of the GIANT consortium. It revealed that while the explained variance of genome-wide (GW) significant SNPs is around 1% for waist-hip ratio (WHR), the observed P-values provide evidence for the existence of variants explaining 10% (CI=[8.5-11.5%]) of the phenotypic variance in total. Similarly, the total explained variance likely to exist for height is estimated to be 29% (CI=[28-30%]), three times higher than what the observed GW significant SNPs give rise to. This methodology also enables us to predict the benefit of future GWA studies that aim to reveal more associated genetic markers via increased sample size.
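The notion of "explained variance" used above can be made concrete with a short sketch. It does not reproduce the maximum likelihood method of the abstract; it only shows the standard per-SNP quantity under stated assumptions (Hardy-Weinberg equilibrium, additive effects, a phenotype standardized to unit variance, and mutually independent SNPs). The function names and example values are hypothetical.

```python
def variance_explained(maf, beta):
    """Fraction of phenotypic variance explained by one biallelic SNP.

    Assumes Hardy-Weinberg equilibrium, an additive per-allele effect `beta`,
    and a phenotype standardized to unit variance.
    """
    if not 0.0 <= maf <= 1.0:
        raise ValueError("maf must lie in [0, 1]")
    return 2.0 * maf * (1.0 - maf) * beta ** 2


def total_variance_explained(snps):
    """Sum per-SNP contributions; valid only if the SNPs are independent."""
    return sum(variance_explained(maf, beta) for maf, beta in snps)


# Hypothetical (allele frequency, effect size) pairs, not taken from any study.
example = [(0.5, 0.1), (0.2, 0.05)]
total = total_variance_explained(example)
```

Because each contribution scales with the square of the effect size, many variants of small effect can add up to a sizeable total while individually falling below the genome-wide significance threshold.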
Abstract:
Major depressive disorder (MDD) is a highly prevalent disorder with substantial heritability. Heritability has been shown to be substantial and higher in the variant of MDD characterized by recurrent episodes of depression. Genetic studies have thus far failed to identify clear and consistent evidence of genetic risk factors for MDD. We conducted a genome-wide association study (GWAS) in two independent datasets. The first GWAS was performed on 1022 recurrent MDD patients and 1000 controls genotyped on the Illumina 550 platform. The second was conducted on 492 recurrent MDD patients and 1052 controls selected from a population-based collection, genotyped on the Affymetrix 5.0 platform. Neither GWAS identified any SNP that achieved GWAS significance. We obtained imputed genotypes at the Illumina loci for the individuals genotyped on the Affymetrix platform, and performed a meta-analysis of the two GWASs for this common set of approximately half a million SNPs. The meta-analysis did not yield genome-wide significant results either. The results from our study suggest that SNPs with substantial odds ratio are unlikely to exist for MDD, at least in our datasets and among the relatively common SNPs genotyped or tagged by the half-million-loci arrays. Meta-analysis of larger datasets is warranted to identify SNPs with smaller effects or with rarer allele frequencies that contribute to the risk of MDD.
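The abstract does not state which meta-analysis model was used. As an illustration only, the sketch below combines per-study effect estimates for a single SNP with a fixed-effect, inverse-variance-weighted model, which is one common choice for GWAS meta-analysis. The function name and the input values are hypothetical.

```python
from math import erfc, sqrt


def inverse_variance_meta(betas, ses):
    """Fixed-effect meta-analysis of one SNP across studies.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   per-study standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = erfc(abs(z) / sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical log odds ratios and standard errors from two studies.
beta, se, z, p = inverse_variance_meta([0.1, 0.3], [0.1, 0.1])
```

Pooling shrinks the standard error relative to either study alone, which is why a meta-analysis can detect effects that neither GWAS reaches individually.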
Abstract:
Gene transfer in eukaryotic cells and organisms suffers from epigenetic effects that result in low or unstable transgene expression and high clonal variability. Use of epigenetic regulators such as matrix attachment regions (MARs) is a promising approach to alleviate such unwanted effects. Dissection of a known MAR allowed the identification of sequence motifs that mediate elevated transgene expression. Bioinformatics analysis implied that these motifs adopt a curved DNA structure that positions nucleosomes and binds specific transcription factors. From these observations, we computed putative MARs from the human genome. Cloning of several predicted MARs indicated that they are much more potent than the previously known element, boosting the expression of recombinant proteins from cultured cells as well as mediating high and sustained expression in mice. Thus we computationally identified potent epigenetic regulators, opening new strategies toward high and stable transgene expression for research, therapeutic production or gene-based therapies.
Abstract:
While it is widely acknowledged that the ubiquitin-proteasome system plays an important role in transcription, little is known concerning the mechanistic basis, in particular the spatial organization of proteasome-dependent proteolysis at the transcription site. Here, we show that proteasomal activity and tetraubiquitinated proteins concentrate to nucleoplasmic microenvironments in the euchromatin. Such proteolytic domains are immobile and distinctly positioned in relation to transcriptional processes. Analysis of gene arrays and early genes in Caenorhabditis elegans embryos reveals that proteasomes and proteasomal activity are distantly located relative to transcriptionally active genes. In contrast, transcriptional inhibition generally induces local overlap of proteolytic microdomains with components of the transcription machinery and degradation of RNA polymerase II. The results establish that spatial organization of proteasomal activity differs with respect to distinct phases of the transcription cycle in at least some genes, and thus might contribute to the plasticity of gene expression in response to environmental stimuli.
Abstract:
Two cost-efficient genome-scale methodologies to assess DNA-methylation are MethylCap-seq and Illumina's Infinium HumanMethylation450 BeadChips (HM450). Objective information regarding the best-suited methodology for a specific research question is scant. Therefore, we performed a large-scale evaluation on a set of 70 brain tissue samples, i.e. 65 glioblastoma and 5 non-tumoral tissues. As MethylCap-seq coverages were limited, we focused on the inherent capacity of the methodology to detect methylated loci rather than a quantitative analysis. MethylCap-seq and HM450 data were dichotomized and performances were compared using a gold standard free Bayesian modelling procedure. While conditional specificity was adequate for both approaches, conditional sensitivity was systematically higher for HM450. In addition, genome-wide characteristics were compared, revealing that HM450 probes identified substantially fewer regions compared to MethylCap-seq. Although results indicated that the latter method can detect more potentially relevant DNA-methylation, this did not translate into the discovery of more differentially methylated loci between tumours and controls compared to HM450. Our results therefore indicate that both methodologies are complementary, with a higher sensitivity for HM450 and a far larger genome-wide coverage for MethylCap-seq, but also that a more comprehensive character does not automatically imply more significant results in biomarker studies.
Abstract:
Background: The human condition known as Premature Ovarian Failure (POF) is characterized by loss of ovarian function before the age of 40. A majority of POF cases are sporadic, but 10–15% are familial, suggesting a genetic origin of the disease. Although several causal mutations have been identified, the etiology of POF is still unknown for about 90% of the patients. Methodology/Principal Findings: We report a genome-wide linkage and homozygosity analysis in one large consanguineous Middle-Eastern POF-affected family presenting an autosomal recessive pattern of inheritance. We identified two regions with a LODmax of 3.26 on chromosome 7p21.1-15.3 and 7q21.3-22.2, which are supported as candidate regions by homozygosity mapping. Sequencing of the coding exons and known regulatory sequences of three candidate genes (DLX5, DLX6 and DSS1) included within the largest region did not reveal any causal mutations. Conclusions/Significance: We detect two novel POF-associated loci on human chromosome 7, opening the way to the identification of new genes involved in the control of ovarian development and function.
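For readers unfamiliar with LOD scores, the following sketch computes a textbook two-point LOD score for phase-known meioses. It is a simplified illustration, not the linkage analysis performed in the study, which is not described in enough detail here to reproduce. The function name and the example counts are hypothetical.

```python
from math import log10


def two_point_lod(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    Compares the likelihood of the data at recombination fraction `theta`
    with the likelihood under no linkage (theta = 0.5).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie in [0, meioses]")
    non_recombinants = meioses - recombinants
    likelihood = theta ** recombinants * (1.0 - theta) ** non_recombinants
    if likelihood == 0.0:
        # theta = 0 is impossible once any recombinant has been observed.
        return float("-inf")
    return log10(likelihood) - meioses * log10(0.5)


# Hypothetical example: 10 informative meioses with no recombinants.
lod = two_point_lod(recombinants=0, meioses=10, theta=0.0)
```

A LOD score of 3 corresponds to odds of 1000:1 in favour of linkage, which is the conventional threshold against which values such as the reported LODmax are judged.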
Abstract:
Although commonplace in human disease genetics, genome-wide association (GWA) studies have only relatively recently been applied to plants. Using 32 phenotypes in the inbreeding crop barley, we report GWA mapping of 15 morphological traits across ∼500 cultivars genotyped with 1,536 SNPs. In contrast to the majority of human GWA studies, we observe high levels of linkage disequilibrium within and between chromosomes. Despite this, GWA analysis readily detected common alleles of high penetrance. To investigate the potential of combining GWA mapping with comparative analysis to resolve traits to candidate polymorphism level in unsequenced genomes, we fine-mapped a selected phenotype (anthocyanin pigmentation) within a 140-kb interval containing three genes. Of these, resequencing the putative anthocyanin pathway gene HvbHLH1 identified a deletion resulting in a premature stop codon upstream of the basic helix-loop-helix domain, which was diagnostic for lack of anthocyanin in our association and biparental mapping populations. The methodology described here is transferable to species with limited genomic resources, providing a paradigm for reducing the threshold of map-based cloning in unsequenced crops.
Abstract:
Before the advent of genome-wide association studies (GWASs), hundreds of candidate genes for obesity-susceptibility had been identified through a variety of approaches. We examined whether those obesity candidate genes are enriched for associations with body mass index (BMI) compared with non-candidate genes by using data from a large-scale GWAS. A thorough literature search identified 547 candidate genes for obesity-susceptibility based on evidence from animal studies, Mendelian syndromes, linkage studies, genetic association studies and expression studies. Genomic regions were defined to include the genes ±10 kb of flanking sequence around candidate and non-candidate genes. We used summary statistics publicly available from the discovery stage of the genome-wide meta-analysis for BMI performed by the Genetic Investigation of Anthropometric Traits (GIANT) consortium in 123,564 individuals. Hypergeometric, rank tail-strength and gene-set enrichment analysis tests were used to test for the enrichment of association in candidate compared with non-candidate genes. The hypergeometric test of enrichment was not significant at the 5% P-value quantile (P = 0.35), but was nominally significant at the 25% quantile (P = 0.015). The rank tail-strength and gene-set enrichment tests were nominally significant for the full set of genes and borderline significant for the subset without SNPs at P < 10^-7. Taken together, the observed evidence for enrichment suggests that the candidate gene approach retains some value. However, the degree of enrichment is small despite the extensive number of candidate genes and the large sample size. Studies that focus on candidate genes have only slightly increased chances of detecting associations, and are likely to miss many true effects in non-candidate genes, at least for obesity-related traits.
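The hypergeometric enrichment test described above can be sketched in a few lines of Python. The framing is an assumption made for illustration: among `total_genes` genes, `candidates` are candidate genes, `selected` genes fall in the chosen P-value quantile, and `hits` of those are candidates. The function name and the example counts are hypothetical and are not the study's data.

```python
from math import comb


def hypergeom_upper_tail(total_genes, candidates, selected, hits):
    """P(X >= hits) when `selected` genes are drawn without replacement
    from `total_genes`, of which `candidates` are candidate genes."""
    denominator = comb(total_genes, selected)
    upper = min(candidates, selected)
    numerator = sum(
        comb(candidates, i) * comb(total_genes - candidates, selected - i)
        for i in range(hits, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts: 1000 genes, 50 candidates, 100 genes in the top
# quantile, 10 of which are candidates (5 would be expected by chance).
p_enrichment = hypergeom_upper_tail(1000, 50, 100, 10)
```

A small upper-tail probability indicates that candidate genes are over-represented in the chosen quantile beyond what random sampling would produce.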
Abstract:
Proneural genes such as Ascl1 are known to promote cell cycle exit and neuronal differentiation when expressed in neural progenitor cells. The mechanisms by which proneural genes activate neurogenesis--and, in particular, the genes that they regulate--however, are mostly unknown. We performed a genome-wide characterization of the transcriptional targets of Ascl1 in the embryonic brain and in neural stem cell cultures by location analysis and expression profiling of embryos overexpressing or mutant for Ascl1. The wide range of molecular and cellular functions represented among these targets suggests that Ascl1 directly controls the specification of neural progenitors as well as the later steps of neuronal differentiation and neurite outgrowth. Surprisingly, Ascl1 also regulates the expression of a large number of genes involved in cell cycle progression, including canonical cell cycle regulators and oncogenic transcription factors. Mutational analysis in the embryonic brain and manipulation of Ascl1 activity in neural stem cell cultures revealed that Ascl1 is indeed required for normal proliferation of neural progenitors. This study identified a novel and unexpected activity of the proneural gene Ascl1, and revealed a direct molecular link between the phase of expansion of neural progenitors and the subsequent phases of cell cycle exit and neuronal differentiation.
Abstract:
Mathematical ability is heritable, but few studies have directly investigated its molecular genetic basis. Here we aimed to identify specific genetic contributions to variation in mathematical ability. We carried out a genome-wide association scan using pooled DNA in two groups of U.K. samples, based on end of secondary/high school national academic exam achievement: high (n = 419) versus low (n = 183) mathematical ability while controlling for their verbal ability. Significant differences in allele frequencies between these groups were searched for in 906,600 SNPs using the Affymetrix GeneChip Human Mapping version 6.0 array. After meeting a threshold of p < 1.5×10^-5, 12 SNPs from the pooled association analysis were individually genotyped in 542 of the participants and analyzed to validate the initial associations (lowest p-value 1.14×10^-6). In this analysis, one of the SNPs (rs789859) showed significant association after Bonferroni correction, and four (rs10873824, rs4144887, rs12130910, rs2809115) were nominally significant (lowest p-value 3.278×10^-4). Three of the SNPs of interest are located within, or near to, known genes (FAM43A, SFT2D1, C14orf64). The SNP that showed the strongest association, rs789859, is located in a region on chromosome 3q29 that has been previously linked to learning difficulties and autism. rs789859 lies 1.3 kbp downstream of LSG1, and 700 bp upstream of FAM43A, mapping within the potential promoter/regulatory region of the latter. To our knowledge, this is only the second study to investigate the association of genetic variants with mathematical ability, and it highlights a number of interesting markers for future study.
Abstract:
5-Hydroxymethylcytosine (5hmC), a modified form of cytosine that is considered the sixth nucleobase in DNA, has been detected in mammals and is believed to play an important role in gene regulation. In this study, 5hmC modification was detected in rice by employing a dot-blot assay, and its levels were further quantified in DNA from different rice tissues using liquid chromatography-multistage mass spectrometry (LC-MS/MS/MS). The results showed large intertissue variation in 5hmC levels. The genome-wide profiles of 5hmC modification in three different rice cultivars were also obtained using sensitive chemical labelling followed by next-generation sequencing. Thousands of 5hmC peaks were identified, and a comparison of the distributions of 5hmC among different rice cultivars revealed the specificity and conservation of 5hmC modification. The identified 5hmC peaks were significantly enriched in heterochromatin regions, and mainly located in transposable element (TE) genes, especially around retrotransposons. The correlation analysis of 5hmC and gene expression data revealed a close association between 5hmC and silent TEs. These findings provide a resource for plant DNA 5hmC epigenetic studies and expand our knowledge of 5hmC modification.
Abstract:
Background Anxiety disorders are common, and cognitive–behavioural therapy (CBT) is a first-line treatment. Candidate gene studies have suggested a genetic basis to treatment response, but findings have been inconsistent. Aims To perform the first genome-wide association study (GWAS) of psychological treatment response in children with anxiety disorders (n = 980). Method Presence and severity of anxiety was assessed using semi-structured interview at baseline, on completion of treatment (post-treatment), and 3 to 12 months after treatment completion (follow-up). DNA was genotyped using the Illumina Human Core Exome-12v1.0 array. Linear mixed models were used to test associations between genetic variants and response (change in symptom severity) immediately post-treatment and at 6-month follow-up. Results No variants passed a genome-wide significance threshold (P = 5×10−8) in either analysis. Four variants met criteria for suggestive significance (P<5×10−6) in association with response post-treatment, and three variants in the 6-month follow-up analysis. Conclusions This is the first genome-wide therapygenetic study. It suggests no common variants of very high effect underlie response to CBT. Future investigations should maximise power to detect single-variant and polygenic effects by using larger, more homogeneous cohorts.
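The two thresholds quoted in the Results can be applied mechanically, as in the sketch below. The threshold values come from the abstract. The function names, the use of a strict less-than comparison for both thresholds, and the example p-values are assumptions made for illustration.

```python
GENOME_WIDE = 5e-8  # genome-wide significance threshold quoted in the abstract
SUGGESTIVE = 5e-6   # suggestive significance threshold quoted in the abstract


def classify(p_value):
    """Label a single association p-value against the two thresholds."""
    if p_value < GENOME_WIDE:
        return "genome-wide"
    if p_value < SUGGESTIVE:
        return "suggestive"
    return "not significant"


def summarize(p_values):
    """Count how many p-values fall into each category."""
    counts = {"genome-wide": 0, "suggestive": 0, "not significant": 0}
    for p in p_values:
        counts[classify(p)] += 1
    return counts


# Hypothetical p-values, not taken from the study.
summary = summarize([3e-9, 1e-6, 2e-6, 0.04])
```

The 5×10^-8 value is the conventional genome-wide threshold, roughly a Bonferroni correction of 0.05 for about one million independent common variants.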
Abstract:
Horses were domesticated from the Eurasian steppes 5,000-6,000 years ago. Since then, the use of horses for transportation, warfare, and agriculture, as well as selection for desired traits and fitness, has resulted in diverse populations distributed across the world, many of which have become or are in the process of becoming formally organized into closed, breeding populations (breeds). This report describes the use of a genome-wide set of autosomal SNPs and 814 horses from 36 breeds to provide the first detailed description of equine breed diversity. FST calculations, parsimony, and distance analysis demonstrated relationships among the breeds that largely reflect geographic origins and known breed histories. Low levels of population divergence were observed between breeds that are relatively early on in the process of breed development, and between those with high levels of within-breed diversity, whether due to large population size, ongoing outcrossing, or large within-breed phenotypic diversity. Populations with low within-breed diversity included those which have experienced population bottlenecks, have been under intense selective pressure, or are closed populations with long breed histories. These results provide new insights into the relationships among and the diversity within breeds of horses. In addition these results will facilitate future genome-wide association studies and investigations into genomic targets of selection. © 2013 Petersen et al.
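The abstract does not say which FST estimator was used. As a rough illustration of what FST measures, the sketch below computes Wright's FST for a single biallelic SNP from per-breed allele frequencies, assuming equally sized populations and ignoring sampling error. The function name and the example frequencies are hypothetical.

```python
def fst_biallelic(freqs):
    """Wright's FST = (HT - HS) / HT for one biallelic SNP.

    freqs: allele frequency of the same allele in each population.
    Assumes equal population sizes and no sampling-error correction.
    """
    if not freqs:
        raise ValueError("at least one population is required")
    mean_freq = sum(freqs) / len(freqs)
    # Expected heterozygosity in the pooled (total) population.
    h_total = 2.0 * mean_freq * (1.0 - mean_freq)
    if h_total == 0.0:
        return 0.0  # monomorphic everywhere: no differentiation measurable
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in two breeds.
fst = fst_biallelic([0.2, 0.8])
```

Values near 0 indicate that breeds share similar allele frequencies, while values near 1 indicate that they are fixed for different alleles, matching the pattern of low divergence among recently formed or highly diverse breeds described above.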