2 results for Functional Annotation
in WestminsterResearch - UK
Abstract:
BACKGROUND: Multiple recent genome-wide association studies (GWAS) have identified a single nucleotide polymorphism (SNP), rs10771399, at 12p11 that is associated with breast cancer risk. METHOD: We performed a fine-scale mapping study of a 700 kb region including 441 genotyped and more than 1300 imputed genetic variants in 48,155 cases and 43,612 controls of European descent, 6269 cases and 6624 controls of East Asian descent and 1116 cases and 932 controls of African descent in the Breast Cancer Association Consortium (BCAC; http://bcac.ccge.medschl.cam.ac.uk/), and in 15,252 BRCA1 mutation carriers in the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA). Stepwise regression analyses were performed to identify independent association signals. Data from the Encyclopedia of DNA Elements project (ENCODE) and the Cancer Genome Atlas (TCGA) were used for functional annotation. RESULTS: Analysis of data from European descendants found evidence for four independent association signals at 12p11, represented by rs7297051 (odds ratio (OR) = 1.09, 95% confidence interval (CI) = 1.06-1.12; P = 3 × 10⁻⁹), rs805510 (OR = 1.08, 95% CI = 1.04-1.12; P = 2 × 10⁻⁵), and rs1871152 (OR = 1.04, 95% CI = 1.02-1.06; P = 2 × 10⁻⁴) identified in the general populations, and rs113824616 (P = 7 × 10⁻⁵) identified in the meta-analysis of BCAC ER-negative cases and BRCA1 mutation carriers. SNPs rs7297051, rs805510 and rs113824616 were also associated with breast cancer risk at P < 0.05 in East Asians, but none of the associations were statistically significant in African descendants. Multiple candidate functional variants are located in putative enhancer sequences. Chromatin interaction data suggested that PTHLH was the likely target gene of these enhancers. Of the six variants with the strongest evidence of potential functionality, rs11049453 was statistically significantly associated with the expression of PTHLH and its nearby gene CCDC91 at P < 0.05.
CONCLUSION: This study identified four independent association signals at 12p11 and revealed potentially functional variants, providing additional insights into the underlying biological mechanism(s) for the association observed between variants at 12p11 and breast cancer risk.
Abstract:
We have developed an in-house pipeline for the processing and analysis of sequence data generated during Illumina technology-based metagenomic studies of the human gut microbiota. Each component of the pipeline has been selected following comparative analysis of available tools; however, the modular nature of the software facilitates replacement of any individual component with an alternative should a better tool become available in due course. The pipeline consists of quality analysis and trimming followed by taxonomic filtering of sequence data, allowing reads associated with samples to be binned according to whether they represent human, prokaryotic (bacterial/archaeal), viral, parasite, fungal or plant DNA. Viral, parasite, fungal and plant DNA can be assigned to species level on a presence/absence basis, allowing, for example, identification of dietary intake of plant-based foodstuffs and their derivatives. Prokaryotic DNA is subject to taxonomic and functional analyses, with assignment to taxonomic hierarchies (kingdom, class, order, family, genus, species, strain/subspecies) and abundance determination. After de novo assembly of sequence reads, genes within samples are predicted and used to build a non-redundant catalogue of genes. From this catalogue, per-sample gene abundance can be determined after normalization of data based on gene length. Functional annotation of genes is achieved through mapping of gene clusters against KEGG proteins and through InterProScan. The pipeline is undergoing validation using the human faecal metagenomic data of Qin et al. (2014, Nature 513, 59–64). Outputs from the pipeline allow development of tools for the integration of metagenomic and metabolomic data, moving metagenomic studies beyond determination of gene richness and representation towards microbial-metabolite mapping. There is scope to improve the outputs from viral, parasite, fungal and plant DNA analyses, depending on the depth of sequencing associated with samples.
The pipeline can easily be adapted for the analyses of environmental and non-human animal samples, and for use with data generated via non-Illumina sequencing platforms.
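The gene-length normalization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula the pipeline uses, so the following is only one common approach and is an assumption: each gene's mapped read count is divided by its length, and the results are rescaled to sum to 1 within the sample. The function name, gene names and numbers are all hypothetical.

```python
def length_normalized_abundance(read_counts, gene_lengths):
    """Per-sample relative gene abundance after gene-length normalization.

    read_counts:  dict mapping gene id -> reads mapped to that gene in one sample
    gene_lengths: dict mapping gene id -> gene length in base pairs

    Assumption: abundance = (count / length), rescaled to sum to 1.
    The abstract does not specify the pipeline's actual formula.
    """
    # Reads per base, so that long genes are not over-represented.
    per_base = {}
    for gene, count in read_counts.items():
        length = gene_lengths[gene]
        if length <= 0:
            raise ValueError(f"gene length must be positive: {gene}")
        per_base[gene] = count / length

    total = sum(per_base.values())
    if total == 0:
        # No reads mapped in this sample: every gene has zero abundance.
        return {gene: 0.0 for gene in per_base}

    # Rescale so abundances within the sample sum to 1.
    return {gene: value / total for gene, value in per_base.items()}


# Hypothetical sample: geneA and geneB have equal read counts,
# but geneB is three times longer.
sample_counts = {"geneA": 300, "geneB": 300, "geneC": 0}
catalogue_lengths = {"geneA": 1000, "geneB": 3000, "geneC": 500}
abundance = length_normalized_abundance(sample_counts, catalogue_lengths)
```

In this made-up sample, the two genes with identical read counts end up with different abundances because the longer gene would be expected to collect more reads by chance alone.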