961 results for Genome-wide linkage
Abstract:
The human body is composed of a huge number of cells acting together in a concerted manner. The current understanding is that proteins perform most of the necessary activities in keeping a cell alive. The DNA, on the other hand, stores the information on how to produce the different proteins in the genome. Regulating gene transcription is the first important step that can thus affect the life of a cell, modify its functions and its responses to the environment. Regulation is a complex operation that involves specialized proteins, the transcription factors. Transcription factors (TFs) can bind to DNA and activate the processes leading to the expression of genes into new proteins. Errors in this process may lead to diseases. In particular, some transcription factors have been associated with a lethal pathological state, commonly known as cancer, associated with uncontrolled cellular proliferation, invasiveness of healthy tissues and abnormal responses to stimuli. Understanding cancer-related regulatory programs is a difficult task, often involving several TFs interacting together and influencing each other's activity. This thesis presents new computational methodologies to study gene regulation. In addition, we present applications of our methods to the understanding of cancer-related regulatory programs. The understanding of transcriptional regulation is a major challenge. We address this difficult question by combining computational approaches with large collections of heterogeneous experimental data. In detail, we design signal processing tools to recover transcription factor binding sites on the DNA from genome-wide surveys like chromatin immunoprecipitation assays on tiling arrays (ChIP-chip). We then use the localization of TF binding to explain expression levels of regulated genes. In this way we identify a regulatory synergy between two TFs, the oncogene C-MYC and SP1.
C-MYC and SP1 bind preferentially at promoters, and when SP1 binds next to C-MYC on the DNA, the nearby gene is strongly expressed. The association between the two TFs at promoters is reflected by the conservation of binding sites across mammals and by the permissive underlying chromatin states; it represents an important control mechanism involved in cellular proliferation, and thereby in cancer. Secondly, we identify the characteristics of TF estrogen receptor alpha (hERa) target genes and we study the influence of hERa in regulating transcription. hERa, upon hormone estrogen signaling, binds to DNA to regulate transcription of its targets in concert with its co-factors. To overcome the scarce experimental data about the binding sites of other TFs that may interact with hERa, we conduct in silico analysis of the sequences underlying the ChIP sites using the collection of position weight matrices (PWMs) of hERa partners, TFs FOXA1 and SP1. We combine ChIP-chip and ChIP-paired-end-diTags (ChIP-pet) data about hERa binding on DNA with the sequence information to explain gene expression levels in a large collection of cancer tissue samples and also in studies of the response of cells to estrogen. We confirm that hERa binding sites are distributed anywhere on the genome. However, we distinguish between binding sites near promoters and binding sites along the transcripts. The first group shows weak binding of hERa and high occurrence of SP1 motifs, in particular near estrogen responsive genes. The second group shows strong binding of hERa and significant correlation between the number of binding sites along a gene and the strength of gene induction in presence of estrogen. Some binding sites of the second group also show presence of FOXA1, but the role of this TF still needs to be investigated. Different mechanisms have been proposed to explain hERa-mediated induction of gene expression.
Our work supports the model of hERa activating gene expression from distal binding sites by interacting with promoter-bound TFs, like SP1. hERa has been associated with survival rates of breast cancer patients, though explanatory models are still incomplete: this result is important to better understand how hERa can control gene expression. Thirdly, we address the difficult question of regulatory network inference. We tackle this problem by analyzing time-series of biological measurements such as quantification of mRNA levels or protein concentrations. Our approach uses the well-established penalized linear regression models where we impose sparseness on the connectivity of the regulatory network. We extend this method by enforcing the coherence of the regulatory dependencies: a TF must coherently behave as an activator, or as a repressor, on all its targets. This requirement is implemented as constraints on the signs of the regressed coefficients in the penalized linear regression model. Our approach is better at reconstructing meaningful biological networks than previous methods based on penalized regression. The method is tested on the DREAM2 challenge of reconstructing a five-gene/TF regulatory network, obtaining the best performance in the "undirected signed excitatory" category. Thus, these bioinformatics methods, which are reliable, interpretable and fast enough to cover large biological datasets, have enabled us to better understand gene regulation in humans.
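The sign-constrained penalized regression described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes an L1 (lasso) penalty solved by coordinate descent, and it assumes the activator/repressor sign of each regulator is supplied by the caller, whereas the thesis enforces coherence of signs across targets. The function name `sign_constrained_lasso` and the parameters `signs` and `lam` are hypothetical.

```python
def sign_constrained_lasso(X, y, signs, lam=0.01, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + lam*sum|w_j|,
    subject to w_j >= 0 where signs[j] > 0 (assumed activator),
    w_j <= 0 where signs[j] < 0 (assumed repressor),
    and no sign constraint where signs[j] == 0.
    X is a list of rows (samples), y a list of target expression values."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, starting from w = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # covariance of regulator j with the partial residual
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            pos = max(rho - lam, 0.0)  # positive part of the soft threshold
            neg = min(rho + lam, 0.0)  # negative part of the soft threshold
            if signs[j] > 0:
                new = pos / col_sq[j]
            elif signs[j] < 0:
                new = neg / col_sq[j]
            else:
                new = (pos + neg) / col_sq[j]
            delta = new - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new
    return w


# Toy data (made up): the target is activated by regulator 0, repressed by regulator 1.
X = [[1, 0], [0, 1], [1, 1], [2, 1], [1, 2], [0, 0]]
y = [2 * a - b for a, b in X]
print(sign_constrained_lasso(X, y, signs=[+1, -1]))
print(sign_constrained_lasso(X, y, signs=[+1, +1]))
```

With the correct signs the fit recovers one positive and one negative coefficient; when regulator 1 is wrongly forced to be an activator, its coefficient is clamped to zero, so the edge is dropped instead of being given an incoherent sign.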
Abstract:
We conducted a genome-wide association study for androgenic alopecia in 1,125 men and identified a newly associated locus at chromosome 20p11.22, confirmed in three independent cohorts (n = 1,650; OR = 1.60, P = 1.1 × 10^-14 for rs1160312). The one man in seven who harbors risk alleles at both 20p11.22 and AR (encoding the androgen receptor) has a sevenfold-increased odds of androgenic alopecia (OR = 7.12, P = 3.7 × 10^-15).
Abstract:
The genomic loci occupied by RNA polymerase (RNAP) III have been characterized in human culture cells by genome-wide chromatin immunoprecipitations, followed by deep sequencing (ChIP-seq). These studies have shown that only ∼40% of the annotated 622 human tRNA genes and pseudogenes are occupied by RNAP-III, and that these genes are often in open chromatin regions rich in active RNAP-II transcription units. We have used ChIP-seq to characterize RNAP-III-occupied loci in a differentiated tissue, the mouse liver. Our studies define the mouse liver RNAP-III-occupied loci including a conserved mammalian interspersed repeat (MIR) as a potential regulator of an RNAP-III subunit-encoding gene. They reveal that synteny relationships can be established between a number of human and mouse RNAP-III genes, and that the expression levels of these genes are significantly linked. They establish that variations within the A and B promoter boxes, as well as the strength of the terminator sequence, can strongly affect RNAP-III occupancy of tRNA genes. They reveal correlations with various genomic features that explain the observed variation of 81% of tRNA scores. In mouse liver, loci represented in the NCBI37/mm9 genome assembly that are clearly occupied by RNAP-III comprise 50 Rn5s (5S RNA) genes, 14 known non-tRNA RNAP-III genes, nine Rn4.5s (4.5S RNA) genes, and 29 SINEs. Moreover, out of the 433 annotated tRNA genes, half are occupied by RNAP-III. Transfer RNA gene expression levels reflect both an underlying genomic organization conserved in dividing human culture cells and resting mouse liver cells, and the particular promoter and terminator strengths of individual genes.
Abstract:
Huntington's disease (HD) pathology is well understood at a histological level but a comprehensive molecular analysis of the effect of the disease in the human brain has not previously been available. To elucidate the molecular phenotype of HD on a genome-wide scale, we compared mRNA profiles from 44 human HD brains with those from 36 unaffected controls using microarray analysis. Four brain regions were analyzed: caudate nucleus, cerebellum, prefrontal association cortex [Brodmann's area 9 (BA9)] and motor cortex [Brodmann's area 4 (BA4)]. The greatest number and magnitude of differentially expressed mRNAs were detected in the caudate nucleus, followed by motor cortex, then cerebellum. Thus, the molecular phenotype of HD generally parallels established neuropathology. Surprisingly, no mRNA changes were detected in prefrontal association cortex, thereby revealing subtleties of pathology not previously disclosed by histological methods. To establish that the observed changes were not simply the result of cell loss, we examined mRNA levels in laser-capture microdissected neurons from Grade 1 HD caudate compared to control. These analyses confirmed changes in expression seen in tissue homogenates; we thus conclude that mRNA changes are not attributable to cell loss alone. These data from bona fide HD brains comprise an important reference for hypotheses related to HD and other neurodegenerative diseases.
Abstract:
The phenotypic effect of some single nucleotide polymorphisms (SNPs) depends on their parental origin. We present a novel approach to detect parent-of-origin effects (POEs) in genome-wide genotype data of unrelated individuals. The method exploits increased phenotypic variance in the heterozygous genotype group relative to the homozygous groups. We applied the method to >56,000 unrelated individuals to search for POEs influencing body mass index (BMI). Six lead SNPs were carried forward for replication in five family-based studies (of ∼4,000 trios). Two SNPs replicated: the paternal rs2471083-C allele (located near the imprinted KCNK9 gene) and the paternal rs3091869-T allele (located near the SLC2A10 gene) increased BMI equally (beta = 0.11 (SD), P<0.0027) compared to the respective maternal alleles. Real-time PCR experiments of lymphoblastoid cell lines from the CEPH families showed that expression of both genes was dependent on parental origin of the SNP alleles (P<0.01). Our scheme opens new opportunities to exploit GWAS data of unrelated individuals to identify POEs and demonstrates that they play an important role in adult obesity.
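The idea behind the variance-based scan can be sketched as follows. If the paternal and maternal copies of an allele have different effects, heterozygotes are a mixture of two subgroups with different means, which inflates their phenotypic variance relative to homozygotes. The code below is only an illustration under stated assumptions, not the authors' statistic: it uses absolute deviations from the genotype-group median (in the spirit of the Brown-Forsythe test) and a permutation p-value. The function name `poe_variance_scan` and all data are hypothetical.

```python
import random
from statistics import mean, median


def poe_variance_scan(genotypes, phenotypes, n_perm=1000, seed=0):
    """Test one SNP for excess phenotypic spread in heterozygotes.
    genotypes: allele counts coded 0, 1, 2; phenotypes: trait values.
    Returns (observed difference in mean absolute deviation, permutation p-value)."""
    groups = {0: [], 1: [], 2: []}
    for g, y in zip(genotypes, phenotypes):
        groups[g].append(y)
    if not groups[1] or not (groups[0] or groups[2]):
        raise ValueError("need both heterozygous and homozygous individuals")
    # Deviations from the own-group median remove the additive mean effect.
    dev, is_het = [], []
    for g, ys in groups.items():
        if not ys:
            continue
        m = median(ys)
        for y in ys:
            dev.append(abs(y - m))
            is_het.append(g == 1)

    def stat(labels):
        het = [d for d, h in zip(dev, labels) if h]
        hom = [d for d, h in zip(dev, labels) if not h]
        return mean(het) - mean(hom)

    observed = stat(is_het)
    rng = random.Random(seed)
    labels = list(is_het)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break the link between spread and zygosity
        if stat(labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Made-up example: heterozygotes split into two subgroups around -1 and +1.
noise = [0.1 * ((i % 5) - 2) for i in range(20)]
genotypes = [0] * 20 + [1] * 20 + [2] * 10
phenotypes = (noise
              + [(1.0 if i % 2 else -1.0) + noise[i] for i in range(20)]
              + noise[:10])
print(poe_variance_scan(genotypes, phenotypes))
```

A genome-wide scan would repeat such a test per SNP and correct for multiple testing; candidate SNPs would still need family data to establish which parental allele carries the effect, as in the replication step described above.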
Abstract:
BACKGROUND: Genotypes obtained with commercial SNP arrays have been extensively used in many large case-control or population-based cohorts for SNP-based genome-wide association studies for a multitude of traits. Yet, these genotypes capture only a small fraction of the variance of the studied traits. Genomic structural variants (GSV) such as Copy Number Variation (CNV) may account for part of the missing heritability, but their comprehensive detection requires either next-generation arrays or sequencing. Sophisticated algorithms that infer CNVs by combining the intensities from SNP-probes for the two alleles can already be used to extract a partial view of such GSV from existing data sets. RESULTS: Here we present several advances to facilitate the latter approach. First, we introduce a novel CNV detection method based on a Gaussian Mixture Model. Second, we propose a new algorithm, PCA merge, for combining copy-number profiles from many individuals into consensus regions. We applied both our new methods as well as existing ones to data from 5612 individuals from the CoLaus study who were genotyped on Affymetrix 500K arrays. We developed a number of procedures in order to evaluate the performance of the different methods. These include comparison with previously published CNVs as well as the use of a replication sample of 239 individuals, genotyped with Illumina 550K arrays. We also established a new evaluation procedure that employs the fact that related individuals are expected to share their CNVs more frequently than randomly selected individuals. The ability to detect both rare and common CNVs provides a valuable resource that will facilitate association studies exploring potential phenotypic associations with CNVs.
CONCLUSION: Our new methodologies for CNV detection and their evaluation will help in extracting additional information from the large amount of SNP-genotyping data on various cohorts and use this to explore structural variants and their impact on complex traits.
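To make the Gaussian Mixture Model idea concrete, here is a minimal one-dimensional sketch. It is not the method from the paper: it assumes each probe (or region) is summarized by a single log2 intensity ratio, that three copy-number states (loss, normal, gain) are fitted with expectation-maximization, and that rough starting means are known. The function name `fit_gmm_1d`, the starting values and the data are all hypothetical.

```python
import math


def fit_gmm_1d(x, init_means, init_var=0.05, n_iter=100):
    """EM for a 1-D Gaussian mixture with one component per copy-number state.
    Returns (means, variances, weights, labels), where labels[i] is the index
    of the most probable component for x[i]."""
    k = len(init_means)
    mu = list(init_means)
    var = [init_var] * k
    pi = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value
        resp = []
        for xi in x:
            dens = [pi[c] * math.exp(-(xi - mu[c]) ** 2 / (2 * var[c]))
                    / math.sqrt(2 * math.pi * var[c]) for c in range(k)]
            tot = sum(dens)
            resp.append([d / tot for d in dens] if tot > 0 else [1.0 / k] * k)
        # M-step: re-estimate weight, mean and variance of each component
        for c in range(k):
            nc = sum(r[c] for r in resp)
            if nc < 1e-12:
                continue  # empty component: keep its previous parameters
            mu[c] = sum(r[c] * xi for r, xi in zip(resp, x)) / nc
            var[c] = max(sum(r[c] * (xi - mu[c]) ** 2
                             for r, xi in zip(resp, x)) / nc, 1e-6)
            pi[c] = nc / len(x)
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return mu, var, pi, labels


# Made-up log2 ratios: five losses, ten normal values, five gains.
ratios = ([-1.05, -0.95, -1.0, -0.98, -1.02]
          + [-0.04, 0.03, 0.0, 0.02, -0.01, 0.05, -0.03, 0.01, 0.0, -0.02]
          + [0.55, 0.60, 0.58, 0.62, 0.56])
means, variances, weights, labels = fit_gmm_1d(ratios, init_means=[-1.0, 0.0, 0.58])
print(means, labels)
```

The starting means correspond to the ideal log2 ratios for one, two and three copies (log2(1/2), log2(2/2), log2(3/2)); real array data are noisier and the clusters shift per probe, which is one reason a fitted mixture is preferable to fixed thresholds.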
Abstract:
SNAP(c) is one of a few basal transcription factors used by both RNA polymerase (pol) II and pol III. To define the set of active SNAP(c)-dependent promoters in human cells, we have localized genome-wide four SNAP(c) subunits, GTF2B (TFIIB), BRF2, pol II, and pol III. Among some seventy loci occupied by SNAP(c) and other factors, including pol II snRNA genes, pol III genes with type 3 promoters, and a few un-annotated loci, most are primarily occupied by either pol II and GTF2B, or pol III and BRF2. A notable exception is the RPPH1 gene, which is occupied by significant amounts of both polymerases. We show that the large majority of SNAP(c)-dependent promoters recruit POU2F1 and/or ZNF143 on their enhancer region, and a subset also recruits GABP, a factor newly implicated in SNAP(c)-dependent transcription. These activators associate with pol II and III promoters in G1 slightly before the polymerase, and ZNF143 is required for efficient transcription initiation complex assembly. The results characterize a set of genes with unique properties and establish that polymerase specificity is not absolute in vivo.
Abstract:
Ten years ago, the first cellular receptor for the prototypic arenavirus lymphocytic choriomeningitis virus (LCMV) and the highly pathogenic Lassa virus (LASV) was identified as alpha-dystroglycan (alpha-DG), a versatile receptor for proteins of the extracellular matrix (ECM). Biochemical analysis of the interaction of alpha-DG with arenaviruses and ECM proteins revealed a strikingly similar mechanism of receptor recognition that critically depends on specific sugar modification on alpha-DG involving a novel class of putative glycosyltransferase, the LARGE proteins. Interestingly, recent genome-wide detection and characterization of positive selection in human populations revealed evidence for positive selection of a locus within the LARGE gene in populations from Western Africa, where LASV is endemic. While most enveloped viruses that enter the host cell in a pH-dependent manner use clathrin-mediated endocytosis, recent studies revealed that the Old World arenaviruses LCMV and LASV enter the host cell predominantly via a novel and unusual endocytotic pathway independent of clathrin, caveolin, dynamin, and actin. Upon internalization, the virus is rapidly delivered to endosomes via an unusual route of vesicular trafficking that is largely independent of the small GTPases Rab5 and Rab7. Since infection of cells with LCMV and LASV depends on DG, this unusual endocytotic pathway could be related to normal cellular trafficking of the DG complex. Alternatively, engagement of arenavirus particles may target DG for an endocytotic pathway not normally used in uninfected cells thereby inducing an entry route specifically tailored to the pathogen's needs.
Abstract:
Recently, a locus centred on rs9273349 in the HLA-DQ region emerged from genome-wide association studies of adult-onset asthma. We aimed to further investigate the role of human leukocyte antigen (HLA) class II in adult-onset asthma and a possible interaction with occupational exposures. We imputed classical HLA-II alleles from 7579 single-nucleotide polymorphisms in 6025 subjects (1202 with adult-onset asthma) from European cohorts: ECRHS, SAPALDIA, EGEA and B58C, and from surveys of bakers and agricultural workers. Based on an asthma-specific job-exposure matrix, 2629 subjects had ever been exposed to high molecular weight (HMW) allergens. We explored associations between 23 common HLA-II alleles and adult-onset asthma, and tested for gene-environment interaction with occupational exposure to HMW allergens. Interaction was also tested for rs9273349. Marginal associations of classical HLA-II alleles and adult-onset asthma were not statistically significant. Interaction was detected between the DPB1*03:01 allele and exposure to HMW allergens (p = 0.009), in particular to latex (p = 0.01). In the unexposed group, the DPB1*03:01 allele was associated with adult-onset asthma (OR 0.67, 95%CI 0.53-0.86). HMW allergen exposures did not modify the association of rs9273349 with adult-onset asthma. Common classical HLA-II alleles were not marginally associated with adult-onset asthma. The association of latex exposure and adult-onset asthma may be modified by DPB1*03:01.
Abstract:
Cervical artery dissection (CeAD), a mural hematoma in a carotid or vertebral artery, is a major cause of ischemic stroke in young adults although relatively uncommon in the general population (incidence of 2.6/100,000 per year). Minor cervical traumas, infection, migraine and hypertension are putative risk factors, and inverse associations with obesity and hypercholesterolemia are described. No confirmed genetic susceptibility factors have been identified using candidate gene approaches. We performed genome-wide association studies (GWAS) in 1,393 CeAD cases and 14,416 controls. The rs9349379[G] allele (PHACTR1) was associated with lower CeAD risk (odds ratio (OR) = 0.75, 95% confidence interval (CI) = 0.69-0.82; P = 4.46 × 10^-10), with confirmation in independent follow-up samples (659 CeAD cases and 2,648 controls; P = 3.91 × 10^-3; combined P = 1.00 × 10^-11). The rs9349379[G] allele was previously shown to be associated with lower risk of migraine and increased risk of myocardial infarction. Deciphering the mechanisms underlying this pleiotropy might provide important information on the biological underpinnings of these disabling conditions.
Abstract:
Peripheral T-cell lymphomas (PTCLs) represent a heterogeneous group of more than 20 neoplastic entities derived from mature T cells and natural killer (NK) cells involved in innate and adaptive immunity. With few exceptions these malignancies, which may present as disseminated, predominantly extranodal or cutaneous, or predominantly nodal diseases, are clinically aggressive and have a dismal prognosis. Their diagnosis and classification are hampered by several difficulties, including a significant morphological and immunophenotypic overlap across different entities, and the lack of characteristic genetic alterations for most of them. Although there is increasing evidence that the cell of origin is a major determinant for the delineation of several PTCL entities, the cellular derivation of most entities remains poorly characterized and/or may be heterogeneous. The complexity of the biology and pathophysiology of PTCLs has been only partly deciphered. In recent years, novel insights have been gained from genome-wide profiling analyses. In this review, we will summarize the current knowledge on the pathobiological features of peripheral NK/T-cell neoplasms, with a focus on selected disease entities manifesting as tissue infiltrates primarily in extranodal sites and lymph nodes.
Abstract:
A genome-wide screen for large structural variants showed that a copy number variant (CNV) in the region encoding killer cell immunoglobulin-like receptors (KIR) associates with HIV-1 control as measured by plasma viral load at set point in individuals of European ancestry. This CNV encompasses the KIR3DL1-KIR3DS1 locus, encoding receptors that interact with specific HLA-Bw4 molecules to regulate the activation of lymphocyte subsets including natural killer (NK) cells. We quantified the number of copies of KIR3DS1 and KIR3DL1 in a large HIV-1 positive cohort, and showed that an increase in KIR3DS1 count associates with a lower viral set point if its putative ligand is present (p = 0.00028), as does an increase in KIR3DL1 count in the presence of KIR3DS1 and appropriate ligands for both receptors (p = 0.0015). We further provide functional data that demonstrate that NK cells from individuals with multiple copies of KIR3DL1, in the presence of KIR3DS1 and the appropriate ligands, inhibit HIV-1 replication more robustly, and that this is associated with a significant expansion in the frequency of KIR3DS1+, but not KIR3DL1+, NK cells in their peripheral blood. Our results suggest that the relative amounts of these activating and inhibitory KIR play a role in regulating the peripheral expansion of highly antiviral KIR3DS1+ NK cells, which may determine differences in HIV-1 control following infection.
Abstract:
Plants are sessile organisms, which have evolved an astonishing ability to sense changes in their environment. Depending on the surrounding conditions, such as changes in light and temperature, plants modulate the activity of important transcriptional regulators. The shade avoidance syndrome (SAS) is one important mechanism for shade-intolerant plants to adapt their growth in high vegetative density. In shaded conditions plants sense a diminished red/far-red ratio via the phytochrome system and respond with morphological changes such as elongation growth of stems and petioles. The Phytochrome Interacting Factors 4 and 5 (PIF4 and PIF5) are positive regulators of the SAS and required for a full response (Lorrain et al, 2008). They regulate the SAS by inducing the expression of shade avoidance marker genes such as PIL1, ATHB2, XTR7 and HFR1 (Hornitschek et al, 2009; Lorrain et al, 2008). I investigated the molecular mechanism underlying the regulation of the SAS by HFR1 (long Hypocotyl in FR light). Although HFR1 is a PIF-related bHLH transcription factor, we discovered that HFR1 is a non-DNA binding protein. Moreover, we revealed that HFR1 inhibits an exaggerated SAS by forming non-DNA binding heterodimers with PIF4 and PIF5 (Hornitschek et al, 2009). This negative feedback loop is an important mechanism to limit elongation growth also in elevated temperatures. HFR1 accumulation and activity are highly temperature-dependent and the increased activity of HFR1 at warmer temperatures also provides an important restraint on PIF4-driven elongation growth (Foreman et al, 2011). Finally, we performed a genome-wide analysis to determine how PIF4 and PIF5 regulate growth in response to shade. We identified potential PIF5 target genes, which represent many well-known shade-responsive genes. Our analysis of gene expression also revealed a role of PIF4 and PIF5 in simulated sun, possibly via the regulation of auxin sensitivity.
Summary: Plants are sessile organisms that have developed a surprising ability to detect changes in their environment. Depending on external conditions, such as variations in light or temperature, they adapt the activity of important transcriptional regulators. The shade avoidance syndrome (SAS) is an important mechanism that allows shade-intolerant plants to adapt their growth when they develop in very dense vegetation. Under these conditions, plants detect a reduction in the relative amount of red light compared with far-red light (R/FR ratio). This change, perceived via the phytochrome system, induces morphological modifications such as elongation of stems and petioles. The PIF4 and PIF5 proteins (Phytochrome Interacting Factors) are positive regulators of the SAS and are required for a full response (Lorrain et al, 2008). These transcription factors regulate the SAS by inducing the expression of marker genes of this response such as PIL1, ATHB2, XTR7 and HFR1 (Hornitschek et al, 2009; Lorrain et al, 2008). I studied the molecular mechanisms underlying the regulation of the SAS by HFR1 (long Hypocotyl in FR light). HFR1 is a bHLH-type transcription factor of the PIF family, although we discovered that HFR1 is a protein that does not bind DNA. We showed that HFR1 inhibits an exaggerated SAS by forming heterodimers with PIF4 and PIF5 (Hornitschek et al, 2009). We also showed that this negative feedback loop is an important mechanism for limiting elongation growth at high temperatures. In addition, the accumulation and activity of HFR1 increase with temperature, which allows stronger inhibition of the activating effect of PIF4 on growth. Finally, we performed a genome-wide analysis to determine how PIF4 and PIF5 regulate growth in response to shade. We identified potential target genes of PIF5, which correspond in part to genes known to be involved in the shade avoidance response. Our analysis of gene expression also revealed an important role of PIF4 and PIF5 under full-sun growth conditions, probably via the regulation of auxin sensitivity.
Abstract:
Peripheral T-cell lymphomas (PTCLs) are heterogeneous and uncommon malignancies characterized by an aggressive clinical course and a mostly poor outcome with current treatment strategies. The recent genome-wide molecular characterization of several entities has provided novel insights into their pathobiology and led to the identification of new biomarkers with diagnostic, prognostic or therapeutic implications for PTCL patients. Cell lineage and differentiation antigens (markers of γδ or NK lineage, of cytotoxicity, of follicular helper T cells) reflect the tumour's biological behaviour, and their detection in tissue samples may refine the diagnostic and prognostic stratification of the patients. Previously unrecognized gene rearrangements are being discovered (ITK-SYK translocation, IRF4/MUM1 and DUSP22 rearrangements), and may serve as diagnostic genetic markers. Deregulated molecules within oncogenic pathways (NF-κB, Syk, PDGFRα) and immunoreactive cell-surface antigens (CD30, CD52) have been brought to the fore as potential targets for guiding the development of novel therapies.
Abstract:
Through genome-wide association meta-analyses of up to 133,010 individuals of European ancestry without diabetes, including individuals newly genotyped using the Metabochip, we have increased the number of confirmed loci influencing glycemic traits to 53, of which 33 also increase type 2 diabetes risk (q < 0.05). Loci influencing fasting insulin concentration showed association with lipid levels and fat distribution, suggesting impact on insulin resistance. Gene-based analyses identified further biologically plausible loci, suggesting that additional loci beyond those reaching genome-wide significance are likely to represent real associations. This conclusion is supported by an excess of directionally consistent and nominally significant signals between discovery and follow-up studies. Functional analysis of these newly discovered loci will further improve our understanding of glycemic control.