998 results for Human Genomics
Abstract:
Making sense of rapidly evolving evidence on genetic associations is crucial to making genuine advances in human genomics and the eventual integration of this information in the practice of medicine and public health. Assessment of the strengths and weaknesses of this evidence, and hence the ability to synthesize it, has been limited by inadequate reporting of results. The STrengthening the REporting of Genetic Association studies (STREGA) initiative builds on the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement and provides additions to 12 of the 22 items on the STROBE checklist. The additions concern population stratification, genotyping errors, modeling haplotype variation, Hardy-Weinberg equilibrium, replication, selection of participants, rationale for choice of genes and variants, treatment effects in studying quantitative traits, statistical methods, relatedness, reporting of descriptive and outcome data, and the volume of data issues that are important to consider in genetic association studies. The STREGA recommendations do not prescribe or dictate how a genetic association study should be designed but seek to enhance the transparency of its reporting, regardless of choices made during design, conduct, or analysis.
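One of the items listed above, Hardy-Weinberg equilibrium, is commonly checked with a one-degree-of-freedom chi-square test on the genotype counts at a biallelic locus. The sketch below is illustrative only and is not part of the STREGA recommendations; the function name `hwe_chi_square` and the example counts are assumptions made for this example.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 d.f.) for departure from Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb are observed genotype counts at a biallelic locus.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1 - p                        # estimated frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip genotype classes with zero expected count to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    # Hypothetical counts: perfect agreement, then a complete lack of heterozygotes.
    print(hwe_chi_square(25, 50, 25))
    print(hwe_chi_square(50, 0, 50))
```

A large statistic can point to genotyping error or population stratification, which is why a report that states the test used and its threshold is easier to assess.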
Abstract:
The human body is composed of a huge number of cells acting together in a concerted manner. The current understanding is that proteins perform most of the activities necessary to keep a cell alive. The DNA, on the other hand, stores the information on how to produce the different proteins in the genome. Regulating gene transcription is thus the first important step that can affect the life of a cell and modify its functions and its responses to the environment. Regulation is a complex operation that involves specialized proteins, the transcription factors. Transcription factors (TFs) can bind to DNA and activate the processes leading to the expression of genes into new proteins. Errors in this process may lead to diseases. In particular, some transcription factors have been associated with a lethal pathological state, commonly known as cancer, associated with uncontrolled cellular proliferation, invasiveness of healthy tissues and abnormal responses to stimuli. Understanding cancer-related regulatory programs is a difficult task, often involving several TFs interacting together and influencing each other's activity. This thesis presents new computational methodologies to study gene regulation. In addition, we present applications of our methods to the understanding of cancer-related regulatory programs. The understanding of transcriptional regulation is a major challenge. We address this difficult question by combining computational approaches with large collections of heterogeneous experimental data. In detail, we design signal processing tools to recover transcription factor binding sites on the DNA from genome-wide surveys such as chromatin immunoprecipitation assays on tiling arrays (ChIP-chip). We then use the information about where TFs bind to explain the expression levels of regulated genes. In this way we identify a regulatory synergy between two TFs, the oncogene C-MYC and SP1.
C-MYC and SP1 bind preferentially at promoters, and when SP1 binds next to C-MYC on the DNA, the nearby gene is strongly expressed. The association between the two TFs at promoters is reflected by the conservation of the binding sites across mammals and by the permissive underlying chromatin states; it represents an important control mechanism involved in cellular proliferation, and thereby in cancer. Secondly, we identify the characteristics of the target genes of the TF estrogen receptor alpha (hERa) and we study the influence of hERa in regulating transcription. Upon estrogen signaling, hERa binds to DNA to regulate transcription of its targets in concert with its co-factors. To overcome the scarcity of experimental data about the binding sites of other TFs that may interact with hERa, we conduct in silico analysis of the sequences underlying the ChIP sites using the collection of position weight matrices (PWMs) of the hERa partners, the TFs FOXA1 and SP1. We combine ChIP-chip and ChIP-paired-end-diTag (ChIP-PET) data about hERa binding on DNA with the sequence information to explain gene expression levels in a large collection of cancer tissue samples and in studies of the response of cells to estrogen. We confirm that hERa binding sites are distributed throughout the genome. However, we distinguish between binding sites near promoters and binding sites along the transcripts. The first group shows weak binding of hERa and a high occurrence of SP1 motifs, in particular near estrogen-responsive genes. The second group shows strong binding of hERa and a significant correlation between the number of binding sites along a gene and the strength of gene induction in the presence of estrogen. Some binding sites of the second group also show the presence of FOXA1, but the role of this TF still needs to be investigated. Different mechanisms have been proposed to explain hERa-mediated induction of gene expression.
Our work supports the model of hERa activating gene expression from distal binding sites by interacting with promoter-bound TFs, like SP1. hERa has been associated with survival rates of breast cancer patients, though explanatory models are still incomplete: this result is important to better understand how hERa can control gene expression. Thirdly, we address the difficult question of regulatory network inference. We tackle this problem by analyzing time series of biological measurements such as quantification of mRNA levels or protein concentrations. Our approach uses the well-established penalized linear regression models, in which we impose sparseness on the connectivity of the regulatory network. We extend this method by enforcing the coherence of the regulatory dependencies: a TF must behave coherently as an activator or as a repressor on all its targets. This requirement is implemented as constraints on the signs of the regressed coefficients in the penalized linear regression model. Our approach is better at reconstructing meaningful biological networks than previous methods based on penalized regression. The method is tested on the DREAM2 challenge of reconstructing a five-gene/TF regulatory network, obtaining the best performance in the "undirected signed excitatory" category. Thus, these bioinformatics methods, which are reliable, interpretable and fast enough to cover large biological datasets, have enabled us to better understand gene regulation in humans.
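The sign-constrained penalized regression described above can be illustrated with a small coordinate-descent sketch. This is not the thesis's implementation: it assumes the sign of each regulator (+1 activator, -1 repressor, 0 unconstrained) is already given, whereas the thesis enforces coherence across targets as part of the fit. The function name `sign_constrained_lasso`, the penalty `lam` and the toy data are hypothetical.

```python
def sign_constrained_lasso(X, y, signs, lam=0.1, n_iter=200):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 subject to sign constraints.

    X: list of rows (samples x regulators), y: list of target values,
    signs[j]: +1 forces w[j] >= 0, -1 forces w[j] <= 0, 0 leaves w[j] free.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue
            # Correlation of column j with the residual that excludes w[j].
            rho = sum(col[i] * residual[i] for i in range(n)) / n + z * w[j]
            if signs[j] > 0:
                new_w = max(rho - lam, 0.0) / z
            elif signs[j] < 0:
                new_w = min(rho + lam, 0.0) / z
            else:  # plain soft-thresholding when no sign is imposed
                new_w = (max(rho - lam, 0.0) + min(rho + lam, 0.0)) / z
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * col[i]
                w[j] = new_w
    return w


if __name__ == "__main__":
    # Toy data: one regulator whose level is positively related to the target.
    X = [[1.0], [2.0], [3.0]]
    y = [2.0, 4.0, 6.0]
    print(sign_constrained_lasso(X, y, signs=[+1], lam=0.0))
    print(sign_constrained_lasso(X, y, signs=[-1], lam=0.0))
```

In this toy case, forcing the regulator to act as a repressor drives its coefficient to zero, which is how a sign constraint prunes edges that contradict the assumed role of a TF.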
Abstract:
Pneumocystis jirovecii is a fungal parasite that specifically colonizes humans and turns into an opportunistic pathogen in immunodeficient individuals. The fungus is able to reproduce extracellularly in host lungs without eliciting massive cellular death. The molecular mechanisms that govern this process are poorly understood, in part because of the lack of an in vitro culture system for Pneumocystis spp. In this study, we explored the origin and evolution of the putative biotrophy of P. jirovecii through comparative genomics and reconstruction of ancestral gene repertoires. We used the maximum parsimony method and genomes of related fungi of the Taphrinomycotina subphylum. Our results suggest that the last common ancestor of Pneumocystis spp. lost 2,324 genes in relation to the acquisition of obligate biotrophy. These losses may result from neutral drift and affect the biosyntheses of amino acids and thiamine, the assimilation of inorganic nitrogen and sulfur, and the catabolism of purines. In addition, P. jirovecii shows a reduced panel of lytic proteases and has lost the RNA interference machinery, which might contribute to its genome plasticity. Together with other characteristics, namely a sexual life cycle within the host, the absence of massive destruction of host cells, difficult culturing, and the lack of virulence factors, these gene losses constitute a unique combination of characteristics that are hallmarks of both obligate biotrophs and animal parasites. These findings suggest that Pneumocystis spp. should be considered the first described obligate biotrophs of animals, whose evolution has been marked by gene losses.
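As a rough illustration of how maximum parsimony can reconstruct an ancestral gene repertoire, the sketch below applies Fitch's small-parsimony rule to the presence (1) or absence (0) of a single gene on a rooted binary tree. The abstract does not say which parsimony variant the study used, so this is an assumption; the tree, the species labels and the function name `fitch` are hypothetical.

```python
def fitch(tree, leaf_states):
    """Fitch small parsimony for one character on a rooted binary tree.

    tree: a leaf name (str) or a pair (left_subtree, right_subtree).
    leaf_states: dict mapping leaf name -> observed state (e.g. 1 = gene present).
    Returns (candidate states at the root of `tree`, minimum number of changes).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed here.
        return common, left_cost + right_cost
    # Children disagree: keep both options and count one gain or loss.
    return left_set | right_set, left_cost + right_cost + 1


if __name__ == "__main__":
    # Hypothetical tree of four species; the gene is absent only in species D.
    tree = (("A", "B"), ("C", "D"))
    states = {"A": 1, "B": 1, "C": 1, "D": 0}
    print(fitch(tree, states))
```

Repeating this for every gene family and counting the branches on which a gene goes from present to absent gives a per-lineage estimate of gene loss.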
Abstract:
The human malaria parasite Plasmodium vivax is responsible for 25-40% of the ~515 million annual cases of malaria worldwide. Although seldom fatal, the parasite elicits severe and incapacitating clinical symptoms and often causes relapses months after a primary infection has cleared. Despite its importance as a major human pathogen, P. vivax is little studied because it cannot be propagated continuously in the laboratory except in non-human primates. We sequenced the genome of P. vivax to shed light on its distinctive biological features, and as a means to drive development of new drugs and vaccines. Here we describe the synteny and isochore structure of P. vivax chromosomes, and show that the parasite resembles other malaria parasites in gene content and metabolic potential, but possesses novel gene families and potential alternative invasion pathways not recognized previously. Completion of the P. vivax genome provides the scientific community with a valuable resource that can be used to advance investigation into this neglected species.
Abstract:
Neisseria meningitidis (Nm) is the major cause of septicemia and meningococcal meningitis. During the course of infection, it must adapt to different host environments as a crucial factor for survival. Despite the severity of meningococcal sepsis, little is known about how Nm adapts to permit survival and growth in human blood. A previous time-course transcriptome analysis, using an ex vivo model of human whole blood infection, showed that Nm alters the expression of nearly 30% of ORFs of the genome: major dynamic changes were observed in the expression of transcriptional regulators, transport and binding proteins, energy metabolism, and surface-exposed virulence factors. Starting from these data, mutagenesis studies of a subset of up-regulated genes were performed and the mutants were tested for the ability to survive in human whole blood; Nm mutant strains lacking the genes encoding NMB1483, NalP, Mip, NspA, Fur, TbpB, and LctP were sensitive to killing by human blood. Then, the analysis was extended to the whole Nm transcriptome in human blood, using a customized 60-mer oligonucleotide tiling microarray. The application of specifically developed software combined with this new tiling array allowed the identification of different types of regulated transcripts: small intergenic RNAs, antisense RNAs, 5’ and 3’ untranslated regions and operons. The expression of these RNA molecules was confirmed by 5’-3’RACE protocol and specific RT-PCR. Here we describe the complete transcriptome of Nm during incubation in human blood; we were able to identify new proteins important for survival in human blood and also to identify additional roles of previously known virulence factors in aiding survival in blood. In addition the tiling array analysis demonstrated that Nm expresses a set of new transcripts, not previously identified, and suggests the presence of a circuit of regulatory RNA elements used by Nm to adapt to proliferate in human blood.
Abstract:
Improvements in genomic technology, both in the increased speed and reduced cost of sequencing, have expanded the appreciation of the abundance of human genetic variation. However the sheer amount of variation, as well as the varying type and genomic content of variation, poses a challenge in understanding the clinical consequence of a single mutation. This work uses several methodologies to interpret the observed variation in the human genome, and presents novel strategies for the prediction of allele pathogenicity.
Using the zebrafish model system as an in vivo assay of allele function, we identified a novel driver of Bardet-Biedl Syndrome (BBS) in CEP76. A combination of targeted sequencing of 785 cilia-associated genes in a cohort of BBS patients and subsequent in vivo functional assays recapitulating the human phenotype gave strong evidence for the role of CEP76 mutations in the pathology of an affected family. This portion of the work demonstrated the necessity of functional testing in validating disease-associated mutations, and added to the catalogue of known BBS disease genes.
Further study into the role of copy-number variations (CNVs) in a cohort of BBS patients showed the significant contribution of CNVs to disease pathology. Using high-density array comparative genomic hybridization (aCGH) we were able to identify pathogenic CNVs as small as several hundred bp. Dissection of the constituent genes and in vivo experiments investigating epistatic interactions between affected genes allowed for an appreciation of several paradigms by which CNVs can contribute to disease. This study revealed that the contribution of CNVs to disease in BBS patients is much higher than previously expected, and demonstrated the need to consider CNV contribution in future (and retrospective) investigations of human genetic disease.
Finally, we used a combination of comparative genomics and in vivo complementation assays to identify second-site compensatory modification of pathogenic alleles. These pathogenic alleles, which are found compensated in other species (termed compensated pathogenic deviations [CPDs]), represent a significant fraction (3-10%) of human disease-associated alleles. In silico pathogenicity prediction algorithms, a valuable method of allele prioritization, often misrepresent these alleles as benign, leading to omission of possibly informative variants in studies of human genetic disease. We created a mathematical model that was able to predict CPDs and putative compensatory sites, and functionally showed in vivo that second-site mutation can mitigate the pathogenicity of disease alleles. Additionally, we made publicly available an in silico module for the prediction of CPDs and modifier sites.
These studies have advanced the ability to interpret the pathogenicity of multiple types of human variation, and have made tools available for others to do the same.
Abstract:
Background: The development and progression of cancer depend on its genetic characteristics as well as on the interactions with its microenvironment. Understanding these interactions may contribute to diagnostic and prognostic evaluations and to the development of new cancer therapies. Aiming to investigate potential mechanisms by which the tumor microenvironment might contribute to a cancer phenotype, we evaluated soluble paracrine factors produced by stromal and neoplastic cells which may influence proliferation and gene and protein expression. Methods: The study was carried out on the epithelial cancer cell line (Hep-2) and fibroblasts isolated from a primary oral cancer. We combined a conditioned-medium technique with subtraction hybridization approach, quantitative PCR and proteomics, in order to evaluate gene and protein expression influenced by soluble paracrine factors produced by stromal and neoplastic cells. Results: We observed that conditioned medium from fibroblast cultures (FCM) inhibited proliferation and induced apoptosis in Hep-2 cells. In neoplastic cells, 41 genes and 5 proteins exhibited changes in expression levels in response to FCM and, in fibroblasts, 17 genes and 2 proteins showed down-regulation in response to conditioned medium from Hep-2 cells (HCM). Nine genes were selected and the expression results of 6 down-regulated genes (ARID4A, CALR, GNB2L1, RNF10, SQSTM1, USP9X) were validated by real time PCR. Conclusions: A significant and common denominator in the results was the potential induction of signaling changes associated with immune or inflammatory response in the absence of a specific protein.
Abstract:
Background: While microRNAs (miRNAs) play important roles in tissue differentiation and in maintaining basal physiology, little is known about the miRNA expression levels in stomach tissue. Alterations in the miRNA profile can lead to cell deregulation, which can induce neoplasia. Methodology/Principal Findings: A small RNA library of stomach tissue was sequenced using high-throughput SOLiD sequencing technology. We obtained 261,274 quality reads with perfect matches to the human miRnome, and 42% of known miRNAs were identified. Digital Gene Expression profiling (DGE) was performed based on read abundance and showed that fifteen miRNAs were highly expressed in gastric tissue. Subsequently, the expression of these miRNAs was validated in 10 healthy individuals by RT-PCR, which showed a significant correlation of 83.97% (P<0.05). Six miRNAs showed a low variable pattern of expression (miR-29b, miR-29c, miR-19b, miR-31, miR-148a, miR-451) and could be considered part of the expression pattern of healthy gastric tissue. Conclusions/Significance: This study aimed to validate normal miRNA profiles of human gastric tissue to establish a reference profile for healthy individuals. Determining the regulatory processes acting in the stomach will be important in the fight against gastric cancer, which is the second-leading cause of cancer mortality worldwide.
Abstract:
Background: The thymus is a central lymphoid organ, in which bone marrow-derived T cell precursors undergo a complex process of maturation. Developing thymocytes interact with the thymic microenvironment in a defined spatial order. A component of the thymic microenvironment, the thymic epithelial cells (TEC), is crucial for the maturation of T-lymphocytes through cell-cell contact, cell-matrix interactions and secretion of cytokines/chemokines. There is evidence that extracellular matrix molecules play a fundamental role in guiding differentiating thymocytes in both cortical and medullary regions of the thymic lobules. The interaction between the integrin alpha 5 beta 1 (CD49e/CD29; VLA-5) and fibronectin is relevant for thymocyte adhesion and migration within the thymic tissue. Our previous results have shown that adhesion of thymocytes to a cultured TEC line is enhanced in the presence of fibronectin, and can be blocked with anti-VLA-5 antibody. Results: Herein, we studied the role of CD49e expressed by the human thymic epithelium. For this purpose we knocked down CD49e by means of RNA interference. This procedure resulted in the modulation of more than 100 genes, some of them coding for other proteins also involved in adhesion of thymocytes; others related to signaling pathways triggered after integrin activation, or even involved in the control of F-actin stress fiber formation. Functionally, we demonstrated that disruption of VLA-5 in human TEC by CD49e-siRNA-induced gene knockdown decreased the ability of TEC to promote thymocyte adhesion. Such a decrease comprised all CD4/CD8-defined thymocyte subsets. Conclusion: Conceptually, our findings unravel the complexity of gene regulation, as regards key genes involved in the heterocellular cell adhesion between developing thymocytes and the major component of the thymic microenvironment, an interaction that is a mandatory event for proper intrathymic T cell differentiation.
Abstract:
It is well accepted that the Americas were the last continents reached by modern humans, most likely through Beringia. However, the precise time and mode of the colonization of the New World remain hotly disputed issues. Native American populations exhibit almost exclusively five mitochondrial DNA (mtDNA) haplogroups (A-D and X). Haplogroups A-D are also frequent in Asia, suggesting a northeastern Asian origin of these lineages. However, the differential pattern of distribution and frequency of haplogroup X led some to suggest that it may represent an independent migration to the Americas. Here we show, by using 86 complete mitochondrial genomes, that all Native American haplogroups, including haplogroup X, were part of a single founding population, thereby refuting multiple-migration models. A detailed demographic history of the mtDNA sequences estimated with a Bayesian coalescent method indicates a complex model for the peopling of the Americas, in which the initial differentiation from Asian populations ended with a moderate bottleneck in Beringia during the last glacial maximum (LGM), around ~23,000 to ~19,000 years ago. Toward the end of the LGM, a strong population expansion started ~18,000 and finished ~15,000 years ago. These results support a pre-Clovis occupation of the New World, suggesting a rapid settlement of the continent along a Pacific coastal route.
Abstract:
The enormous amount of information generated through sequencing of the human genome has increased demands for more economical and flexible alternatives in genomics, proteomics and drug discovery. Many companies and institutions have recognised the potential of increasing the size and complexity of chemical libraries by producing large chemical libraries on colloidal support beads. Since colloid-based compounds in a suspension are randomly located, an encoding system such as optical barcoding is required to permit rapid elucidation of the compound structures. We describe in this article innovative methods for optical barcoding of colloids for use as support beads in both combinatorial and non-combinatorial libraries. We focus in particular on the difficult problem of barcoding extremely large libraries, which if solved, will transform the manner in which genomics, proteomics and drug discovery research is currently performed.
Abstract:
This paper studies human DNA from the perspective of signal processing. Six wavelets are tested for analyzing the information content of human DNA. By adopting the real Shannon wavelet, several fundamental properties of the code are revealed. A quantitative comparison of the chromosomes is developed, with visualization through multidimensional scaling and dendrograms.
Abstract:
Nutrition science has evolved into a multidisciplinary field that applies molecular biology and integrates individual health with the epidemiologic investigation of population health. Nutritional genomics studies the functional interaction of food and its components, macro and micronutrients, with the genome at the molecular, cellular, and systemic level. Diet can influence cancer development in several ways, namely through the direct action of carcinogens in food that can damage DNA, and through diet components (macro or micronutrients) that can block or induce enzymes involved in the activation or deactivation of carcinogenic substances. Moreover, inadequate intake of some molecules involved in DNA synthesis, repair or methylation can influence the mutation rate or change gene expression. Several studies support the idea that diet can influence the risk of cancer; however, the precise dietary factors that determine human cancer remain a matter of ongoing debate. Many epidemiological studies involving food frequency questionnaires have provided important information concerning diet and cancer; however, diet is a complex composite of various nutrients (macro and micronutrients) and non-nutritive food constituents, which makes the search for specific factors almost limitless.
Abstract:
We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
Abstract:
The Chlamydiales order is an important bacterial phylum that comprises some of the most successful human pathogens such as Chlamydia trachomatis, the leading infectious cause of blindness worldwide. In recent years, several new bacteria related to Chlamydia have been discovered in clinical or environmental samples and might represent emerging pathogens. The genome sequencing of classical Chlamydia has brought invaluable information on these obligate intracellular bacteria, which are otherwise difficult to study due to the lack of tools to perform basic genetic manipulation. The recent emergence of high-throughput sequencing technologies yielding millions of reads in a short time lowered the costs of genome sequencing and thus represented a unique opportunity to study Chlamydia-related bacteria. Based on the sequencing and the analysis of Chlamydiales genomes, this thesis provides significant insights into the genetic determinants of the intracellular lifestyle, the pathogenicity, the metabolism and the evolution of Chlamydia-related bacteria. A first approach showed the efficacy of rapid sequencing coupled to proteomics to identify immunogenic proteins. This method, particularly useful for an emerging pathogen such as Parachlamydia acanthamoebae, enabled us to discover good candidates for the development of diagnostic tools that would allow the role of this bacterium in disease to be evaluated at a larger scale. Second, the complete genome of Waddlia chondrophila, a potential agent of miscarriage, encodes numerous virulence factors to manipulate its host cell and resist environmental stresses. The reconstruction of metabolic pathways showed that the bacterium possesses extensive capabilities compared to related organisms. However, it is still incapable of synthesizing some essential components and thus has to import them from its host.
Third, the genome comparison of Protochlamydia naegleriophila to its closest known relative Protochlamydia amoebophila revealed a particular evolutionary dynamic with the occurrence of an unexpected genome rearrangement. Fourth, a phylogenetic analysis of P. acanthamoebae and Legionella drancourtii identified several genes probably exchanged by horizontal gene transfer with other intracellular bacteria, which might occur within their amoebal host. These genes often encode mechanisms for resistance to metal or toxic compounds. As a whole, the analysis of the different genomes enabled us to highlight a large diversity in size, GC percentage, repeat content as well as plasmid organization. The abundant genomic data obtained during this thesis have a wide impact since they provide the necessary bases for detailed investigations on countless aspects of the biology and the evolution of Chlamydia-related bacteria, whether in the wet lab or by bioinformatic analyses. Summary: The order Chlamydiales is an important bacterial phylum that includes many species pathogenic to humans and animals, among them Chlamydia trachomatis, the agent of trachoma, the leading infectious cause of blindness worldwide. Over recent decades, many bacteria related to Chlamydia have been discovered in environmental or clinical samples, but their possible pathogenic role in the development of disease remains poorly understood. These bacteria are obligate intracellular organisms: they need a host cell to multiply, which makes them particularly difficult to study. The development of new technologies that sequence the genome of an organism quickly and at lower cost, together with the rise of the associated analysis methods, represents an exceptional opportunity to study these organisms.
In this context, this thesis demonstrates the usefulness of genomics for developing new diagnostic tools and for studying the metabolism of these bacteria, their virulence factors and their evolution. A first approach illustrated the usefulness of rapid sequencing for obtaining the information needed to identify proteins that are recognized by human or animal antibodies. This method, particularly useful for an emerging pathogen such as Parachlamydia acanthamoebae, led to the discovery of good candidates for the development of a diagnostic tool that would make it possible to evaluate, on a larger scale, the role of this bacterium, notably in pneumonia. Analysis of the gene content of Waddlia chondrophila, another organism that may be involved in abortions and miscarriages, also revealed the presence of many known factors that allow it to manipulate its host. This bacterium has greater metabolic capabilities than the other Chlamydia, but it is unable to synthesize certain components and must therefore import them from its host to meet its needs. Comparison of the genome of Protochlamydia naegleriophila with that of its closest relative, Protochlamydia amoebophila, revealed a particular evolutionary dynamic, with the occurrence of an unexpected major rearrangement after the separation of these two species. In addition, these studies showed the occurrence of several gene transfers with other, more distantly related organisms, notably other intracellular bacteria of amoebae, often for the acquisition of mechanisms of resistance to toxic compounds. The genomic data acquired during this work lay the foundations necessary for many analyses that will progressively lead to a better understanding of numerous aspects of these fascinating bacteria.