947 results for Genomics
Abstract:
The human body is composed of a huge number of cells acting together in a concerted manner. The current understanding is that proteins perform most of the activities necessary to keep a cell alive. The DNA, on the other hand, stores in the genome the information on how to produce the different proteins. Regulating gene transcription is thus the first important step that can affect the life of a cell, modifying its functions and its responses to the environment. Regulation is a complex operation that involves specialized proteins, the transcription factors. Transcription factors (TFs) can bind to DNA and activate the processes leading to the expression of genes into new proteins. Errors in this process may lead to disease. In particular, some transcription factors have been associated with cancer, a lethal pathological state characterized by uncontrolled cellular proliferation, invasion of healthy tissues and abnormal responses to stimuli. Understanding cancer-related regulatory programs is a difficult task, as they often involve several TFs interacting together and influencing each other's activity. This thesis presents new computational methodologies to study gene regulation. In addition, we present applications of our methods to the understanding of cancer-related regulatory programs. The understanding of transcriptional regulation is a major challenge. We address this difficult question by combining computational approaches with large collections of heterogeneous experimental data. In detail, we design signal processing tools to recover transcription factor binding sites on the DNA from genome-wide surveys such as chromatin immunoprecipitation assays on tiling arrays (ChIP-chip). We then use the location of TF binding to explain the expression levels of regulated genes. In this way we identify a regulatory synergy between two TFs, the oncogene C-MYC and SP1.
C-MYC and SP1 bind preferentially at promoters, and when SP1 binds next to C-MYC on the DNA, the nearby gene is strongly expressed. The association between the two TFs at promoters is reflected by the conservation of the binding sites across mammals and by the permissive underlying chromatin states; it represents an important control mechanism involved in cellular proliferation, and thereby in cancer. Secondly, we identify the characteristics of the target genes of the TF estrogen receptor alpha (hERa) and we study the influence of hERa in regulating transcription. Upon estrogen signaling, hERa binds to DNA to regulate transcription of its targets in concert with its co-factors. To overcome the scarcity of experimental data about the binding sites of other TFs that may interact with hERa, we conduct an in silico analysis of the sequences underlying the ChIP sites using the position weight matrices (PWMs) of the hERa partners FOXA1 and SP1. We combine ChIP-chip and ChIP paired-end diTag (ChIP-PET) data about hERa binding on DNA with the sequence information to explain gene expression levels in a large collection of cancer tissue samples and in studies of the response of cells to estrogen. We confirm that hERa binding sites are distributed throughout the genome. However, we distinguish between binding sites near promoters and binding sites along the transcripts. The first group shows weak binding of hERa and a high occurrence of SP1 motifs, in particular near estrogen-responsive genes. The second group shows strong binding of hERa and a significant correlation between the number of binding sites along a gene and the strength of gene induction in the presence of estrogen. Some binding sites of the second group also show the presence of FOXA1, but the role of this TF still needs to be investigated. Different mechanisms have been proposed to explain hERa-mediated induction of gene expression.
Our work supports the model of hERa activating gene expression from distal binding sites by interacting with promoter-bound TFs, like SP1. hERa has been associated with survival rates of breast cancer patients, though explanatory models are still incomplete: this result is important to better understand how hERa can control gene expression. Thirdly, we address the difficult question of regulatory network inference. We tackle this problem by analyzing time series of biological measurements such as quantifications of mRNA levels or protein concentrations. Our approach uses well-established penalized linear regression models in which we impose sparseness on the connectivity of the regulatory network. We extend this method by enforcing the coherence of the regulatory dependencies: a TF must behave coherently as an activator or as a repressor on all its targets. This requirement is implemented as constraints on the signs of the regressed coefficients in the penalized linear regression model. Our approach is better at reconstructing meaningful biological networks than previous methods based on penalized regression. The method is tested on the DREAM2 challenge of reconstructing a five-gene/TF regulatory network, obtaining the best performance in the "undirected signed excitatory" category. Thus, these bioinformatics methods, which are reliable, interpretable and fast enough to cover large biological datasets, have enabled us to better understand gene regulation in humans.
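The sign-constrained sparse regression described above can be illustrated with a minimal sketch. It assumes that the sign of each TF (activator or repressor) is supplied in advance and fits one target gene at a time by coordinate descent on a lasso objective. The function name, the toy data and the fixed signs are hypothetical and are not taken from the thesis, whose actual formulation may differ.

```python
def sign_constrained_lasso(X, y, signs, lam=0.0, n_iter=200):
    """Coordinate descent for
        (1/2n) * ||y - X b||^2 + lam * sum(|b_j|)
    subject to signs[j] * b[j] >= 0 for every regulator j.

    X: list of rows (samples x regulators), y: list of target values,
    signs: +1 (assumed activator) or -1 (assumed repressor) per regulator.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of regulator j with the partial residual
            z = 0.0    # mean squared value of regulator j
            for i in range(n):
                pred_others = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_others)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            if z == 0.0:
                b[j] = 0.0
            elif signs[j] > 0:
                # Soft-threshold, then keep only non-negative values.
                b[j] = max(rho - lam, 0.0) / z
            else:
                # Soft-threshold, then keep only non-positive values.
                b[j] = min(rho + lam, 0.0) / z
    return b


# Toy data (hypothetical): target = 2 * TF1 - 1 * TF2.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0, -1.0, 1.0, 3.0]

coherent = sign_constrained_lasso(X, y, signs=[+1, -1])
forced = sign_constrained_lasso(X, y, signs=[+1, +1])
```

With the coherent signs the toy coefficients are recovered. When TF2 is wrongly forced to act as an activator, its coefficient is clamped to zero. In a network setting, the same sign vector would be shared across the regressions of all target genes.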
Abstract:
Pneumocystis jirovecii is a fungus causing severe pneumonia in immuno-compromised patients. Progress in understanding its pathogenicity and epidemiology has been hampered by the lack of a long-term in vitro culture method. Obligate parasitism of this pathogen has been suggested on the basis of various features but remains controversial. We analysed the 7.0 Mb draft genome sequence of the closely related species Pneumocystis carinii infecting rats, which is a well established experimental model of the disease. We predicted 8'085 (redundant) peptides and 14.9% of them were mapped onto the KEGG biochemical pathways. The proteome of the closely related yeast Schizosaccharomyces pombe was used as a control for the annotation procedure (4'974 genes, 14.1% mapped). About two thirds of the mapped peptides of each organism (65.7% and 73.2%, respectively) corresponded to crucial enzymes for the basal metabolism and standard cellular processes. However, the proportion of P. carinii genes relative to those of S. pombe was significantly smaller for the "amino acid metabolism" category of pathways than for all other categories taken together (40 versus 114 against 278 versus 427, P<0.002). Importantly, we identified in P. carinii only 2 enzymes specifically dedicated to the synthesis of the 20 standard amino acids. By contrast all the 54 enzymes dedicated to this synthesis reported in the KEGG atlas for S. pombe were detected upon reannotation of S. pombe proteome (2 versus 54 against 278 versus 427, P<0.0001). This finding strongly suggests that species of the genus Pneumocystis are scavenging amino acids from their host's lung environment. Consequently, they would have no form able to live independently from another organism, and these parasites would be obligate in addition to being opportunistic. These findings have implications for the management of patients susceptible to P. jirovecii infection given that the only source of infection would be other humans.
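The abstract does not state which test produced the reported P values. As an illustration only, the sketch below applies a two-sided Fisher's exact test to the 2x2 table of counts quoted above (40 versus 114 against 278 versus 427). The choice of test and the function name are assumptions, so the resulting value need not match the published one exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed one.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    k_min = max(0, row1 - (total - col1))
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so that ties with p_obs are included.
    p_value = sum(p for p in map(prob, range(k_min, k_max + 1))
                  if p <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)


# Counts from the abstract: amino acid metabolism (P. carinii, S. pombe)
# versus all other pathway categories (P. carinii, S. pombe).
p_amino = fisher_exact_two_sided(40, 114, 278, 427)
```

The same function can be applied to the second comparison in the abstract by replacing the first two counts with 2 and 54.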
Abstract:
Pneumocystis jirovecii is a fungal parasite that specifically colonizes humans and turns into an opportunistic pathogen in immunodeficient individuals. The fungus is able to reproduce extracellularly in host lungs without eliciting massive cellular death. The molecular mechanisms that govern this process are poorly understood, in part because of the lack of an in vitro culture system for Pneumocystis spp. In this study, we explored the origin and evolution of the putative biotrophy of P. jirovecii through comparative genomics and reconstruction of ancestral gene repertoires. We used the maximum parsimony method and genomes of related fungi of the Taphrinomycotina subphylum. Our results suggest that the last common ancestor of Pneumocystis spp. lost 2,324 genes in relation to the acquisition of obligate biotrophy. These losses may result from neutral drift and affect the biosyntheses of amino acids and thiamine, the assimilation of inorganic nitrogen and sulfur, and the catabolism of purines. In addition, P. jirovecii shows a reduced panel of lytic proteases and has lost the RNA interference machinery, which might contribute to its genome plasticity. Together with other characteristics, namely a sexual life cycle within the host, the absence of massive destruction of host cells, difficult culturing, and the lack of virulence factors, these gene losses constitute a unique combination of characteristics that are hallmarks of both obligate biotrophs and animal parasites. These findings suggest that Pneumocystis spp. should be considered the first described obligate biotrophs of animals, whose evolution has been marked by gene losses.
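Maximum parsimony reconstruction of gene presence or absence at ancestral nodes can be sketched as follows. The abstract does not say which parsimony variant was used. This example assumes Fitch parsimony on a rooted binary tree, and the tree, taxon labels and presence pattern are hypothetical toy inputs.

```python
def fitch(tree, leaf_states):
    """Bottom-up pass of Fitch parsimony for one binary character.

    tree: a leaf name (str) or a pair (left_subtree, right_subtree).
    leaf_states: dict mapping leaf name to 1 (gene present) or 0 (absent).
    Returns (set of most parsimonious states at this node,
             minimum number of state changes in the subtree).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_cost = fitch(tree[0], leaf_states)
    right_set, right_cost = fitch(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical toy tree: A and B are sister taxa, then C, then D.
tree = ((("A", "B"), "C"), "D")
presence = {"A": 0, "B": 0, "C": 1, "D": 1}

root_states, changes = fitch(tree, presence)
```

In this toy case the gene is inferred to be present at the root with a single change, which is consistent with one loss in the common ancestor of A and B. Repeating the computation for every gene family would give an estimate of the number of genes lost on a given branch.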
Abstract:
The fundamentals of real-time polymerase chain reaction, automated capillary electrophoresis (Sanger sequencing and fragment analysis) and "next-generation" sequencing are reviewed. An overview of applications is presented, using examples carried out in our own facility.
Abstract:
In plants, an oligogene family encodes NADP-malic enzymes (NADP-me), which are responsible for various functions and exhibit different kinetics and expression patterns. In particular, a chloroplast isoform of NADP-me plays a key role in one of the three biochemical subtypes of C4 photosynthesis, an adaptation to warm environments that evolved several times independently during angiosperm diversification. By combining genomic and phylogenetic approaches, this study aimed at identifying the molecular mechanisms linked to the recurrent evolutions of C4-specific NADP-me in grasses (Poaceae). Genes encoding NADP-me (nadpme) were retrieved from genomes of model grasses and isolated from a large sample of C3 and C4 grasses. Genomic and phylogenetic analyses showed that 1) the grass nadpme gene family is composed of four main lineages, one of which is expressed in plastids (nadpme-IV), 2) C4-specific NADP-me evolved at least five times independently from nadpme-IV, and 3) some codons driven by positive selection underwent parallel changes during the multiple C4 origins. The C4 NADP-me being expressed in chloroplasts probably constrained its recurrent evolutions from the only plastid nadpme lineage and this common starting point limited the number of evolutionary paths toward a C4 optimized enzyme, resulting in genetic convergence. In light of the history of nadpme genes, an evolutionary scenario of the C4 phenotype using NADP-me is discussed.
Abstract:
During the genomic era, a large number of whole-genome sequences accumulated, revealing many hypothetical proteins of unknown function. Functional genomics, the research domain that assigns a function to a given gene product, has thus developed rapidly. Functional genomics of intracellular pathogenic bacteria has specific peculiarities owing to the fastidious growth of most of these intracellular micro-organisms, their close interaction with the host cell, the risk of contamination of experiments with host cell proteins and, for some strict intracellular bacteria such as Chlamydia, the absence of a simple genetic system to manipulate the bacterial genome. To identify virulence factors of intracellular pathogenic bacteria, functional genomics often relies on bioinformatic comparisons with model organisms such as Escherichia coli and Bacillus subtilis. The use of heterologous expression is another common approach. Given the intracellular lifestyle and the many effectors used by intracellular bacteria to corrupt host cell functions, functional genomics also often targets the identification of new effectors such as those of the T4SS of Brucella and Legionella.
Abstract:
Studies of the structural basis of protein thermostability have produced a confusing picture. Small sets of proteins have been analyzed from a variety of thermophilic species, suggesting different structural features as responsible for protein thermostability. Taking advantage of the recent advances in structural genomics, we have compiled a relatively large protein structure dataset, which was constructed very carefully and selectively; that is, the dataset contains only experimentally determined structures of proteins from one specific organism, the hyperthermophilic bacterium Thermotoga maritima, and those of close homologs from mesophilic bacteria. In contrast to the conclusions of previous studies, our analyses show that oligomerization order, hydrogen bonds, and secondary structure play minor roles in adaptation to hyperthermophily in bacteria. On the other hand, the data exhibit very significant increases in the density of salt bridges and in compactness for proteins from T. maritima. The latter effect can be measured by contact order or solvent accessibility, and network analysis shows a specific increase in highly connected residues in this thermophile. These features account for changes in 96% of the protein pairs studied. Our results provide a clear picture of protein thermostability in one species, and a framework for future studies of thermal adaptation.
Abstract:
The study of immunity against infection can be framed in the context of genomics. First, long-term association with pathogens results in genomic signatures that result from positive selection. Evolutionary pressures tailor species or individual responses to pathogens, that may be associated with skewed patterns of immunity. Second, recent human population expansion carries an increasing burden of genetic mutation that can result in sporadic immunodeficiencies, and more generally, in diversity in susceptibility to infection. This review highlights current concepts and tools for the analysis of genomes and stresses the interest of these approaches in immunity.
Abstract:
A workshop recently held at the Ecole Polytechnique Federale de Lausanne (EPFL, Switzerland) was dedicated to understanding the genetic basis of adaptive change, taking stock of the different approaches developed in theoretical population genetics and landscape genomics and bringing together knowledge accumulated in both research fields. Indeed, an important challenge in theoretical population genetics is to incorporate effects of demographic history and population structure. But important design problems (e.g. focus on populations as units, focus on hard selective sweeps, no hypothesis-based framework in the design of the statistical tests) reduce their capability of detecting adaptive genetic variation. In parallel, landscape genomics offers a solution to several of these problems and provides a number of advantages (e.g. fast computation, landscape heterogeneity integration). But the approach makes several implicit assumptions that should be carefully considered (e.g. selection has had enough time to create a functional relationship between the allele distribution and the environmental variable, or this functional relationship is assumed to be constant). To address the respective strengths and weaknesses mentioned above, the workshop brought together a panel of experts from both disciplines to present their work and discuss the relevance of combining these approaches, possibly resulting in a joint software solution in the future.
Abstract:
PURPOSE OF REVIEW: The kidney plays an essential role in maintaining sodium and water balance, thereby controlling the volume and osmolarity of the extracellular body fluids, the blood volume and the blood pressure. The final adjustment of sodium and water reabsorption in the kidney takes place in cells of the distal part of the nephron in which a set of apical and basolateral transporters participate in vectorial sodium and water transport from the tubular lumen to the interstitium and, finally, to the general circulation. According to a current model, the activity and/or cell-surface expression of these transporters is under the control of a gene network composed of hormonally regulated as well as constitutively expressed genes. It is proposed that this gene network may include new candidate genes for salt- and water-losing syndromes and for salt-sensitive hypertension. A new generation of functional genomics techniques has recently been applied to the characterization of this gene network. The purpose of this review is to summarize these studies and to discuss the potential of the different techniques for characterization of the renal transcriptome. RECENT FINDINGS: Recently, DNA microarrays and serial analysis of gene expression have been applied to characterize the kidney transcriptome in different in-vivo and in-vitro models. In these studies, a set of new interesting genes potentially involved in the regulation of sodium and water reabsorption by the kidney has been identified and is currently under detailed investigation. SUMMARY: Characterization of the kidney transcriptome is greatly expanding our knowledge of the gene networks involved in multiple kidney functions, including the maintenance of sodium and water homeostasis.
Abstract:
Microbial communities in animal guts are composed of diverse, specialized bacterial species, but little is known about how gut bacteria diversify to produce genetically and ecologically distinct entities. The gut microbiota of the honey bee, Apis mellifera, presents a useful model, because it consists of a small number of characteristic bacterial species, each showing signs of diversification. Here, we used single-cell genomics to study the variation within two species of the bee gut microbiota: Gilliamella apicola and Snodgrassella alvi. For both species, our analyses revealed extensive variation in intraspecific divergence of protein-coding genes but uniformly high levels of 16S rRNA similarity. In both species, the divergence of 16S rRNA loci appears to have been curtailed by frequent recombination within populations, while other genomic regions have continuously diverged. Furthermore, gene repertoires differ markedly among strains in both species, implying distinct metabolic capabilities. Our results show that, despite minimal divergence at 16S rRNA genes, in situ diversification occurs within gut communities and generates bacterial lineages with distinct ecological niches. Therefore, important dimensions of microbial diversity are not evident from analyses of 16S rRNA, and single cell genomics has potential to elucidate processes of bacterial diversification.
Abstract:
In this thesis, different genetic tools are used to investigate both natural variation and speciation in the Ficedula flycatcher system: pied (Ficedula hypoleuca) and collared (F. albicollis) flycatchers. The molecular evolution of GH, a gene involved in postnatal body growth, shows a high degree of conservation of the mature protein between birds and mammals, whereas the variation observed in its signal peptide seems to be adaptive in the pied flycatcher (I & II). Speciation is the process by which reproductive barriers to gene flow evolve between populations, and the mechanisms involved in pre- and post-zygotic isolation have been investigated in Ficedula flycatchers. The Z chromosome has been suggested to be a hotspot for genes involved in speciation; therefore, 13 Z-linked coding genes from the two species in allopatry and sympatry were sequenced (III). Surprisingly, the majority of Z-linked genes seemed to be highly conserved, suggesting instead a potential involvement of regulatory regions. Previous studies have shown that genes involved in hybrid fitness, female preferences and male plumage colouration are sex-linked. Hence, three pigmentation genes have been investigated: MC1R, AGRP, and TYRP1. Of these three genes, TYRP1 was identified as a strong candidate for association with black-brown plumage variation in sympatric populations, and hence as a strong candidate for a gene contributing to pre-zygotic isolation (IV). In sympatric areas, where pied and collared flycatchers have overlapping breeding areas, hybridization sometimes occurs, leading to the production of unfit hybrids. Using a proteomic approach, a novel expression pattern was revealed in hybrids compared to the parental species (V), and the differentially expressed proteins were subsequently identified by sequence similarity (VI). In conclusion, the Z chromosome appears to play an important role in flycatcher speciation, but probably not at the coding level. In addition, the novel expression patterns might give new insights into the maladaptive hybrids.