79 results for landscape genomics
Abstract:
Progress in genomics, in particular high-throughput next-generation sequencing, is revolutionizing oncology. The impact of these techniques is seen, on the one hand, in the identification of germline mutations that predispose to a given type of cancer, allowing personalized care of patients or healthy carriers, and, on the other hand, in the characterization of all acquired somatic mutations of the tumor cell, opening the door to personalized treatment targeting the driver oncogenes. In both cases, next-generation sequencing techniques allow a global approach whereby the full set of genome mutations is analyzed and correlated with the clinical data. The benefits for the quality of care delivered to our patients are considerable.
Abstract:
Speciation is a fundamental evolutionary process, the knowledge of which is crucial for understanding the origins of biodiversity. Genomic approaches are an increasingly important aspect of this research field. We review current understanding of genome-wide effects of accumulating reproductive isolation and of genomic properties that influence the process of speciation. Building on this work, we identify emergent trends and gaps in our understanding, propose new approaches to more fully integrate genomics into speciation research, translate speciation theory into hypotheses that are testable using genomic tools and provide an integrative definition of the field of speciation genomics.
Abstract:
Animal dispersal in a fragmented landscape depends on the complex interaction between landscape structure and animal behavior. To better understand how individuals disperse, it is important to explicitly represent the properties of organisms and the landscape in which they move. A common approach to modelling dispersal represents the landscape as a grid of equal-sized cells and then simulates individual movement as a correlated random walk. This approach uses an a priori scale of resolution, which limits both the representation of landscape features and the way different dispersal abilities can be modelled. We develop a vector-based landscape model coupled with an object-oriented model for animal dispersal. In this spatially explicit dispersal model, landscape features are defined based on their geographic and thematic properties, and dispersal is modelled through consideration of an organism's behavior, movement rules and searching strategies (such as visual cues). We present the model's underlying concepts and its ability to adequately represent landscape features and to simulate dispersal according to different dispersal abilities. We demonstrate the potential of the model by simulating two virtual species in a real Swiss landscape. This illustrates the model's ability to simulate complex dispersal processes and provides information about dispersal such as colonization probability and the spatial distribution of the organism's path.
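For readers unfamiliar with the grid-based baseline that this abstract contrasts with its vector-based model, the following is a minimal sketch of a correlated random walk on a square grid. It is not the authors' model. The function name, the Gaussian turning angle (`turn_sd`), the fixed step length and the reflecting boundary are all illustrative assumptions.

```python
import math
import random


def correlated_random_walk(n_steps, grid_size, start, turn_sd=0.5,
                           step_len=1.0, seed=None):
    """Simulate one individual on a grid_size x grid_size grid.

    Hypothetical sketch: each step keeps the previous heading plus a small
    Gaussian turning angle, which is what makes the walk "correlated".
    Returns the list of visited grid cells (one entry per time step).
    """
    rng = random.Random(seed)
    x, y = start
    heading = rng.uniform(0.0, 2.0 * math.pi)
    path = [(int(x), int(y))]
    for _ in range(n_steps):
        # New heading = old heading + small random turn.
        heading += rng.gauss(0.0, turn_sd)
        nx = x + step_len * math.cos(heading)
        ny = y + step_len * math.sin(heading)
        if 0 <= nx < grid_size and 0 <= ny < grid_size:
            x, y = nx, ny
        else:
            # Assumed boundary rule: stay in place and turn around.
            heading += math.pi
        path.append((int(x), int(y)))
    return path


if __name__ == "__main__":
    walk = correlated_random_walk(100, 50, (25.5, 25.5), seed=1)
    print(len(set(walk)), "distinct cells visited")
```

The cell size fixes the resolution of every landscape feature in advance, which is the limitation the abstract attributes to this approach.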
Abstract:
DNA-binding proteins mediate a variety of crucial molecular functions, such as transcriptional regulation and chromosome maintenance, replication and repair, which in turn control cell division and differentiation. The roles of these proteins in disease are currently being investigated using microarray-based approaches. However, these assays can be difficult to adapt to routine diagnosis of complex diseases such as cancer. Here, we review promising alternative approaches involving protein-binding microarrays (PBMs) that probe the interaction of proteins from crude cell or tissue extracts with large collections of synthetic or natural DNA sequences. Recent studies have demonstrated the use of these novel PBM approaches to provide rapid and unbiased characterization of DNA-binding proteins as molecular markers of disease, for example cancer progression or infectious diseases.
Abstract:
BACKGROUND: Fourmidable is an infrastructure to curate and share the emerging genetic, molecular, and functional genomic data and protocols for ants. DESCRIPTION: The Fourmidable assembly pipeline groups nucleotide sequences into clusters before independently assembling each cluster. Subsequently, assembled sequences are annotated via InterProScan and BLAST against general and insect-specific databases. Gene-specific information can be retrieved using gene identifiers, searching for similar sequences or browsing through inferred Gene Ontology annotations. The database will readily scale as ultra-high throughput sequence data and sequences from additional species become available. CONCLUSION: Fourmidable currently houses EST data from two ant species and microarray gene expression data for one of these. Fourmidable is publicly available at http://fourmidable.unil.ch.
Abstract:
Using an original investigative protocol and a database of 4,127 national delegates from ten Moroccan political organizations, surveyed between 2008 and 2012, this article examines the characteristics of party members in Morocco. Initial results indicate that the field of Moroccan political parties is a small world dominated by city dwellers, mature men, and the most highly educated, wealthiest individuals. However, far from being isolated from ordinary citizens, there are social dynamics at work. While it cannot be reduced to a segmented clientele, it is, nonetheless, shaped by an ideal-typical opposition between parties of notables and parties of activists.
Abstract:
The human body is composed of a huge number of cells acting together in a concerted manner. The current understanding is that proteins perform most of the activities necessary to keep a cell alive. The DNA, on the other hand, stores in the genome the information on how to produce the different proteins. Regulating gene transcription is thus the first important step that can affect the life of a cell and modify its functions and its responses to the environment. Regulation is a complex operation that involves specialized proteins, the transcription factors. Transcription factors (TFs) can bind to DNA and activate the processes leading to the expression of genes into new proteins. Errors in this process may lead to diseases. In particular, some transcription factors have been associated with a lethal pathological state, commonly known as cancer, associated with uncontrolled cellular proliferation, invasiveness of healthy tissues and abnormal responses to stimuli. Understanding cancer-related regulatory programs is a difficult task, often involving several TFs interacting together and influencing each other's activity. This thesis presents new computational methodologies to study gene regulation. In addition, we present applications of our methods to the understanding of cancer-related regulatory programs. The understanding of transcriptional regulation is a major challenge. We address this difficult question by combining computational approaches with large collections of heterogeneous experimental data. In detail, we design signal processing tools to recover transcription factor binding sites on the DNA from genome-wide surveys such as chromatin immunoprecipitation assays on tiling arrays (ChIP-chip). We then use the information about where TFs bind to explain expression levels of regulated genes. In this way we identify a regulatory synergy between two TFs, the oncogene C-MYC and SP1. 
C-MYC and SP1 bind preferentially at promoters, and when SP1 binds next to C-MYC on the DNA, the nearby gene is strongly expressed. The association between the two TFs at promoters is reflected by the conservation of the binding sites across mammals and by the permissive underlying chromatin states; it represents an important control mechanism involved in cellular proliferation and thereby in cancer. Secondly, we identify the characteristics of TF estrogen receptor alpha (hERa) target genes and we study the influence of hERa in regulating transcription. hERa, upon hormone estrogen signaling, binds to DNA to regulate transcription of its targets in concert with its co-factors. To overcome the scarcity of experimental data about the binding sites of other TFs that may interact with hERa, we conduct in silico analysis of the sequences underlying the ChIP sites using the collection of position weight matrices (PWMs) of hERa partners, TFs FOXA1 and SP1. We combine ChIP-chip and ChIP-paired-end-diTags (ChIP-pet) data about hERa binding on DNA with the sequence information to explain gene expression levels in a large collection of cancer tissue samples and also in studies of the response of cells to estrogen. We confirm that hERa binding sites are distributed throughout the genome. However, we distinguish between binding sites near promoters and binding sites along the transcripts. The first group shows weak binding of hERa and high occurrence of SP1 motifs, in particular near estrogen responsive genes. The second group shows strong binding of hERa and significant correlation between the number of binding sites along a gene and the strength of gene induction in presence of estrogen. Some binding sites of the second group also show presence of FOXA1, but the role of this TF still needs to be investigated. Different mechanisms have been proposed to explain hERa-mediated induction of gene expression. 
Our work supports the model of hERa activating gene expression from distal binding sites by interacting with promoter-bound TFs, like SP1. hERa has been associated with survival rates of breast cancer patients, though explanatory models are still incomplete: this result is important to better understand how hERa can control gene expression. Thirdly, we address the difficult question of regulatory network inference. We tackle this problem by analyzing time-series of biological measurements such as quantification of mRNA levels or protein concentrations. Our approach uses the well-established penalized linear regression models, where we impose sparseness on the connectivity of the regulatory network. We extend this method by enforcing the coherence of the regulatory dependencies: a TF must coherently behave as an activator, or a repressor, on all its targets. This requirement is implemented as constraints on the signs of the regressed coefficients in the penalized linear regression model. Our approach is better at reconstructing meaningful biological networks than previous methods based on penalized regression. The method is tested on the DREAM2 challenge of reconstructing a five-gene/TF regulatory network, obtaining the best performance in the "undirected signed excitatory" category. Thus, these bioinformatics methods, which are reliable, interpretable and fast enough to cover large biological datasets, have enabled us to better understand gene regulation in humans.
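The sign-constrained penalized regression described in this abstract can be illustrated with a small sketch. It is not the thesis implementation. It assumes an L1 (lasso) penalty solved by coordinate descent for a single target gene, and it assumes that the sign of each regulator (+1 activator, -1 repressor, 0 unconstrained) is given in advance, which the abstract does not state. The function name and parameters are hypothetical.

```python
import math


def sign_constrained_lasso(X, y, signs, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 with sign constraints.

    X: list of rows (samples x regulators), y: target expression values.
    signs[j] > 0 forces w[j] >= 0, signs[j] < 0 forces w[j] <= 0,
    signs[j] == 0 leaves w[j] unconstrained (plain lasso update).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, with w = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of regulator j with the partial residual.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j])
                      for i in range(n)) / n
            if signs[j] > 0:      # activator: clip at zero from below
                new = max(rho - lam, 0.0) / col_sq[j]
            elif signs[j] < 0:    # repressor: clip at zero from above
                new = min(rho + lam, 0.0) / col_sq[j]
            else:                 # unconstrained soft-thresholding
                new = math.copysign(max(abs(rho) - lam, 0.0), rho) / col_sq[j]
            delta = new - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new
    return w
```

In a full network, the same sign vector would be shared across all target genes of a TF, which is one way to express the coherence requirement; how the thesis chooses or learns the signs is not specified here.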
Abstract:
Functional connectivity affects demography and gene dynamics in fragmented populations. Besides species-specific dispersal ability, the connectivity between local populations is affected by the landscape elements encountered during dispersal. Documenting these effects is thus a central issue for the conservation and management of fragmented populations. In this study, we compare the power and accuracy of three methods (partial correlations, regressions and Approximate Bayesian Computation) that use genetic distances to infer the effect of landscape upon dispersal. We use stochastic individual-based simulations of fragmented populations surrounded by landscape elements that differ in their permeability to dispersal. The power and accuracy of all three methods are good when there is a strong contrast between the permeability of different landscape elements. The power and accuracy can be further improved by restricting analyses to adjacent pairs of populations. Landscape elements that strongly impede dispersal are the easiest to identify. However, power and accuracy decrease drastically when landscape complexity increases and the contrast between the permeability of landscape elements decreases. We provide guidelines for future studies and underline the need to evaluate or develop approaches that are more powerful.
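As a rough illustration of the first of the three methods, the sketch below computes a first-order partial correlation between pairwise genetic distances and a landscape-based distance while controlling for geographic distance. The variable roles and function names are assumptions for illustration. The study's actual procedure is not reproduced here, and significance for pairwise distances is usually assessed by permutation rather than from this coefficient alone.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z.

    Here x could hold genetic distances, y landscape (cost) distances and
    z straight-line geographic distances, one value per population pair.
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

A coefficient near zero would suggest that the landscape distance adds little beyond plain geographic distance for the population pairs considered.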
Abstract:
Pneumocystis jirovecii is a fungus causing severe pneumonia in immunocompromised patients. Progress in understanding its pathogenicity and epidemiology has been hampered by the lack of a long-term in vitro culture method. Obligate parasitism of this pathogen has been suggested on the basis of various features but remains controversial. We analysed the 7.0 Mb draft genome sequence of the closely related species Pneumocystis carinii infecting rats, which is a well-established experimental model of the disease. We predicted 8,085 (redundant) peptides, and 14.9% of them were mapped onto the KEGG biochemical pathways. The proteome of the closely related yeast Schizosaccharomyces pombe was used as a control for the annotation procedure (4,974 genes, 14.1% mapped). About two thirds of the mapped peptides of each organism (65.7% and 73.2%, respectively) corresponded to crucial enzymes for the basal metabolism and standard cellular processes. However, the proportion of P. carinii genes relative to those of S. pombe was significantly smaller for the "amino acid metabolism" category of pathways than for all other categories taken together (40 versus 114 against 278 versus 427, P<0.002). Importantly, we identified in P. carinii only 2 enzymes specifically dedicated to the synthesis of the 20 standard amino acids. By contrast, all 54 enzymes dedicated to this synthesis reported in the KEGG atlas for S. pombe were detected upon reannotation of the S. pombe proteome (2 versus 54 against 278 versus 427, P<0.0001). This finding strongly suggests that species of the genus Pneumocystis scavenge amino acids from their host's lung environment. Consequently, they would have no form able to live independently of another organism, and these parasites would be obligate in addition to being opportunistic. These findings have implications for the management of patients susceptible to P. jirovecii infection, given that the only source of infection would be other humans.
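The two comparisons quoted in this abstract are 2x2 contingency tables (pathway category by organism). The abstract does not say which test produced the P values, so the sketch below uses a two-sided Fisher's exact test purely as an assumed, standard choice for such tables. The function name and the table layout are illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    p_value = sum(p for p in (prob(k) for k in range(k_min, k_max + 1))
                  if p <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Rows: amino acid metabolism, all other categories.
    # Columns: P. carinii, S. pombe (counts taken from the abstract).
    print(fisher_exact_two_sided(40, 114, 278, 427))
    print(fisher_exact_two_sided(2, 54, 278, 427))
```

Because the original test is unknown, the values printed here need not match the reported thresholds exactly; the sketch only shows how such a comparison can be set up.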
Abstract:
Pneumocystis jirovecii is a fungal parasite that specifically colonizes humans and turns into an opportunistic pathogen in immunodeficient individuals. The fungus is able to reproduce extracellularly in host lungs without eliciting massive cellular death. The molecular mechanisms that govern this process are poorly understood, in part because of the lack of an in vitro culture system for Pneumocystis spp. In this study, we explored the origin and evolution of the putative biotrophy of P. jirovecii through comparative genomics and reconstruction of ancestral gene repertoires. We used the maximum parsimony method and genomes of related fungi of the Taphrinomycotina subphylum. Our results suggest that the last common ancestor of Pneumocystis spp. lost 2,324 genes in relation to the acquisition of obligate biotrophy. These losses may result from neutral drift and affect the biosyntheses of amino acids and thiamine, the assimilation of inorganic nitrogen and sulfur, and the catabolism of purines. In addition, P. jirovecii shows a reduced panel of lytic proteases and has lost the RNA interference machinery, which might contribute to its genome plasticity. Together with other characteristics, namely a sexual life cycle within the host, the absence of massive destruction of host cells, difficult culturing, and the lack of virulence factors, these gene losses constitute a unique combination of characteristics that are hallmarks of both obligate biotrophs and animal parasites. These findings suggest that Pneumocystis spp. should be considered as the first described obligate biotrophs of animals, whose evolution has been marked by gene losses.
Abstract:
In plants, an oligogene family encodes NADP-malic enzymes (NADP-me), which are responsible for various functions and exhibit different kinetics and expression patterns. In particular, a chloroplast isoform of NADP-me plays a key role in one of the three biochemical subtypes of C4 photosynthesis, an adaptation to warm environments that evolved several times independently during angiosperm diversification. By combining genomic and phylogenetic approaches, this study aimed at identifying the molecular mechanisms linked to the recurrent evolutions of C4-specific NADP-me in grasses (Poaceae). Genes encoding NADP-me (nadpme) were retrieved from genomes of model grasses and isolated from a large sample of C3 and C4 grasses. Genomic and phylogenetic analyses showed that 1) the grass nadpme gene family is composed of four main lineages, one of which is expressed in plastids (nadpme-IV), 2) C4-specific NADP-me evolved at least five times independently from nadpme-IV, and 3) some codons driven by positive selection underwent parallel changes during the multiple C4 origins. The C4 NADP-me being expressed in chloroplasts probably constrained its recurrent evolutions from the only plastid nadpme lineage and this common starting point limited the number of evolutionary paths toward a C4 optimized enzyme, resulting in genetic convergence. In light of the history of nadpme genes, an evolutionary scenario of the C4 phenotype using NADP-me is discussed.
Abstract:
During the genomic era, a large number of whole-genome sequences accumulated, revealing many hypothetical proteins of unknown function. Functional genomics, the research domain that assigns a function to a given gene product, developed rapidly in response. Functional genomics of intracellular pathogenic bacteria has specific peculiarities due to the fastidious growth of most of these intracellular micro-organisms, the close interaction with the host cell, the risk of contamination of experiments with host cell proteins and, for some strict intracellular bacteria such as Chlamydia, the absence of a simple genetic system to manipulate the bacterial genome. To identify virulence factors of intracellular pathogenic bacteria, functional genomics often relies on bioinformatic comparisons with model organisms such as Escherichia coli and Bacillus subtilis. The use of heterologous expression is another common approach. Given the intracellular lifestyle and the many effectors that intracellular bacteria use to corrupt host cell functions, functional genomics also often targets the identification of new effectors such as those of the T4SS of Brucella and Legionella.
Abstract:
Studies of the structural basis of protein thermostability have produced a confusing picture. Small sets of proteins have been analyzed from a variety of thermophilic species, suggesting different structural features as responsible for protein thermostability. Taking advantage of the recent advances in structural genomics, we have compiled a relatively large protein structure dataset, which was constructed very carefully and selectively; that is, the dataset contains only experimentally determined structures of proteins from one specific organism, the hyperthermophilic bacterium Thermotoga maritima, and those of close homologs from mesophilic bacteria. In contrast to the conclusions of previous studies, our analyses show that oligomerization order, hydrogen bonds, and secondary structure play minor roles in adaptation to hyperthermophily in bacteria. On the other hand, the data exhibit very significant increases in the density of salt bridges and in compactness for proteins from T. maritima. The latter effect can be measured by contact order or solvent accessibility, and network analysis shows a specific increase in highly connected residues in this thermophile. These features account for changes in 96% of the protein pairs studied. Our results provide a clear picture of protein thermostability in one species, and a framework for future studies of thermal adaptation.