974 results for Expressed sequence tag analysis
Abstract:
Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After linking the amino acids during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any other previous assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed length amino acid words (tuples). After creating the initial relative frequency histograms, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains like protein sequence/proteome analysis.
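As an illustration of the first step described in this abstract, relative frequency histograms of fixed-length amino acid words can be sketched as below. This is a minimal sketch, not the authors' implementation: the function name `word_frequencies`, the use of overlapping words, and the toy sequence are all assumptions made for the example.

```python
from collections import Counter


def word_frequencies(sequence, k=2):
    """Relative frequency of each overlapping k-letter word in a sequence.

    Assumption: words overlap (step of 1); the abstract does not say
    whether the original method uses overlapping tuples.
    """
    words = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    if not words:
        return {}
    counts = Counter(words)
    total = len(words)
    return {word: n / total for word, n in counts.items()}


# Hypothetical toy sequence, not taken from the reference datasets.
freqs = word_frequencies("MKVLMKVA", k=2)
for word, freq in sorted(freqs.items()):
    print(word, round(freq, 3))
```

The resulting dictionary is the histogram in sparse form: words that never occur are simply absent, which keeps the structure small for larger word lengths.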
Abstract:
We have used massively parallel signature sequencing (MPSS) to sample the transcriptomes of 32 normal human tissues to an unprecedented depth, thus documenting the patterns of expression of almost 20,000 genes with high sensitivity and specificity. The data confirm the widely held belief that differences in gene expression between cell and tissue types are largely determined by transcripts derived from a limited number of tissue-specific genes, rather than by combinations of more promiscuously expressed genes. Expression of a little more than half of all known human genes seems to account for both the common requirements and the specific functions of the tissues sampled. A classification of tissues based on patterns of gene expression largely reproduces classifications based on anatomical and biochemical properties. The unbiased sampling of the human transcriptome achieved by MPSS supports the idea that most human genes have been mapped, if not functionally characterized. This data set should prove useful for the identification of tissue-specific genes, for the study of global changes induced by pathological conditions, and for the definition of a minimal set of genes necessary for basic cell maintenance. The data are available on the Web at http://mpss.licr.org and http://sgb.lynxgen.com.
Abstract:
BACKGROUND: Cancer/testis (CT) genes are normally expressed only in germ cells, but can be activated in the cancer state. This unusual property, together with the finding that many CT proteins elicit an antigenic response in cancer patients, has established a role for this class of genes as targets in immunotherapy regimes. Many families of CT genes have been identified in the human genome, but their biological function for the most part remains unclear. While it has been shown that some CT genes are under diversifying selection, this question has not been addressed before for the class as a whole. RESULTS: To shed more light on this interesting group of genes, we exploited the generation of a draft chimpanzee (Pan troglodytes) genomic sequence to examine CT genes in an organism that is closely related to human, and generated a high-quality, manually curated set of human:chimpanzee CT gene alignments. We find that the chimpanzee genome contains homologues to most of the human CT families, and that the genes are located on the same chromosome and at a similar copy number to those in human. Comparison of putative human:chimpanzee orthologues indicates that CT genes located on chromosome X are diverging faster and are undergoing stronger diversifying selection than those on the autosomes or than a set of control genes on either chromosome X or autosomes. CONCLUSION: Given their high level of diversifying selection, we suggest that CT genes are primarily responsible for the observed rapid evolution of protein-coding genes on the X chromosome.
Abstract:
"The host-parasite relationship" is a vast and diverse research field which, despite huge human and financial input over many years, remains largely shrouded in mystery. Clearly, the adaptation of parasites to their different host species, and to the different environmental stresses that they represent, depends on interactions with, and responses to, various molecules of host and/or parasite origin. The schistosome genome project is a primary strategy to reach this goal; this systematic research project has successfully developed novel technologies for qualitative and quantitative characterization of schistosome genes and genome organization through extensive international collaboration between top quality laboratories. Schistosomes are a family of parasitic blood flukes (Phylum Platyhelminthes), which have seven pairs of autosomal chromosomes and one pair of sex chromosomes (ZZ for a male worm and ZW for a female), with a haploid genome size of 2.7 × 10^8 base pairs (Simpson et al. 1982). Schistosomes are ideal model organisms for the development of genome mapping strategies since they have a small genome size comparable to that of well-characterized model organisms such as Caenorhabditis elegans (100 Mb) and Drosophila (165 Mb), and contain functional genes with a high level of homology to the host mammalian genes. Here we summarize the current progress in the schistosome genome project: information on 3,047 transcribed genes (Expressed Sequence Tags; EST), and complete sets of cDNA and genomic DNA libraries (including YAC and cosmid libraries) with a technique for mapping them to the well-defined schistosome chromosomes. In the near future, the schistosome genome project will further identify and characterize the key molecules that are responsible for host-parasite adaptation, i.e., successful growth, development, maturation and reproduction of the parasite within its host.
Abstract:
Strategies to construct the physical map of the Trypanosoma cruzi nuclear genome have to capitalize on three main advantages of the parasite genome, namely (a) its small size, (b) the fact that all chromosomes can be defined, and many of them can be isolated by pulsed-field gel electrophoresis, and (c) the fact that simple Southern blots of electrophoretic karyotypes can be used to map sequence tagged sites and expressed sequence tags to chromosomal bands. A major drawback to cope with is the complexity of T. cruzi genetics, which hinders the construction of a comprehensive genetic map. As a first step towards physical mapping, we report the construction and partial characterization of a T. cruzi CL-Brener genomic library in yeast artificial chromosomes (YACs) that consists of 2,770 individual YACs with a mean insert size of 365 kb, encompassing around 10 genomic equivalents. Two libraries in bacterial artificial chromosomes (BACs) have been constructed, BAC I and BAC II. Both libraries represent about three genome equivalents. A third BAC library (BAC III) is being constructed. YACs and BACs are invaluable tools for physical mapping. More generally, they have to be considered as a common resource for research in Chagas disease.
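The library coverage quoted in this abstract follows from simple arithmetic: total cloned DNA divided by the genome size. The sketch below works through the YAC figures. The abstract does not state the genome size, so the value derived here is only what the quoted numbers imply, and the helper name `genome_equivalents` is hypothetical.

```python
def genome_equivalents(n_clones, mean_insert_kb, genome_size_kb):
    """Library coverage: total cloned DNA divided by the genome size."""
    return n_clones * mean_insert_kb / genome_size_kb


# Figures quoted in the abstract.
n_yacs = 2770
mean_insert_kb = 365
stated_coverage = 10  # "around 10 genomic equivalents"

total_insert_kb = n_yacs * mean_insert_kb  # total DNA held in the library

# Assumption: the genome size is back-calculated from the stated coverage;
# it is not given in the abstract itself.
implied_genome_kb = total_insert_kb / stated_coverage

print(f"total cloned DNA: {total_insert_kb / 1000:.0f} Mb")
print(f"implied genome size: {implied_genome_kb / 1000:.0f} Mb")
```

Since the abstract says "around 10", the implied genome size should be read as a rough order of magnitude rather than a measurement.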
Abstract:
Random single pass sequencing of cDNA fragments, also known as generation of Expressed Sequence Tags (ESTs), has been highly successful in the study of the gene content of higher organisms, and forms an integral part of most genome projects, with the objective of identifying new genes and targets for disease control and prevention and of generating mapping probes. In the Trypanosoma cruzi genome project, EST sequencing has also been a starting point, and here we report data on the first 797 sequences obtained, partly from a CL Brener epimastigote non-normalized library and partly from a normalized library. Only around 30% of the sequences obtained showed similarity with sequences in the GenBank and dbEST databases, and half of these matched sequences already reported for T. cruzi.
Abstract:
The epidemiologic typing of bacterial pathogens can be applied to answer a number of different questions: in the case of an outbreak, what is the extent and mode of transmission of the epidemic clone(s)? In the case of long-term surveillance, what is the prevalence over time and the geographic spread of epidemic and endemic clones in the population? A number of molecular typing methods can be used to classify bacteria based on genomic diversity into groups of closely related isolates (presumed to arise from a common ancestor in the same chain of transmission) and divergent, epidemiologically unrelated isolates (arising from independent sources of infection). Ribotyping, IS-RFLP fingerprinting, macrorestriction analysis of chromosomal DNA and PCR-fingerprinting using arbitrary sequence or repeat element primers are useful methods for outbreak investigations and regional surveillance. Library typing systems based on multilocus sequence-based analysis and strain-specific probe hybridization schemes are in development for the international surveillance of major pathogens like Mycobacterium tuberculosis. Accurate epidemiological interpretation of data obtained with molecular typing systems still requires additional research on the evolution rate of polymorphic loci in bacterial pathogens.
Abstract:
During my PhD, I used model species of vertebrates, such as mouse and zebrafish, to study factors affecting the evolution of genes and their expression. More precisely, I have shown that anatomy and development are key factors to take into account, influencing the rate of gene sequence evolution, the impact of mutations (i.e. is the deletion of a gene lethal?), and the propensity of a gene to duplicate. Where and when genes are expressed imposes constraints, or on the contrary leaves them some opportunity to evolve. We analyzed these patterns in relation to classical models of morphological evolution in vertebrates, which were previously thought to directly reflect constraints on the genomes. We showed that the patterns of evolution at these two levels of organization do not translate smoothly: there is no direct link between the conservation of genotype and that of phenotypes such as morphology. This work was made possible by the development of bioinformatics tools. Notably, I worked on the development of the database Bgee, which aims at comparing gene expression between different species in an automated and large-scale way. This involves the formalization of anatomy, development, and concepts related to homology, through the use of ontologies. A coherent integration of heterogeneous expression data (microarray, expressed sequence tags, in situ hybridizations) was also required. This database is regularly updated and freely available. It should help extend the possibilities for comparison of gene expression between species in evo-devo and genomics studies.
Abstract:
For many applications in population genetics, codominant simple sequence repeats (SSRs) may have substantial advantages over dominant anonymous markers such as amplified fragment length polymorphisms (AFLPs). In high polyploids, however, allele dosage of SSRs cannot easily be determined and alleles are not easily attributable to potentially diploidized loci. Here, we argue that SSRs may nonetheless be better than AFLPs for polyploid taxa if they are analyzed as effectively dominant markers because they are more reliable and more precise. We describe the transfer of SSRs developed for diploid Mercurialis huetii to the clonal dioecious M. perennis. Primers were tested on a set of 54 male and female plants from natural decaploid populations. Eight of 65 tested loci produced polymorphic fragments. Binary profiles from 4 different scoring routines were used to define multilocus lineages (MLLs). Allowing for fragment differences within 1 MLL, all analyses revealed the same 14 MLLs without conflicting with merigenet, sex, or plot assignment. For semiautomatic scoring, a combination of as few as 2 of the 4 most polymorphic loci resulted in unambiguous discrimination of clones. Our study demonstrates that microsatellite fingerprinting of polyploid plants is a cost efficient and reliable alternative to AFLPs, not least because fewer loci are required than for diploids.
Abstract:
Rho GTPases are conformational switches that control a wide variety of signaling pathways critical for eukaryotic cell development and proliferation. They represent attractive targets for drug design as their aberrant function and deregulated activity are associated with many human diseases including cancer. Extensive high-resolution structures (>100) and recent mutagenesis studies have laid the foundation for the design of new structure-based chemotherapeutic strategies. Although the inhibition of Rho signaling with drug-like compounds is an active area of current research, very little attention has been devoted to directly inhibiting Rho by targeting potential allosteric non-nucleotide binding sites. By avoiding the nucleotide binding site, compounds may minimize the potential for undesirable off-target interactions with other ubiquitous GTP and ATP binding proteins. Here we describe the application of molecular dynamics simulations, principal component analysis, sequence conservation analysis, and ensemble small-molecule fragment mapping to provide an extensive mapping of potential small-molecule binding pockets on Rho family members. Characterized sites include novel pockets in the vicinity of the conformationally responsive switch regions as well as distal sites that appear to be related to the conformations of the nucleotide binding region. Furthermore, the use of accelerated molecular dynamics simulation, an advanced sampling method that extends the accessible time-scale of conventional simulations, is found to enhance the characterization of novel binding sites when conformational changes are important for the protein mechanism.
Abstract:
This research report concerns the post-doctoral activities conducted between September 2010 and March 2011 at the University Pompeu Fabra, Barcelona. It seeks to identify the consequences of the convergence phenomenon for photojournalism. Thus, in a more general approach, the effort is to recover the structural elements of the convergence concept in journalism. It also aims to map the current debates about the repositioning of photographic practices linked to news production amid the widespread adoption of digital devices in contemporary workflows. It further addresses the analysis of photographic collectives as a result of the convergence framework applied to photojournalism; the debate on ways of funding; and alternatives facing the alleged crisis of press photography. Finally, it proposes qualifying stages of development of photojournalism in the digital age, as well as hypotheses concerning the structure of productive routines. In addition, we present three cases to be analyzed in order to explore and verify the occurrence of characteristics that may identify the object of research in the state of practice. We close with a series of conclusions, revisiting the main hypotheses. With this strategy, it is possible to define a sequence of analysis capable of addressing the characteristics present in the studied cases and in future ones, and thus to affirm this stage as a step in the continuous historical course of photojournalism.
Abstract:
Bcl10, a caspase recruitment domain (CARD)-containing protein identified from a breakpoint in mucosa-associated lymphoid tissue (MALT) B lymphomas, is essential for antigen-receptor-mediated nuclear factor kappaB (NF-kappaB) activation in lymphocytes. We have identified a novel CARD-containing protein and interaction partner of Bcl10, named Carma1. Carma1 is predominantly expressed in lymphocytes and represents a new member of the membrane-associated guanylate kinase family. Carma1 binds Bcl10 via its CARD motif and induces translocation of Bcl10 from the cytoplasm into perinuclear structures. Moreover, expression of Carma1 induces phosphorylation of Bcl10 and activation of the transcription factor NF-kappaB. We propose that Carma1 is a crucial component of a novel Bcl10-dependent signaling pathway in T-cells that leads to the activation of NF-kappaB.
Abstract:
Genetic caste determination has been described in two populations of Pogonomyrmex harvester ants, each comprising a pair of interbreeding lineages. Queens mate with males of their own and of the alternate lineage and produce two types of diploid offspring: those fertilized by males of the queens' lineage, which develop into queens, and those fertilized by males of the other lineage, which develop into workers. Each of the lineages has been shown to be itself of hybrid origin between the species Pogonomyrmex barbatus and Pogonomyrmex rugosus, which both have typical, environmentally determined caste differentiation. In a large scale genetic survey across 35 sites in Arizona, New Mexico and Texas, we found that genetic caste determination associated with pairs of interbreeding lineages occurred frequently (in 26 out of the 35 sites). Overall, we identified eight lineages with genetic caste determination that always co-occurred in the same complementary lineage pairs. Three of the four lineage pairs appear to have a common origin while their relationship with the fourth remains unclear. The level of genetic differentiation among these eight lineages was significantly higher than the differentiation between P. rugosus and P. barbatus, which calls into question the appropriate taxonomic status of these genetic lineages. In addition to being genetically isolated from one another, all lineages with genetic caste determination were genetically distinct from P. rugosus and P. barbatus, even when colonies of interbreeding lineages co-occurred with colonies of either putative parent at the same site. Such nearly complete reproductive isolation between the lineages and the species with environmental caste determination might prevent the genetic caste determination system from being swept away by gene flow.
Abstract:
BACKGROUND: Analysis of DNA sequence polymorphisms can provide valuable information on the evolutionary forces shaping nucleotide variation, and provides an insight into the functional significance of genomic regions. The ongoing genome projects will radically improve our capabilities to detect specific genomic regions shaped by natural selection. Currently available methods and software, however, are unsatisfactory for such genome-wide analysis. RESULTS: We have developed methods for the analysis of DNA sequence polymorphisms at the genome-wide scale. These methods, which have been tested on coalescent-simulated data and on actual data files from mouse and human, have been implemented in the VariScan software package version 2.0. Additionally, we have incorporated a graphical user interface. The main features of this software are: i) exhaustive population-genetic analyses including those based on the coalescent theory; ii) analysis adapted to the shallow data generated by the high-throughput genome projects; iii) use of genome annotations to conduct comprehensive analyses separately for different functional regions; iv) identification of relevant genomic regions by the sliding-window and wavelet-multiresolution approaches; v) visualization of the results integrated with current genome annotations in commonly available genome browsers. CONCLUSION: VariScan is a powerful and flexible suite of software for the analysis of DNA polymorphisms. The current version implements new algorithms, methods, and capabilities, providing an important tool for an exhaustive exploratory analysis of genome-wide DNA polymorphism data.
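To make the sliding-window idea in feature iv) concrete, the sketch below computes per-site nucleotide diversity (mean pairwise differences) in successive windows along an alignment. It is a generic illustration under stated assumptions, not VariScan's code or interface: the function names, the toy alignment, and the choice to ignore gaps and missing data are all hypothetical simplifications.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site among equal-length aligned sequences.

    Assumption: no gaps or missing data; every mismatch counts as one difference.
    """
    if len(seqs) < 2 or not seqs[0]:
        return 0.0
    diffs = 0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        diffs += sum(x != y for x, y in zip(a, b))
        n_pairs += 1
    return diffs / (n_pairs * len(seqs[0]))


def sliding_windows(seqs, window, step):
    """Return (window start, diversity) for each full window along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, nucleotide_diversity(chunk)))
    return results


# Hypothetical toy alignment of three sequences.
alignment = ["ACGTACGT", "ACGTACGA", "ACCTACGA"]
profile = sliding_windows(alignment, window=4, step=4)
for start, pi in profile:
    print(start, round(pi, 4))
```

Windows with unusually high or low values relative to the rest of the profile are the kind of region such a scan would flag for closer inspection; with `step` smaller than `window` the windows overlap and the profile becomes smoother.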