903 results for Human genome - Theses
Abstract:
The central dogma of biology holds that genetic information normally flows from DNA to RNA to protein. As a consequence it has been generally assumed that genes generally code for proteins, and that proteins fulfil not only most structural and catalytic but also most regulatory functions, in all cells, from microbes to mammals. However, the latter may not be the case in complex organisms. A number of startling observations about the extent of non-protein-coding RNA (ncRNA) transcription in the higher eukaryotes and the range of genetic and epigenetic phenomena that are RNA-directed suggest that the traditional view of the structure of genetic regulatory systems in animals and plants may be incorrect. ncRNA dominates the genomic output of the higher organisms and has been shown to control chromosome architecture, mRNA turnover and the developmental timing of protein expression, and may also regulate transcription and alternative splicing. This paper re-examines the available evidence and suggests a new framework for considering and understanding the genomic programming of biological complexity, autopoietic development and phenotypic variation. BioEssays 25:930-939, 2003. (C) 2003 Wiley Periodicals, Inc.
Abstract:
The chromodomain is 40-50 amino acids in length and is conserved in a wide range of chromatic and regulatory proteins involved in chromatin remodeling. Chromodomain-containing proteins can be classified into families based on their broader characteristics, in particular the presence of other types of domains, and which correlate with different subclasses of the chromodomains themselves. Hidden Markov model (HMM)-generated profiles of different subclasses of chromodomains were used here to identify sequences encoding chromodomain-containing proteins in the mouse transcriptome and genome. A total of 36 different loci encoding proteins containing chromodomains, including 17 novel loci, were identified. Six of these loci (including three apparent pseudogenes, a novel HP1 ortholog, and two novel Msl-3 transcription factor-like proteins) are not present in the human genome, whereas the human genome contains four loci (two CDY orthologs and two apparent CDY pseudogenes) that are not present in mouse. A number of these loci exhibit alternative splicing to produce different isoforms, including 43 novel variants, some of which lack the chromodomain. The likely functions of these proteins are discussed in relation to the known functions of other chromodomain-containing proteins within the same family.
Abstract:
The C2 domain is one of the most frequent and widely distributed calcium-binding motifs. Its structure comprises an eight-stranded beta-sandwich with two structural types as if the result of a circular permutation. Combining sequence, structural and modelling information, we have explored, at different levels of granularity, the functional characteristics of several families of C2 domains. At the coarsest level, the similarity correlates with key structural determinants of the C2 domain fold and, at the finest level, with the domain architecture of the proteins containing them, highlighting the functional diversity between the various subfamilies. The functional diversity appears as different conserved surface patches throughout this common fold. In some cases, these patches are related to substrate-binding sites whereas in others they correspond to interfaces of presumably permanent interaction between other domains within the same polypeptide chain. For those related to substrate-binding sites, the predictions overlap with biochemical data in addition to providing some novel observations. For those acting as protein-protein interfaces, our modelling analysis suggests that slight variations between families are a result of not only complementary adaptations in the interfaces involved but also different domain architecture. In the light of the sequence and structural genomic projects, the work presented here shows that modelling approaches along with careful sub-typing of protein families will be a powerful combination for a broader coverage in proteomics. (C) 2003 Elsevier Ltd. All rights reserved.
Abstract:
With the sequencing and annotation of genomes and transcriptomes of several eukaryotes, the importance of noncoding RNA (ncRNA; RNA molecules that are not translated to protein products) has become more evident. A subclass of ncRNA transcripts are encoded by highly regulated, multi-exon, transcriptional units, are processed like typical protein-coding mRNAs and are increasingly implicated in regulation of many cellular functions in eukaryotes. This study describes the identification of candidate functional ncRNAs from among the RIKEN mouse full-length cDNA collection, which contains 60,770 sequences, by using a systematic computational filtering approach. We initially searched for previously reported ncRNAs and found nine murine ncRNAs and homologs of several previously described nonmouse ncRNAs. Through our computational approach to filter artifact-free clones that lack protein coding potential, we extracted 4280 transcripts as the largest candidate set. Many clones in the set had EST hits, potential CpG islands surrounding the transcription start sites, and homologies with the human genome. This implies that many candidates are indeed transcribed in a regulated manner. Our results demonstrate that ncRNAs are a major functional subclass of processed transcripts in mammals.
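As a rough illustration of what filtering for "lack of protein coding potential" can look like, the sketch below flags a transcript as a candidate ncRNA when its longest forward-strand open reading frame is short. The abstract does not state the actual criteria, so the 100-codon cutoff and the function names here are assumptions chosen only for illustration.

```python
# Hypothetical cutoff: the abstract does not give the real criterion.
MIN_CODING_CODONS = 100

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq: str) -> int:
    """Return the length, in codons, of the longest forward-strand ORF.

    An ORF starts at ATG and runs to the codon before the first in-frame
    stop. An ORF that reaches the end of the sequence without a stop is
    still counted (it may be a truncated coding region).
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        run = 0
        in_orf = False
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if not in_orf:
                if codon == "ATG":
                    in_orf = True
                    run = 1
            elif codon in STOP_CODONS:
                best = max(best, run)
                in_orf = False
                run = 0
            else:
                run += 1
        if in_orf:
            best = max(best, run)
    return best


def is_candidate_ncrna(seq: str, min_codons: int = MIN_CODING_CODONS) -> bool:
    """Flag a transcript whose longest ORF is shorter than the cutoff."""
    return longest_orf_codons(seq) < min_codons
```

A real pipeline would combine such a test with other evidence (the abstract mentions EST hits, CpG islands and genome homology); ORF length alone is a crude proxy.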
Abstract:
The current RIKEN transcript set represents a significant proportion of the mouse transcriptome but transcripts expressed in the innate and acquired immune systems are poorly represented. In the present study we have assessed the complexity of the transcriptome expressed in mouse macrophages before and after treatment with lipopolysaccharide, a global regulator of macrophage gene expression, using existing RIKEN 19K arrays. By comparison to array profiles of other cells and tissues, we identify a large set of macrophage-enriched genes, many of which have obvious functions in endocytosis and phagocytosis. In addition, a significant number of LPS-inducible genes were identified. The data suggest that macrophages are a complex source of mRNA for transcriptome studies. To assess complexity and identify additional macrophage expressed genes, cDNA libraries were created from purified populations of macrophage and dendritic cells, a functionally related cell type. Sequence analysis revealed a high incidence of novel mRNAs within these cDNA libraries. These studies provide insights into the depths of transcriptional complexity still untapped amongst products of inducible genes, and identify macrophage and dendritic cell populations as a starting point for sampling the inducible mammalian transcriptome.
Abstract:
We analyzed the FANTOM2 clone set of 60,770 RIKEN full-length mouse cDNA sequences and 44,122 public mRNA sequences. We developed a new computational procedure to identify and classify the forms of splice variation evident in this data set and organized the results into a publicly accessible database that can be used for future expression array construction, structural genomics, and analyses of the mechanism and regulation of alternative splicing. Statistical analysis shows that at least 41% and possibly as much as 60% of multiexon genes in mouse have multiple splice forms. Of the transcription units with multiple splice forms, 49% contain transcripts in which the apparent use of an alternative transcription start (stop) is accompanied by alternative splicing of the initial (terminal) exon. This implies that alternative transcription may frequently induce alternative splicing. The fact that 73% of all exons with splice variation fall within the annotated coding region indicates that most splice variation is likely to affect the protein form. Finally, we compared the set of constitutive (present in all transcripts) exons with the set of cryptic (present only in some transcripts) exons and found statistically significant differences in their length distributions, the nucleotide distributions around their splice junctions, and the frequencies of occurrence of several short sequence motifs.
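The constitutive/cryptic distinction defined in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes each transcript of a transcription unit is already given as a collection of exon identifiers (here `(start, end)` tuples), and the function name is hypothetical.

```python
from typing import Hashable, Iterable, Set, Tuple


def classify_exons(
    transcripts: Iterable[Iterable[Hashable]],
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Split the exons of one transcription unit into two sets.

    constitutive: exons present in every transcript
    cryptic:      exons present in only some transcripts
    """
    exon_sets = [set(t) for t in transcripts]
    if not exon_sets:
        return set(), set()
    all_exons = set().union(*exon_sets)
    constitutive = set.intersection(*exon_sets)
    cryptic = all_exons - constitutive
    return constitutive, cryptic


# Two transcripts of the same unit; the middle exon is skipped in the second.
example = [
    [(1, 100), (200, 300), (400, 500)],
    [(1, 100), (400, 500)],
]
constitutive, cryptic = classify_exons(example)
```

Matching exons by exact coordinates is a simplification; exons that differ only in one splice site would need a looser notion of identity.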
Abstract:
The geographically constrained distribution of Epstein-Barr virus (EBV)-associated nasopharyngeal carcinoma (NPC) in southeast Asian populations suggests that both viral and host genetics may influence disease risk. Although susceptibility loci have been mapped within the human genome, the role of viral genetics in the focal distribution of NPC remains an enigma. Here we report a molecular phylogenetic analysis of an NPC-associated viral oncogene, LMP1, in a large panel of EBV isolates from southeast Asia and from Papua New Guinea, Africa, and Australia, regions of the world where NPC is and is not endemic, respectively. This analysis revealed that LMP1 sequences show a distinct geographic structure, indicating that the southeast Asian isolates have evolved as a lineage distinct from those of Papua New Guinea, African, and Australian isolates. Furthermore, a likelihood ratio test revealed that the C termini of the LMP1 sequences of the southeast Asian lineage are under significant positive selection pressure, particularly at some sites within the C-terminal activator regions. We also present evidence that although the N terminus and transmembrane region of LMP1 have undergone recombination, the C-terminal region of the gene has evolved without any history of recombination. Based on these observations, we speculate that selection pressure may be driving the LMP1 sequences in virus isolates from southeast Asia towards a more malignant phenotype, thereby influencing the endemic distribution of NPC in this region.
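For readers unfamiliar with the likelihood ratio test mentioned here, the sketch below shows the general calculation for two nested models: twice the difference in log-likelihood is compared with a chi-square distribution whose degrees of freedom equal the number of extra free parameters. The abstract does not say which models were compared or how many degrees of freedom were used, so the function name and the example values are assumptions. To stay within the standard library, the p-value uses the closed form that holds only for even degrees of freedom.

```python
import math


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (statistic, p_value) for a nested-model likelihood ratio test.

    lnl_null, lnl_alt: maximised log-likelihoods of the null and
    alternative models. df must be a positive even integer because the
    chi-square tail probability below uses the even-df closed form.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch supports positive even df only")
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    half = stat / 2.0
    # P(X > stat) for X ~ chi-square(df), df even.
    p_value = math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )
    return stat, p_value


# Made-up log-likelihoods, purely to show the call.
stat, p_value = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-997.0, df=2)
```

In tests for positive selection the null model typically forbids sites with an excess of nonsynonymous change while the alternative allows them; a small p-value then favours the alternative.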
Abstract:
The EF-hand superfamily of calcium binding proteins includes the S100, calcium binding protein, and troponin subfamilies. This study represents a genome, structure, and expression analysis of the S100 protein family, in mouse, human, and rat. We confirm the high level of conservation between mammalian sequences but show that four members, including S100A12, are present only in the human genome. We describe three new members of the S100 family in the three species and their locations within the S100 genomic clusters and propose a revised nomenclature and phylogenetic relationship between members of the EF-hand superfamily. Two of the three new genes were induced in bone-marrow-derived macrophages activated with bacterial lipopolysaccharide, suggesting a role in inflammation. Normal human and murine tissue distribution profiles indicate that some members of the family are expressed in a specific manner, whereas others are more ubiquitous. Structure-function analysis of the chemotactic properties of murine S100A8 and human S100A12, particularly within the active hinge domain, suggests that the human protein is the functional homolog of the murine protein. Strong similarities between the promoter regions of human S100A12 and murine S100A8 support this possibility. This study provides insights into the possible processes of evolution of the EF-hand protein superfamily. Evolution of the S100 proteins appears to have occurred in a modular fashion, also seen in other protein families such as the C2H2-type zinc-finger family. (C) 2004 Elsevier Inc. All rights reserved.
Abstract:
Alternative splicing is widespread in mammalian gene expression, and variant splice patterns are often specific to different stages of development, particular tissues or a disease state. There is a need to systematically collect data on alternatively spliced exons, introns and splice isoforms, and to annotate this data. The Alternative Splicing Database consortium has been addressing this need, and is committed to maintaining and developing a value-added database of alternative splice events, and of experimentally verified regulatory mechanisms that mediate splice variants. In this paper we present two of the products from this project: namely, a database of computationally delineated alternative splice events as seen in alignments of EST/cDNA sequences with genome sequences, and a database of alternatively spliced exons collected from literature. The reported splice events are from nine different organisms and are annotated for various biological features including expression states and cross-species conservation. The data are presented on our ASD web pages (http://www.ebi.ac.uk/asd).
Abstract:
Do non-coding RNAs that are derived from the introns and exons of protein-coding and non-protein-coding genes represent a fundamental advance in the genetic operating system of higher organisms? Recent evidence from comparative genomics and molecular genetics indicates that this might be the case. If so, there will be profound consequences for our understanding of the genetics of these organisms, and in particular how the trajectories of differentiation and development and the differences among individuals and species are genomically programmed. But how might this hypothesis be tested?
Abstract:
Modern toxicology investigates a wide array of both old and new health hazards. Priority setting is needed to select agents for research from the plethora of exposure circumstances. The changing societies and a growing fraction of the aged have to be taken into consideration. A precise exposure assessment is of importance for risk estimation and regulation. Toxicology contributes to the exploration of pathomechanisms to specify the exposure metrics for risk estimation. Combined effects of co-existing agents are not yet sufficiently understood. Animal experiments allow a separate administration of agents which cannot be disentangled by epidemiological means, but their value is limited for low exposure levels in many of today's settings. As an experimental science, toxicology has to keep pace with the rapidly growing knowledge about the language of the genome and the changing paradigms in cancer development. During the pioneer era of assembling a working draft of the human genome, toxicogenomics has been developed. Gene and pathway complexity have to be considered when investigating gene-environment interactions. For a best conduct of studies, modern toxicology needs a close liaison with many other disciplines like epidemiology and bioinformatics. (C) 2004 Elsevier Ireland Ltd. All rights reserved.
Abstract:
Antisense transcription (transcription from the opposite strand to a protein-coding or sense strand) has been ascribed roles in gene regulation involving degradation of the corresponding sense transcripts (RNA interference), as well as gene silencing at the chromatin level. Global transcriptome analysis provides evidence that a large proportion of the genome can produce transcripts from both strands, and that antisense transcripts commonly link neighboring genes in complex loci into chains of linked transcriptional units. Expression profiling reveals frequent concordant regulation of sense/antisense pairs. We present experimental evidence that perturbation of an antisense RNA can alter the expression of sense messenger RNAs, suggesting that antisense transcription contributes to control of transcriptional outputs in mammals.
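The notion of a sense/antisense pair can be made concrete with a small sketch: two transcripts form a pair when they lie on the same chromosome, on opposite strands, and their genomic intervals overlap. The `Transcript` record, its field names and the half-open coordinate convention are assumptions for illustration, not taken from the study.

```python
from typing import List, NamedTuple, Tuple


class Transcript(NamedTuple):
    name: str
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+" or "-"


def sense_antisense_pairs(
    transcripts: List[Transcript],
) -> List[Tuple[str, str]]:
    """Return (plus-strand name, minus-strand name) for overlapping pairs."""
    plus = [t for t in transcripts if t.strand == "+"]
    minus = [t for t in transcripts if t.strand == "-"]
    pairs = []
    for p in plus:
        for m in minus:
            same_chrom = p.chrom == m.chrom
            overlaps = p.start < m.end and m.start < p.end
            if same_chrom and overlaps:
                pairs.append((p.name, m.name))
    return pairs


# Hypothetical transcripts: A and B overlap on opposite strands, C does not.
demo = [
    Transcript("A", "chr1", 100, 500, "+"),
    Transcript("B", "chr1", 400, 900, "-"),
    Transcript("C", "chr1", 1000, 1200, "-"),
]
pairs = sense_antisense_pairs(demo)
```

The all-against-all comparison is quadratic; for a whole transcriptome one would sort by position or use an interval index, and would usually compare exons instead of whole transcript spans.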
Abstract:
In recent years, there have been increasing numbers of transcripts identified that do not encode proteins, many of which are developmentally regulated and appear to have regulatory functions. Here, we describe the construction of a comprehensive mammalian noncoding RNA database (RNAdb) which contains over 800 unique experimentally studied noncoding RNAs (ncRNAs), including many associated with diseases and/or developmental processes. The database is available at http://research.imb.uq.edu.au/RNAdb and is searchable by many criteria. It includes microRNAs and snoRNAs, but not infrastructural RNAs, such as rRNAs and tRNAs, which are catalogued elsewhere. The database also includes over 1100 putative antisense ncRNAs and almost 20000 putative ncRNAs identified in high-quality murine and human cDNA libraries, with more to be added in the near future. Many of these RNAs are large, and many are spliced, some alternatively. The database will be useful as a foundation for the emerging field of RNomics and the characterization of the roles of ncRNAs in mammalian gene expression and regulation.
Abstract:
Hyaluronic acid (HA) is a commercially valuable medical biopolymer increasingly produced through microbial fermentation. Viscosity limits product yield and the focus of research and development has been on improving the key quality parameters, purity and molecular weight. Traditional strain and process optimisation has yielded significant improvements, but appears to have reached a limit. Metabolic engineering is providing new opportunities and HA produced in a heterologous host is about to enter the market. In order to realise the full potential of metabolic engineering, however, greater understanding of the mechanisms underlying chain termination is required.
Abstract:
The aim of the study was to perform a genetic linkage analysis for eye color using comparative data. Similarity in eye color of mono- and dizygotic twins was rated by the twins' mother, their father and/or the twins themselves. For 4748 twin pairs the similarity in eye color was available on a three-point scale (not at all alike, somewhat alike, completely alike); absolute eye color of individuals was not assessed. The probability that twins were alike for eye color was calculated as a weighted average of the different responses of all respondents at several different time points. The mean probability of being alike for eye color was 0.98 for MZ twins (2167 pairs), whereas the mean probability for DZ twins was 0.46 (2537 pairs), suggesting very high heritability for eye color. For 294 DZ twin pairs genome-wide marker data were available. The probability of being alike for eye color was regressed on the average amount of IBD sharing. We found a peak LOD-score of 2.9 at chromosome 15q, overlapping with the region recently implicated for absolute ratings of eye color in Australian twins [Zhu, G., Evans, D. M., Duffy, D. L., Montgomery, G. W., Medland, S. E., Gillespie, N. A., Ewen, K. R., Jewell, M., Liew, Y. W., Hayward, N. K., Sturm, R. A., Trent, J. M., and Martin, N. G. (2004). Twin Res. 7:197-210] and containing the OCA2 gene, which is the major candidate gene for eye color [Sturm, R. A., Teasdale, R. D., and Box, N. F. (2001). Gene 277:49-62]. Our results demonstrate that comparative measures on relatives can be used in genetic linkage analysis.