10 results for HUMAN TRANSCRIPTOME
in University of Queensland eSpace - Australia
Abstract:
Large numbers of noncoding RNA transcripts (ncRNAs) are being revealed by complementary DNA cloning and genome tiling array studies in animals. The big and as yet largely unanswered question is whether these transcripts are relevant. A paper by Willingham et al. shows the way forward by developing a strategy for large-scale functional screening of ncRNAs, involving small interfering RNA knockdowns in cell-based screens, which identified a previously unidentified ncRNA repressor of the transcription factor NFAT. It appears likely that ncRNAs constitute a critical hidden layer of gene regulation in complex organisms, the understanding of which requires new approaches in functional genomics.
Abstract:
The mammalian transcriptome contains many nonprotein-coding RNAs (ncRNAs), but most of these are of unclear significance and lack strong sequence conservation, prompting suggestions that they might be non-functional. However, certain long functional ncRNAs such as Air and Xist are also poorly conserved. In this article, we systematically analyzed the conservation of several groups of functional ncRNAs, including miRNAs, snoRNAs and longer ncRNAs whose function has been either documented or confidently predicted. As expected, miRNAs and snoRNAs were highly conserved. By contrast, the longer functional non-micro, non-sno ncRNAs were much less conserved with many displaying rapid sequence evolution. Our findings suggest that longer ncRNAs are under the influence of different evolutionary constraints and that the lack of conservation displayed by the thousands of candidate ncRNAs does not necessarily signify an absence of function.
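One simple way to put a number on "conservation" is percent identity between two aligned sequences. The sketch below is only an illustration of that idea: the abstract does not say which conservation measure the authors used, the function name `percent_identity` is hypothetical, and the two short sequences are toy placeholders, not real ncRNA data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Both inputs are assumed to be rows of the same pairwise alignment,
    with '-' marking gaps. This is an illustrative measure only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy placeholder alignment (not real data).
identity = percent_identity("ACGU-ACGU", "ACGUAACGA")
print(f"{identity:.1f}% identity")
```

A measure like this treats every position equally, so it would not capture constraints on secondary structure or on short functional motifs, which is one reason a low score need not imply lack of function.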
Abstract:
Increasing evidence suggests that the development and function of the nervous system are heavily dependent on RNA editing and the intricate spatiotemporal expression of a wide repertoire of non-coding RNAs, including microRNAs, small nucleolar RNAs and longer non-coding RNAs. Non-coding RNAs may provide the key to understanding the multi-tiered links between neural development, nervous system function, and neurological diseases.
Abstract:
The number of mammalian transcripts identified by full-length cDNA projects and genome sequencing projects is increasing remarkably. Clustering them into a strictly nonredundant and comprehensive set provides a platform for functional analysis of the transcriptome and proteome, but the quality of the clustering and predictive usefulness have previously required manual curation to identify truncated transcripts and inappropriate clustering of closely related sequences. A Representative Transcript and Protein Sets (RTPS) pipeline was previously designed to identify the nonredundant and comprehensive set of mouse transcripts based on clustering of a large mouse full-length cDNA set (FANTOM2). Here we propose an alternative method that is more robust, requires less manual curation, and is applicable to other organisms in addition to mouse. RTPSs of human, mouse, and rat have been produced by this method and used for validation. Their comprehensiveness and quality are discussed by comparison with other clustering approaches. The RTPSs are available at ftp://fantom2.gsc.riken.go.jp/RTPS/. (C) 2004 Elsevier Inc. All rights reserved.
Abstract:
The chromodomain is 40-50 amino acids in length and is conserved in a wide range of chromatic and regulatory proteins involved in chromatin remodeling. Chromodomain-containing proteins can be classified into families based on their broader characteristics, in particular the presence of other types of domains, and which correlate with different subclasses of the chromodomains themselves. Hidden Markov model (HMM)-generated profiles of different subclasses of chromodomains were used here to identify sequences encoding chromodomain-containing proteins in the mouse transcriptome and genome. A total of 36 different loci encoding proteins containing chromodomains, including 17 novel loci, were identified. Six of these loci (including three apparent pseudogenes, a novel HP1 ortholog, and two novel Msl-3 transcription factor-like proteins) are not present in the human genome, whereas the human genome contains four loci (two CDY orthologs and two apparent CDY pseudogenes) that are not present in mouse. A number of these loci exhibit alternative splicing to produce different isoforms, including 43 novel variants, some of which lack the chromodomain. The likely functions of these proteins are discussed in relation to the known functions of other chromodomain-containing proteins within the same family.
Abstract:
We report the construction of the mouse full-length cDNA encyclopedia, the most extensive view of a complex transcriptome, on the basis of preparing and sequencing 246 libraries. Before cloning, cDNAs were enriched in full-length by Cap-Trapper, and in most cases, aggressively subtracted/normalized. We have produced 1,442,236 successful 3'-end sequences clustered into 171,144 groups, from which 60,770 clones were fully sequenced cDNAs annotated in the FANTOM-2 annotation. We have also produced 547,149 5'-end reads, which clustered into 124,258 groups. Altogether, these cDNAs were further grouped in 70,000 transcriptional units (TU), which represent the best coverage of a transcriptome so far. By monitoring the extent of normalization/subtraction, we define the tentative equivalent coverage (TEC), which was estimated to be equivalent to >12,000,000 ESTs derived from standard libraries. High coverage explains discrepancies between the very large numbers of clusters (and TUs) of this project, which also include non-protein-coding RNAs, and the lower gene number estimation of genome annotations. Altogether, 5'-end clusters identify regions that are potential promoters for 8637 known genes and 5'-end clusters suggest the presence of almost 63,000 transcriptional starting points. An estimate of the frequency of polyadenylation signals suggests that at least half of the singletons in the EST set represent real mRNAs. Clones accounting for about half of the predicted TUs await further sequencing. The continued high-discovery rate suggests that the task of transcriptome discovery is not yet complete.
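The read and cluster counts quoted above allow a quick back-of-the-envelope check of average cluster depth. The following is a small arithmetic sketch using only the figures stated in the abstract; the variable and function names are illustrative, and the mean says nothing about how reads are actually distributed across clusters.

```python
# Counts taken directly from the abstract above.
READS_3P = 1_442_236      # successful 3'-end sequences
CLUSTERS_3P = 171_144     # 3'-end clusters
READS_5P = 547_149        # 5'-end reads
CLUSTERS_5P = 124_258     # 5'-end clusters


def mean_cluster_size(reads: int, clusters: int) -> float:
    """Average number of reads per cluster (a plain ratio)."""
    return reads / clusters


mean_3p = mean_cluster_size(READS_3P, CLUSTERS_3P)
mean_5p = mean_cluster_size(READS_5P, CLUSTERS_5P)
print(f"3'-end: {mean_3p:.2f} reads per cluster")
print(f"5'-end: {mean_5p:.2f} reads per cluster")
```

The ratios come out at roughly 8.4 reads per 3'-end cluster and 4.4 reads per 5'-end cluster. Because the libraries were normalized and subtracted, these averages are not comparable to depths from standard EST libraries, which is the gap the TEC estimate is meant to address.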
Abstract:
We analyzed the FANTOM2 clone set of 60,770 RIKEN full-length mouse cDNA sequences and 44,122 public mRNA sequences. We developed a new computational procedure to identify and classify the forms of splice variation evident in this data set and organized the results into a publicly accessible database that can be used for future expression array construction, structural genomics, and analyses of the mechanism and regulation of alternative splicing. Statistical analysis shows that at least 41% and possibly as much as 60% of multiexon genes in mouse have multiple splice forms. Of the transcription units with multiple splice forms, 49% contain transcripts in which the apparent use of an alternative transcription start (stop) is accompanied by alternative splicing of the initial (terminal) exon. This implies that alternative transcription may frequently induce alternative splicing. The fact that 73% of all exons with splice variation fall within the annotated coding region indicates that most splice variation is likely to affect the protein form. Finally, we compared the set of constitutive (present in all transcripts) exons with the set of cryptic (present only in some transcripts) exons and found statistically significant differences in their length distributions, the nucleotide distributions around their splice junctions, and the frequencies of occurrence of several short sequence motifs.
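The constitutive/cryptic distinction defined above (present in all transcripts versus present only in some) maps naturally onto set operations. The sketch below is a minimal illustration of that definition, not the authors' procedure: exons are assumed to be identified by exact (start, end) coordinates, the function name `classify_exons` is hypothetical, and the transcripts are toy placeholders.

```python
from typing import Dict, Set, Tuple

Exon = Tuple[int, int]  # (start, end) coordinates; exact match assumed


def classify_exons(
    transcripts: Dict[str, Set[Exon]],
) -> Tuple[Set[Exon], Set[Exon]]:
    """Split the exons of one transcription unit into two sets.

    constitutive: exons present in every transcript
    cryptic:      exons present in only some transcripts
    """
    exon_sets = list(transcripts.values())
    if not exon_sets:
        return set(), set()
    all_exons = set().union(*exon_sets)
    constitutive = set.intersection(*exon_sets)
    cryptic = all_exons - constitutive
    return constitutive, cryptic


# Toy transcription unit with three transcripts (placeholder coordinates).
example_unit = {
    "tx1": {(100, 200), (300, 400), (500, 600)},
    "tx2": {(100, 200), (500, 600)},
    "tx3": {(100, 200), (300, 400), (500, 600), (700, 800)},
}
constitutive, cryptic = classify_exons(example_unit)
print("constitutive:", sorted(constitutive))
print("cryptic:", sorted(cryptic))
```

Real data would need a rule for exons that overlap but differ at one boundary (alternative 5' or 3' splice sites), which exact-coordinate matching treats as unrelated exons.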
Abstract:
Antisense transcription (transcription from the opposite strand to a protein-coding or sense strand) has been ascribed roles in gene regulation involving degradation of the corresponding sense transcripts (RNA interference), as well as gene silencing at the chromatin level. Global transcriptome analysis provides evidence that a large proportion of the genome can produce transcripts from both strands, and that antisense transcripts commonly link neighboring genes in complex loci into chains of linked transcriptional units. Expression profiling reveals frequent concordant regulation of sense/antisense pairs. We present experimental evidence that perturbation of an antisense RNA can alter the expression of sense messenger RNAs, suggesting that antisense transcription contributes to control of transcriptional outputs in mammals.
Abstract:
Proteins secreted by and anchored on the surfaces of parasites are in intimate contact with host tissues. The transcriptome of infective cercariae of the blood fluke, Schistosoma mansoni, was screened using signal sequence trap to isolate cDNAs encoding predicted proteins with an N-terminal signal peptide. Twenty cDNA fragments were identified, most of which contained predicted signal peptides or transmembrane regions, including a novel putative seven-transmembrane receptor and a membrane-associated mitogen-activated protein kinase. The developmental expression pattern within different life-cycle stages ranged from ubiquitous to a transcript that was highly upregulated in the cercaria. A bioinformatics-based comparison of 100 signal peptides from each of schistosomes, humans, a parasitic nematode and Escherichia coli showed that differences in the sequence composition of signal peptides, notably the residues flanking the predicted cleavage site, might account for the negative bias exhibited in the processing of schistosome signal peptides in mammalian cells. (c) 2005 Federation of European Microbiological Societies. Published by Elsevier B.V. All rights reserved.