43 results for Full-length cDNA

in University of Queensland eSpace - Australia


Relevance:

100.00% 100.00%

Publisher:

Abstract:

We report the construction of the mouse full-length cDNA encyclopedia, the most extensive view of a complex transcriptome, on the basis of preparing and sequencing 246 libraries. Before cloning, cDNAs were enriched in full-length by Cap-Trapper, and in most cases, aggressively subtracted/normalized. We have produced 1,442,236 successful 3'-end sequences clustered into 171,144 groups, from which 60,770 clones were fully sequenced cDNAs annotated in the FANTOM-2 annotation. We have also produced 547,149 5'-end reads, which clustered into 124,258 groups. Altogether, these cDNAs were further grouped in 70,000 transcriptional units (TU), which represent the best coverage of a transcriptome so far. By monitoring the extent of normalization/subtraction, we define the tentative equivalent coverage (TEC), which was estimated to be equivalent to >12,000,000 ESTs derived from standard libraries. High coverage explains discrepancies between the very large numbers of clusters (and TUs) of this project, which also include non-protein-coding RNAs, and the lower gene number estimation of genome annotations. Altogether, 5'-end clusters identify regions that are potential promoters for 8637 known genes and 5'-end clusters suggest the presence of almost 63,000 transcriptional starting points. An estimate of the frequency of polyadenylation signals suggests that at least half of the singletons in the EST set represent real mRNAs. Clones accounting for about half of the predicted TUs await further sequencing. The continued high-discovery rate suggests that the task of transcriptome discovery is not yet complete.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

With the sequencing and annotation of genomes and transcriptomes of several eukaryotes, the importance of noncoding RNA (ncRNA), that is, RNA molecules that are not translated to protein products, has become more evident. A subclass of ncRNA transcripts are encoded by highly regulated, multi-exon, transcriptional units, are processed like typical protein-coding mRNAs and are increasingly implicated in regulation of many cellular functions in eukaryotes. This study describes the identification of candidate functional ncRNAs from among the RIKEN mouse full-length cDNA collection, which contains 60,770 sequences, by using a systematic computational filtering approach. We initially searched for previously reported ncRNAs and found nine murine ncRNAs and homologs of several previously described nonmouse ncRNAs. Through our computational approach to filter artifact-free clones that lack protein coding potential, we extracted 4280 transcripts as the largest-candidate set. Many clones in the set had EST hits, potential CpG islands surrounding the transcription start sites, and homologies with the human genome. This implies that many candidates are indeed transcribed in a regulated manner. Our results demonstrate that ncRNAs are a major functional subclass of processed transcripts in mammals.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Manual curation has long been held to be the gold standard for functional annotation of DNA sequence. Our experience with the annotation of more than 20,000 full-length cDNA sequences revealed problems with this approach, including inaccurate and inconsistent assignment of gene names, as well as many good assignments that were difficult to reproduce using only computational methods. For the FANTOM2 annotation of more than 60,000 cDNA clones, we developed a number of methods and tools to circumvent some of these problems, including an automated annotation pipeline that provides high-quality preliminary annotation for each sequence by introducing an uninformative filter that eliminates uninformative annotations, controlled vocabularies to accurately reflect both the functional assignments and the evidence supporting them, and a highly refined, Web-based manual annotation tool that allows users to view a wide array of sequence analyses and to assign gene names and putative functions using a consistent nomenclature. The ultimate utility of our approach is reflected in the low rate of reassignment of automated assignments by manual curation. Based on these results, we propose a new standard for large-scale annotation, in which the initial automated annotations are manually investigated and then computational methods are iteratively modified and improved based on the results of manual curation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The number of mammalian transcripts identified by full-length cDNA projects and genome sequencing projects is increasing remarkably. Clustering them into a strictly nonredundant and comprehensive set provides a platform for functional analysis of the transcriptome and proteome, but the quality of the clustering and predictive usefulness have previously required manual curation to identify truncated transcripts and inappropriate clustering of closely related sequences. A Representative Transcript and Protein Sets (RTPS) pipeline was previously designed to identify the nonredundant and comprehensive set of mouse transcripts based on clustering of a large mouse full-length cDNA set (FANTOM2). Here we propose an alternative method that is more robust, requires less manual curation, and is applicable to other organisms in addition to mouse. RTPSs of human, mouse, and rat have been produced by this method and used for validation. Their comprehensiveness and quality are discussed by comparison with other clustering approaches. The RTPSs are available at ftp://fantom2.gsc.riken.go.jp/RTPS/. (C) 2004 Elsevier Inc. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A full-length cDNA sequence coding for Echinococcus granulosus thioredoxin peroxidase (EgTPx) was isolated from a sheep strain protoscolex cDNA library by immunoscreening using a pool of sera from mice infected with oncospheres. EgTPx expressed as a fusion protein with glutathione S-transferase (GST) exhibited significant thiol-dependent peroxidase activity that protected plasmid DNA from damage by metal-catalyzed oxidation (MCO) in vitro. Furthermore, the suggested antioxidant role for EgTPx was reinforced in an in vivo assay, whereby its expression in BL21 bacterial cells markedly increased the tolerance and survival of the cells to high concentrations of H2O2 compared with controls. Immunolocalization studies revealed that EgTPx was specifically expressed in all tissues of the protoscolex and brood capsules. Higher intensity of labelling was detected in many, but not all, calcareous corpuscle cells in protoscoleces. The purified recombinant EgTPx protein was used to screen sera from heavily infected mice and patients with confirmed hydatid infection. Only a portion of the sera reacted positively with the EgTPx-GST fusion protein in Western blots, suggesting that EgTPx may form antibody-antigen complexes or that responses to the EgTPx antigen may be immunologically regulated. Recombinant EgTPx may prove useful for the screening of specific inhibitors that could serve as new drugs for treatment of hydatid disease. Moreover, given that TPx from different parasitic phyla were phylogenetically distant from host TPx molecules, the development of antiparasite TPx inhibitors that do not react with host TPx might be feasible. (C) 2003 Elsevier B.V. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The snake venom group C prothrombin activators contain a number of components that enhance the rate of prothrombin activation. The cloning and expression of full-length cDNA for one of these components, an activated factor X (factor Xa)-like protease from Pseudonaja textilis, as well as the generation of functional chimeric constructs with procoagulant activity, are described. The complete cDNA codes for a propeptide, light chain, activation peptide (AP) and heavy chain related in sequence to mammalian factor X. Efficient expression of the protease was achieved with constructs where the AP was deleted and the cleavage sites between the heavy and light chains modified, or where the AP was replaced with a peptide involved in insulin receptor processing. In human kidney cells (H293F) transfected with these constructs, up to 80% of the pro-form was processed to heavy and light chains. Binding of the protease to barium citrate and use of specific antibodies demonstrated that gamma-carboxylation of glutamic acid residues had occurred on the light chain in both cases, as observed in human factor Xa and the native P. textilis protease. The recombinant protease caused efficient coagulation of whole citrated blood and citrated plasma that was enhanced by the presence of Ca2+. This study identified the complete cDNA sequence of a factor Xa-like protease from P. textilis and demonstrated for the first time the expression of a recombinant form of P. textilis protease capable of blood coagulation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

All single-stranded 'positive-sense' RNA viruses that infect mammalian, insect or plant cells rearrange internal cellular membranes to provide an environment facilitating virus replication. A striking feature of these unique membrane structures is the induction of 70-100 nm vesicles (either free within the cytoplasm, associated with other induced vesicles or bound within a surrounding membrane) harbouring the viral replication complex (RC). Although similar in appearance, the cellular composition of these vesicles appears to vary for different viruses, implying different organelle origins for the intracellular sites of viral RNA replication. Genetic analysis has revealed that induction of these membrane structures can be attributed to a particular viral gene product, usually a non-structural protein. This review will highlight our current knowledge of the formation and composition of virus RCs and describe some of the similarities and differences in RNA-membrane interactions observed between the virus families Flaviviridae and Picornaviridae.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Recent large-scale analyses of mainly full-length cDNA libraries generated from a variety of mouse tissues indicated that almost half of all representative cloned sequences did not contain an apparent protein-coding sequence, and were putatively derived from non-protein-coding RNA (ncRNA) genes. However, many of these clones were singletons and the majority were unspliced, raising the possibility that they may be derived from genomic DNA or unprocessed pre-mRNA contamination during library construction, or alternatively represent nonspecific transcriptional noise. Here we show, using reverse transcriptase-dependent PCR, microarray, and Northern blot analyses, that many of these clones were derived from genuine transcripts of unknown function whose expression appears to be regulated. The ncRNA transcripts have larger exons and fewer introns than protein-coding transcripts. Analysis of the genomic landscape around these sequences indicates that some cDNA clones were produced not from terminal poly(A) tracts but internal priming sites within longer transcripts, only a minority of which is encompassed by known genes. A significant proportion of these transcripts exhibit tissue-specific expression patterns, as well as dynamic changes in their expression in macrophages following lipopolysaccharide stimulation. Taken together, the data provide strong support for the conclusion that ncRNAs are an important, regulated component of the mammalian transcriptome.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Our previous studies using trans-complementation analysis of Kunjin virus (KUN) full-length cDNA clones harboring in-frame deletions in the NS3 gene demonstrated the inability of these defective complemented RNAs to be packaged into virus particles (W. J. Liu, P. L. Sedlak, N. Kondratieva, and A. A. Khromykh, J. Virol. 76:10766-10775). In this study we aimed to establish whether this requirement for NS3 in RNA packaging is determined by the secondary RNA structure of the NS3 gene or by the essential role of the translated NS3 gene product. Multiple silent mutations of three computer-predicted stable RNA structures in the NS3 coding region of KUN replicon RNA aimed at disrupting RNA secondary structure without affecting amino acid sequence did not affect RNA replication and packaging into virus-like particles in the packaging cell line, thus demonstrating that the predicted conserved RNA structures in the NS3 gene do not play a role in RNA replication and/or packaging. In contrast, double frameshift mutations in the NS3 coding region of full-length KUN RNA, producing scrambled NS3 protein but retaining secondary RNA structure, resulted in the loss of ability of these defective RNAs to be packaged into virus particles in complementation experiments in KUN replicon-expressing cells. Furthermore, the more robust complementation-packaging system based on established stable cell lines producing large amounts of complemented replicating NS3-deficient replicon RNAs and infection with KUN virus to provide structural proteins also failed to detect any secreted virus-like particles containing packaged NS3-deficient replicon RNAs. These results have now firmly established the requirement of KUN NS3 protein translated in cis for genome packaging into virus particles.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

We analyzed the FANTOM2 clone set of 60,770 RIKEN full-length mouse cDNA sequences and 44,122 public mRNA sequences. We developed a new computational procedure to identify and classify the forms of splice variation evident in this data set and organized the results into a publicly accessible database that can be used for future expression array construction, structural genomics, and analyses of the mechanism and regulation of alternative splicing. Statistical analysis shows that at least 41% and possibly as much as 60% of multiexon genes in mouse have multiple splice forms. Of the transcription units with multiple splice forms, 49% contain transcripts in which the apparent use of an alternative transcription start (stop) is accompanied by alternative splicing of the initial (terminal) exon. This implies that alternative transcription may frequently induce alternative splicing. The fact that 73% of all exons with splice variation fall within the annotated coding region indicates that most splice variation is likely to affect the protein form. Finally, we compared the set of constitutive (present in all transcripts) exons with the set of cryptic (present only in some transcripts) exons and found statistically significant differences in their length distributions, the nucleotide distributions around their splice junctions, and the frequencies of occurrence of several short sequence motifs.
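
The constitutive/cryptic distinction in the abstract above (an exon present in all transcripts versus only in some) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function name `classify_exons`, the representation of exons as `(start, end)` tuples, and the example coordinates are all assumptions made for illustration, and real data would need tolerant matching of exon boundaries.

```python
from collections import Counter


def classify_exons(transcripts):
    """Split the exons of one transcription unit into two sets.

    transcripts: list of transcripts, each a list of (start, end) exon tuples.
    Returns (constitutive, cryptic): exons seen in every transcript, and
    exons seen in only some of them. Hypothetical helper, exact-match only.
    """
    # Count in how many transcripts each exon occurs (once per transcript).
    counts = Counter(exon for t in transcripts for exon in set(t))
    n = len(transcripts)
    constitutive = {e for e, c in counts.items() if c == n}
    cryptic = {e for e, c in counts.items() if c < n}
    return constitutive, cryptic


# Made-up transcription unit with two splice forms; the middle exon is skipped
# in the second transcript.
tu = [
    [(100, 200), (300, 400), (500, 600)],
    [(100, 200), (500, 600)],
]
constitutive, cryptic = classify_exons(tu)
```

Comparing properties such as length between the two returned sets would then be a separate statistical step.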

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Ataxia-oculomotor apraxia (AOA1) is a neurological disorder with symptoms that overlap those of ataxia-telangiectasia, a syndrome characterized by abnormal responses to double-strand DNA breaks and genome instability. The gene mutated in AOA1, APTX, is predicted to code for a protein called aprataxin that contains domains of homology with proteins involved in DNA damage signalling and repair. We demonstrate that aprataxin is a nuclear protein, present in both the nucleoplasm and the nucleolus. Mutations in the APTX gene destabilize the aprataxin protein, and fusion constructs of enhanced green fluorescent protein and aprataxin, representing deletions of putative functional domains, generate highly unstable products. Cells from AOA1 patients are characterized by enhanced sensitivity to agents that cause single-strand breaks in DNA but there is no evidence for a gross defect in single-strand break repair. Sensitivity to hydrogen peroxide and the resulting genome instability are corrected by transfection with full-length aprataxin cDNA. We also demonstrate that aprataxin interacts with the repair proteins XRCC1, PARP-1 and p53 and that it co-localizes with XRCC1 along charged particle tracks on chromatin. These results demonstrate that aprataxin influences the cellular response to genotoxic stress very likely by its capacity to interact with a number of proteins involved in DNA repair.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

As part of a comparative mapping study between sugarcane and sorghum, a sugarcane cDNA clone with homology to the maize Rp1-D rust resistance gene was mapped in sorghum. The cDNA probe hybridised to multiple loci, including one on sorghum linkage group (LG) E in a region where a major rust resistance QTL had been previously mapped. Partial sorghum Rp1-D homologues were isolated from genomic DNA of rust-resistant and -susceptible progeny selected from a sorghum mapping population. Sequencing of the Rp1-D homologues revealed five discrete sequence classes: three from resistant progeny and two from susceptible progeny. PCR primers specific to each sequence class were used to amplify products from the progeny and confirmed that the five sequence classes mapped to the same locus on LG E. Cluster analysis of these sorghum sequences and available sugarcane, maize and sorghum Rp1-D homologue sequences showed that the maize Rp1-D sequence and the partial sugarcane Rp1-D homologue were clustered with one of the sorghum resistant progeny sequence classes, while previously published sorghum Rp1-D homologue sequences clustered with the susceptible progeny sequence classes. Full-length sequence information was obtained for one member of a resistant progeny sequence class (Rp1-SO) and compared with the maize Rp1-D sequence and a previously identified sorghum Rp1 homologue (Rph1-2). There was considerable similarity between the two sorghum sequences and less similarity between the sorghum and maize sequences. These results suggest a conservation of function and gene sequence homology at the Rp1 loci of maize and sorghum and provide a basis for convenient PCR-based screening tools for putative rust resistance alleles in sorghum.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Liver fatty acid binding protein (L-FABP) contains amino acids that are known to possess antioxidant function. In this study, we tested the hypothesis that L-FABP may serve as an effective endogenous cytoprotectant against oxidative stress. Chang liver cells were selected as the experimental model because of their undetectable L-FABP mRNA level. Full-length L-FABP cDNA was subcloned into the mammalian expression vector pcDNA3.1 (pcDNA-FABP). Chang cells were stably transfected with pcDNA-FABP or vector (pcDNA3.1) alone. Oxidative stress was induced by incubating cells with 400 μmol/L H2O2 or by subjecting cells to hypoxia/reoxygenation. Total cellular reactive oxygen species (ROS) was determined using the fluorescent probe DCF. Cellular damage induced by hypoxia/reoxygenation was assayed by lactate dehydrogenase (LDH) release. Expression of L-FABP was documented by regular reverse transcription polymerase chain reaction (RT-PCR), real-time RT-PCR, and Western blot. The pcDNA-FABP-transfected cells expressed full-length L-FABP mRNA, which was absent from vector-transfected control cells. Western blot showed expression of 14-kd L-FABP protein in pcDNA-FABP-transfected cells, but not in vector-transfected cells. Transfected cells showed decreased DCF fluorescence intensity under oxidative stress (H2O2 and hypoxia/reoxygenation) conditions versus control in inverse proportion to the level of L-FABP expression. Lower LDH release was observed in the higher L-FABP-expressed cells in hypoxia/reoxygenation experiments. In conclusion, we successfully transfected and cloned a Chang liver cell line that expressed the L-FABP gene. The L-FABP-expressing cell line had a reduced intracellular ROS level versus control. This finding implies that L-FABP has a significant role in oxidative stress.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Non-protein-coding RNAs (ncRNAs) are increasingly being recognized as having important regulatory roles. Although much recent attention has focused on tiny 22- to 25-nucleotide microRNAs, several functional ncRNAs are orders of magnitude larger in size. Examples of such macro ncRNAs include Xist and Air, which in mouse are 18 and 108 kilobases (Kb), respectively. We surveyed the 102,801 FANTOM3 mouse cDNA clones and found that Air and Xist were present not as single, full-length transcripts but as a cluster of multiple, shorter cDNAs, which were unspliced, had little coding potential, and were most likely primed from internal adenine-rich regions within longer parental transcripts. We therefore conducted a genome-wide search for regional clusters of such cDNAs to find novel macro ncRNA candidates. Sixty-six regions were identified, each of which mapped outside known protein-coding loci and which had a mean length of 92 Kb. We detected several known long ncRNAs within these regions, supporting the basic rationale of our approach. In silico analysis showed that many regions had evidence of imprinting and/or antisense transcription. These regions were significantly associated with microRNAs and transcripts from the central nervous system. We selected eight novel regions for experimental validation by northern blot and RT-PCR and found that the majority represent previously unrecognized noncoding transcripts that are at least 10 Kb in size and predominantly localized in the nucleus. Taken together, the data not only identify multiple new ncRNAs but also suggest the existence of many more macro ncRNAs like Xist and Air.
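
The idea of searching for regional clusters of short cDNAs, as described in the abstract above, can be sketched as a simple interval-grouping pass. This is only an illustration under stated assumptions, not the method used in the study: the function name `cluster_regions`, the `max_gap` and `min_members` thresholds, and the example coordinates are hypothetical, and the abstract does not give the actual clustering criteria.

```python
def cluster_regions(cdnas, max_gap=10_000, min_members=2):
    """Group cDNA alignments into regional clusters.

    cdnas: iterable of (chrom, start, end) tuples.
    Neighbouring cDNAs on the same chromosome are merged into one region when
    the gap between them is at most max_gap bases (assumed threshold).
    Regions with fewer than min_members cDNAs are discarded.
    """
    regions = []
    # Sorting by (chrom, start, end) lets a single pass find neighbours.
    for chrom, start, end in sorted(cdnas):
        last = regions[-1] if regions else None
        if last and last["chrom"] == chrom and start - last["end"] <= max_gap:
            last["end"] = max(last["end"], end)
            last["members"] += 1
        else:
            regions.append(
                {"chrom": chrom, "start": start, "end": end, "members": 1}
            )
    return [r for r in regions if r["members"] >= min_members]


# Made-up coordinates, for illustration only.
example = [
    ("chr7", 1_000, 3_000),
    ("chr7", 5_000, 9_000),
    ("chr7", 500_000, 501_000),
    ("chrX", 100, 900),
    ("chrX", 2_000, 2_500),
]
regions = cluster_regions(example)
```

A real search would also need the filters the abstract mentions, such as keeping only unspliced cDNAs with little coding potential and excluding known protein-coding loci.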

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Craniofacial anomalies are a common feature of human congenital dysmorphology syndromes, suggesting that genes expressed in the developing face are likely to play a wider role in embryonic development. To facilitate the identification of genes involved in embryogenesis, we previously constructed an enriched cDNA library by subtracting adult mouse liver cDNA from that of embryonic day (E)10.5 mouse pharyngeal arch cDNA. From this library, 273 unique clones were sequenced and known proteins binned into functional categories in order to assess enrichment of the library (1). We have now selected 31 novel and poorly characterised genes from this library and present bioinformatic analysis to predict proteins encoded by these genes, and to detect evolutionary conservation. Of these genes 61% (19/31) showed restricted expression in the developing embryo, and a subset of these was chosen for further in silico characterisation as well as experimental determination of subcellular localisation based on transient transfection of predicted full-length coding sequences into mammalian cell lines. Where a human orthologue of these genes was detected, chromosomal localisation was determined relative to known loci for human congenital disease.