935 results for Full-length cDNAs
Abstract:
Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 'transcriptional units', contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense-antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.
Abstract:
The number of sequences generated by genome projects has increased exponentially, but gene characterization has not kept pace. Sequencing and analysis of full-length cDNAs is an important step in gene characterization that is now used by several research groups. In this work, we selected Schistosoma mansoni clones for full-length sequencing, using an algorithm that investigates the presence of the initial methionine in the parasite sequence based on the positions at which the alignment between two sequences starts. The BLAST searches that produced these alignments compared parasite expressed sequence tags produced by the Minas Gerais Genome Network against sequences from the database Eukaryotic Cluster of Orthologous Groups (KOG). This procedure allowed the selection of clones representing 398 proteins that have not been deposited as complete S. mansoni CDS in any public database. Dedicated sequencing of 96 of these clones, with reads from both the 5' and 3' ends, was performed. These reads were assembled using PHRAP, resulting in 33 full-length sequences that represent novel S. mansoni proteins. These results should contribute to a more complete view of the biology of this important parasite.
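The abstract above does not spell out the selection algorithm, so the following is only a minimal sketch of one way such a check could work, assuming a translated (blastx-style) alignment of an EST against a KOG protein on the plus strand. The `Hit` record, its field names, and the function names are hypothetical. The idea: if the alignment starts at residue `s_start` of the protein, the initiation codon should lie `3 * (s_start - 1)` nucleotides upstream of the alignment start in the EST, and the clone is a candidate only if that position still falls inside the read.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical summary of one EST-versus-protein alignment."""
    clone_id: str
    q_start: int  # 1-based nucleotide position in the EST where the alignment begins
    s_start: int  # 1-based residue position in the protein where the alignment begins


def predicted_start_codon(hit: Hit) -> int:
    """Extrapolate upstream to where residue 1 of the protein would map in the EST."""
    return hit.q_start - 3 * (hit.s_start - 1)


def likely_covers_initial_met(hit: Hit) -> bool:
    """True if the extrapolated start codon falls inside the EST (position >= 1)."""
    return predicted_start_codon(hit) >= 1


def select_clones(hits):
    """Return the clone ids whose ESTs plausibly contain the initial methionine."""
    return [h.clone_id for h in hits if likely_covers_initial_met(h)]


if __name__ == "__main__":
    example_hits = [
        Hit("clone_A", q_start=31, s_start=6),   # start codon predicted at nt 16
        Hit("clone_B", q_start=4, s_start=50),   # start codon would lie before the read
    ]
    print(select_clones(example_hits))
```

This sketch ignores gaps, frameshifts, and minus-strand hits, all of which a real pipeline would need to handle; a stricter version could also confirm that an ATG codon is present near the predicted position.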
Abstract:
The RIKEN Mouse Gene Encyclopaedia Project, a systematic approach to determining the full coding potential of the mouse genome, involves collection and sequencing of full-length complementary DNAs and physical mapping of the corresponding genes to the mouse genome. We organized an international functional annotation meeting (FANTOM) to annotate the first 21,076 cDNAs to be analysed in this project. Here we describe the first RIKEN clone collection, which is one of the largest described for any organism. Analysis of these cDNAs extends known gene families and identifies new ones.
Abstract:
A human interleukin 4 (hIL-4)-encoding cDNA (hIL4) probe was used to screen a bovine genomic library, and three clones containing sequences with homology to the human and mouse IL4 cDNAs were isolated. Sequence information obtained from one of these genomic clones was used to design an oligodeoxyribonucleotide primer corresponding to the transcription start point region for use in the polymerase chain reaction (PCR). The PCR-RACE protocol, designed for the rapid amplification of cDNA ends, was successfully used to generate a full-length bovine IL4 (bIL4) cDNA clone from polyadenylated RNA isolated from concanavalin A-stimulated bovine lymph node cells. The bIL4 cDNA is 570 bp in length and contains an open reading frame of 405 nucleotides (nt), coding for a 15.1-kDa precursor of 135 amino acids (aa), which should be reduced to 12.6 kDa for unglycosylated bIL4 after cleavage of a putative hydrophobic leader sequence of 24 aa. The aa sequence contains one possible Asn-linked glycosylation site. Bovine IL4 is shorter than mouse (mIL4) and hIL4, because of a 51-nt deletion in the coding region. Comparison of the overall nt and deduced aa sequences shows a greater homology of bIL4 with hIL4 than with mIL4. This homology is not evenly distributed, however, with the nt sequences 5' and 3' of the coding region showing a much greater homology between all three species than the coding sequence.
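The length figures quoted in this abstract can be cross-checked with simple codon arithmetic. The short sketch below is illustrative only; the variable and function names are hypothetical, and it assumes the 405-nt open reading frame is counted without the stop codon, which is what the stated 135 aa implies.

```python
ORF_NT = 405        # open reading frame length reported in the abstract
LEADER_AA = 24      # putative hydrophobic leader sequence
DELETION_NT = 51    # deletion in the coding region relative to mouse and human IL4


def codons(nucleotides: int) -> int:
    """Convert a coding length in nucleotides to a number of codons."""
    if nucleotides % 3 != 0:
        raise ValueError("coding length is not a multiple of three")
    return nucleotides // 3


precursor_aa = codons(ORF_NT)            # residues in the precursor
mature_aa = precursor_aa - LEADER_AA     # residues left after leader cleavage
deleted_aa = codons(DELETION_NT)         # residues removed by the in-frame deletion

print(precursor_aa, mature_aa, deleted_aa)
```

Because 51 is a multiple of three, the deletion removes whole codons and leaves the reading frame intact.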
Abstract:
Transglutaminases are a family of enzymes that catalyze the covalent cross-linking of proteins through the formation of ε-(γ-glutaminyl)-lysyl isopeptide bonds. Tissue transglutaminase (Tgase) is an intracellular enzyme which is expressed in terminally differentiated and senescent cells and also in cells undergoing apoptotic cell death. To characterize this enzyme and examine its relationship with other members of the transglutaminase family, cDNAs, the first two exons of the gene and 2 kb of the 5' flanking region, including the promoter, were isolated. The full-length Tgase transcript consists of 66 bp of 5'-UTR (untranslated) sequence, an open reading frame which encodes 686 amino acids and 1400 bp of 3'-UTR sequence. Alignment of the deduced Tgase protein sequence with that of other transglutaminases revealed regions of strong homology, particularly in the active site region. The Tgase cDNA was used to isolate and characterize a genomic clone encompassing the 5' end of the mouse Tgase gene. The transcription start site was defined using genomic and cDNA clones coupled with S1 protection analysis and anchored PCR. This clone includes 2.3 kb upstream of the transcription start site and two exons that contain the first 256 nucleotides of the mouse Tgase cDNA sequence. The exon-intron boundaries have been mapped and compared with the exon-intron boundaries of three members of the transglutaminase family: human factor XIIIa, the human keratinocyte transglutaminase and human erythrocyte band 4.1. Tissue Tgase exon II is similar to comparable exons of these genes. However, exon I bears no resemblance to any of the other transglutaminase amino-terminus exons. Previous work in our laboratory has shown that the transcription of the Tgase gene is directly controlled by retinoic acid and retinoic acid receptors. To identify the region of the Tgase gene responsible for regulating its expression, fragments of the Tgase promoter and 5'-flanking region were cloned into chloramphenicol acetyltransferase (CAT) reporter constructs. Transient transfection experiments with these constructs demonstrated that the upstream region of Tgase is a functional promoter which contains a retinoid response element within a 1573-nucleotide region spanning nucleotides -252 to -1825.
Abstract:
We have systematically characterized gene expression patterns in 49 adult and embryonic mouse tissues by using cDNA microarrays with 18,816 mouse cDNAs. Cluster analysis defined sets of genes that were expressed ubiquitously or in similar groups of tissues such as digestive organs and muscle. Clustering of expression profiles was observed in embryonic brain, postnatal cerebellum, and adult olfactory bulb, reflecting similarities in neurogenesis and remodeling. Finally, clustering genes coding for known enzymes into 78 metabolic pathways revealed a surprising coordination of expression within each pathway among different tissues. On the other hand, a more detailed examination of glycolysis revealed tissue-specific differences in profiles of key regulatory enzymes. Thus, by surveying global gene expression by using microarrays with a large number of elements, we provide insights into the commonality and diversity of pathways responsible for the development and maintenance of the mammalian body plan.
Abstract:
The genetic study of RNA viruses is greatly facilitated by the availability of infectious cDNA clones. However, their construction has often been difficult. While exploring ways to simplify the construction of infectious clones, we have successfully modified and applied the newly described technique of "long PCR" to the synthesis of a full-length DNA amplicon from the RNA of a cytopathogenic mutant (HM 175/24a) of the hepatitis A virus (HAV). Primers were synthesized to match the two extremities of the HAV genome. The antisense primer, homologous to the 3' end, was used in both the reverse transcription (RT) and the PCR steps. With these primers we reproducibly obtained a full-length amplicon of approximately 7.5 kb. Further, since we engineered a T7 promoter in the sense primer, RNA could be transcribed directly from the amplicon with T7 RNA polymerase. Following transfection of cultured fetal rhesus kidney cells with the transcription mixture containing both the HAV cDNA and the transcribed RNA, replicating HAV was detected by immunofluorescence microscopy and, following passage to other cell cultures, by focus formation. The recovered virus displayed the cytopathic effect and large plaque phenotype typical of the original virus; this result highlights the fidelity of the modified long reverse transcription-PCR procedure and demonstrates the potential of this method for providing cDNAs of viral genomes and simplifying the construction of infectious clones.
Abstract:
We report the construction of the mouse full-length cDNA encyclopedia, the most extensive view of a complex transcriptome, on the basis of preparing and sequencing 246 libraries. Before cloning, cDNAs were enriched in full-length by Cap-Trapper, and in most cases, aggressively subtracted/normalized. We have produced 1,442,236 successful 3'-end sequences clustered into 171,144 groups, from which 60,770 clones were fully sequenced cDNAs annotated in the FANTOM-2 annotation. We have also produced 547,149 5'-end reads, which clustered into 124,258 groups. Altogether, these cDNAs were further grouped in 70,000 transcriptional units (TU), which represent the best coverage of a transcriptome so far. By monitoring the extent of normalization/subtraction, we define the tentative equivalent coverage (TEC), which was estimated to be equivalent to >12,000,000 ESTs derived from standard libraries. High coverage explains discrepancies between the very large numbers of clusters (and TUs) of this project, which also include non-protein-coding RNAs, and the lower gene number estimation of genome annotations. Altogether, 5'-end clusters identify regions that are potential promoters for 8637 known genes and 5'-end clusters suggest the presence of almost 63,000 transcriptional starting points. An estimate of the frequency of polyadenylation signals suggests that at least half of the singletons in the EST set represent real mRNAs. Clones accounting for about half of the predicted TUs await further sequencing. The continued high discovery rate suggests that the task of transcriptome discovery is not yet complete.
Abstract:
Background: Understanding the genetic diversity of human immunodeficiency virus type 1 (HIV-1) is critical for laying the groundwork for the design of successful drugs or vaccines. In this study we aimed to characterize and define the molecular prevalence of HIV-1 subclade F1 currently circulating in São Paulo, Brazil. Methods: A total of 36 samples were selected from 888 adult patients residing in São Paulo who had previously been diagnosed in two independent studies in our laboratory as being infected with subclade F1 based on pol subgenomic fragment sequencing. Proviral DNA was amplified from the purified genomic DNA of all 36 blood samples by PCR of five overlapping fragments, followed by direct sequencing. Sequence data were obtained from the five fragments of pure subclade F1, and phylogenetic trees were constructed and compared with previously published sequences. Subclade F1 sequences that exhibited mosaic structure with other subtypes were omitted from further analysis. Results: Fragment amplification and sequencing confirmed that only 5 of the sequences inferred from the pol region to be subclade F1 remained subclade F1 across the genome as a whole, giving an estimated true prevalence of 0.56%. The results also showed a single phylogenetic cluster of the Brazilian subclade F1 along with non-Brazilian South American isolates in both the subgenomic and the full-length genome analyses, with an overall intrasubtype nucleotide divergence of 6.9%. The nucleotide differences within the South American and Central African F1 strains, in the C2-C3 env region, were 8.5% and 12.3%, respectively. Conclusion: Altogether, our findings show a surprisingly low prevalence of subclade F1 in Brazil and suggest that these isolates originated in Central Africa and were subsequently introduced to South America.
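The 0.56% prevalence figure follows directly from the counts given in the abstract (5 confirmed full-genome F1 sequences among 888 patients). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def prevalence_percent(confirmed: int, screened: int) -> float:
    """Prevalence as a percentage of the screened population."""
    if screened <= 0:
        raise ValueError("screened population must be positive")
    return 100.0 * confirmed / screened


# Counts taken from the abstract: 5 pure F1 genomes among 888 patients.
print(f"{prevalence_percent(5, 888):.2f}%")  # prints 0.56%
```

By the same calculation, the 36 samples initially classified as F1 from the pol fragment alone would correspond to roughly 4% of the cohort, which illustrates how much the full-genome analysis lowered the estimate.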
Abstract:
To identify novel cytokine-related genes, we searched the set of 60,770 annotated RIKEN mouse cDNA clones (FANTOM2 clones), using keywords such as cytokine itself or cytokine names (such as interferon, interleukin, epidermal growth factor, fibroblast growth factor, and transforming growth factor). This search produced 108 known cytokines and cytokine-related products such as cytokine receptors, cytokine-associated genes, or their products (enhancers, accessory proteins, cytokine-induced genes). We found 15 clusters of FANTOM2 clones that are candidates for novel cytokine-related genes. These encoded products with strong sequence similarity to guanylate-binding protein (GBP-5), interleukin-1 receptor-associated kinase 2 (IRAK-2), interleukin 20 receptor alpha isoform 3, a member of the interferon-inducible proteins of the Ifi 200 cluster, four members of the membrane-associated family 1-8 of interferon-inducible proteins, one p27-like protein, and a hypothetical protein containing a Toll/Interleukin receptor domain. All four clones representing novel candidates of gene products from the family contain a novel highly conserved cross-species domain. Clones similar to growth factor-related products included transforming growth factor beta-inducible early growth response protein 2 (TIEG-2), TGFbeta-induced factor 2, integrin beta-like 1, latent TGF-binding protein 4S, and FGF receptor 4B. We performed a detailed sequence analysis of the candidate novel genes to elucidate their likely functional properties.
Abstract:
The majority of common diseases such as cancer, allergy, diabetes, or heart disease are characterized by complex genetic traits, in which genetic and environmental components contribute to disease susceptibility. Our knowledge of the genetic factors underlying most of such diseases is limited. A major goal in the post-genomic era is to identify and characterize disease susceptibility genes and to use this knowledge for disease treatment and prevention. More than 500 genes are conserved across the invertebrate and vertebrate genomes. Because of gene conservation, various organisms including yeast, fruitfly, zebrafish, rat, and mouse have been used as genetic models.
Abstract:
In order to study whether flavivirus RNA packaging is dependent on RNA replication, we generated two DNA-based Kunjin virus constructs, pKUN1 and pKUN1dGDD, allowing continuous production of replicating (wild-type) and nonreplicating (with a deletion of the NS5 gene RNA-polymerase motif GDD) full-length Kunjin virus RNAs, respectively, via nuclear transcription by cellular RNA polymerase II. As expected, transfection of pKUN1 plasmid DNA into BHK cells resulted in the recovery of secreted infectious Kunjin virions. Transfection of pKUN1dGDD DNA into BHK cells, however, did not result in the recovery of any secreted virus particles containing encapsidated dGDD RNA, despite an apparent accumulation of this RNA in cells demonstrated by Northern blot analysis and its efficient translation demonstrated by detection of correctly processed labeled structural proteins (at least prM and E) both in cells and in the culture fluid using coimmunoprecipitation analysis with anti-E antibodies. In contrast, when dGDD RNA was produced even in much smaller amounts in pKUN1dGDD DNA-transfected repBHK cells (where it was replicated via complementation), it was packaged into secreted virus particles. Thus, packaging of defective Kunjin virus RNA could occur only when it was replicated. Our results with genome-length Kunjin virus RNA and the results with poliovirus replicon RNA (C. I. Nugent et al., J. Virol. 73:427-435, 1999), both demonstrating the necessity for the RNA to be replicated before it can be packaged, strongly suggest the existence of a common mechanism for minimizing amplification and transmission of defective RNAs among the quasispecies in positive-strand RNA viruses. This mechanism may thus help alleviate the high-copy error rate of RNA-dependent RNA polymerases.
Abstract:
A plasmid DNA directing transcription of the infectious full-length RNA genome of Kunjin (KUN) virus in vivo from a mammalian expression promoter was used to vaccinate mice intramuscularly. The KUN viral cDNA encoded in the plasmid contained the mutation in the NS1 protein (Pro-250 to Leu) previously shown to attenuate KUN virus in weanling mice. KUN virus was isolated from the blood of immunized mice 3-4 days after DNA inoculation, demonstrating that infectious RNA was being transcribed in vivo; however, no symptoms of virus-induced disease were observed. By 19 days postimmunization, neutralizing antibody was detected in the serum of immunized animals. On challenge with lethal doses of the virulent New York strain of West Nile (WN) or wild-type KUN virus intracerebrally or intraperitoneally, mice immunized with as little as 0.1-1 μg of KUN plasmid DNA were solidly protected against disease. This finding correlated with neutralization data in vitro showing that serum from KUN DNA-immunized mice neutralized KUN and WN viruses with similar efficiencies. The results demonstrate that delivery of an attenuated but replicating KUN virus via a plasmid DNA vector may provide an effective vaccination strategy against virulent strains of WN virus.
Abstract:
We describe a streamlined reverse transcription-polymerase chain reaction methodology for constructing full-length cDNA libraries of trypanosomatids on the basis of conserved sequences located at the 5' and 3' ends of trans-spliced mRNAs. The amplified cDNA corresponded to full-length messengers and was amenable to in vitro expression. Fractionated libraries could be rapidly constructed in a plasmid vector by the TA cloning method (Invitrogen). We believe this approach is useful when there are concerns over the use of restriction enzymes and phage technology, as well as in cases where expression of proteins in their native conformation is desired.
Abstract:
Cytochrome p450s (cyp450s) are a family of structurally related proteins with diverse functions, including steroid synthesis and breakdown of toxins. This paper reports the full-length sequence of a novel cyp450 gene, the first to be isolated from the tropical freshwater snail Biomphalaria glabrata, an important intermediate host of Schistosoma mansoni. The nucleotide sequence is 2291 bp with a predicted amino acid sequence of 584 aa. The sequence demonstrates conserved cyp450 structural motifs, but is sufficiently different from previously reported cyp450 sequences to be given a new classification, CYP320A1. The gene was initially identified as down-regulated in partially resistant snails in response to S. mansoni infection; its amplification by RT-PCR in both totally resistant and susceptible snail lines exposed to infection, and in all tissues examined, suggests ubiquitous expression. Characterization of the first cyp450 from B. glabrata is significant in understanding the evolution of these metabolically important proteins.