101 results for Expressed sequence tag (EST)

in National Center for Biotechnology Information - NCBI





Expressed sequence tags (ESTs) are randomly sequenced cDNA clones. Currently, nearly 3 million human and 2 million mouse ESTs provide valuable resources that enable researchers to investigate the products of gene expression. The EST databases have proven to be useful tools for detecting homologous genes, for exon mapping, revealing differential splicing, etc. With the increasing availability of large amounts of poorly characterised eukaryotic (notably human) genomic sequence, ESTs have now become a vital tool for gene identification, sometimes yielding the only unambiguous evidence for the existence of a gene expression product. However, BLAST-based Web servers available to the general user have not kept pace with these developments and do not provide appropriate tools for querying EST databases with large highly spliced genes, often spanning 50 000–100 000 bases or more. Here we describe Gene2EST (http://woody.embl-heidelberg.de/gene2est/), a server that brings together a set of tools enabling efficient retrieval of ESTs matching large DNA queries and their subsequent analysis. RepeatMasker is used to mask dispersed repetitive sequences (such as Alu elements) in the query, BLAST2 for searching EST databases and Artemis for graphical display of the findings. Gene2EST combines these components into a Web resource targeted at the researcher who wishes to study one or a few genes to a high level of detail.





Nerve growth factor-induced differentiation of adrenal chromaffin PC-12 cells to a neuronal phenotype involves alterations in gene expression and represents a model system to study neuronal differentiation. We have used the expressed-sequence-tag approach to identify approximately 600 differentially expressed mRNAs in untreated and nerve growth factor-treated PC-12 cells that encode proteins with diverse structural and biochemical functions. Many of these mRNAs encode proteins belonging to cellular pathways not previously known to be regulated by nerve growth factor. Comparative expressed-sequence-tag analysis provides a basis for surveying global changes in gene-expression patterns in response to biological signals at an unprecedented scale, is a powerful tool for identifying potential interactions between different cellular pathways, and allows the gene-expression profiles of individual genes belonging to a particular pathway to be followed.





STACK is a tool for detection and visualisation of expressed transcript variation in the context of developmental and pathological states. The data system organises and reconstructs human transcripts from available public data in the context of expression state. The expression state of a transcript can include developmental state, pathological association, site of expression and isoform of expressed transcript. STACK consensus transcripts are reconstructed from clusters that capture and reflect the growing evidence of transcript diversity. The comprehensive capture of transcript variants is achieved by the use of a novel clustering approach that is tolerant of sub-sequence diversity and does not rely on pairwise alignment. This is in contrast with other gene indexing projects. STACK is generated at least four times a year and represents the exhaustive processing of all publicly available human EST data extracted from GenBank. This processed information can be explored through 15 tissue-specific categories, a disease-related category and a whole-body index and is accessible via WWW at http://www.sanbi.ac.za/Dbases.html. STACK represents a broadly applicable resource, as it is the only reconstructed transcript database for which the tools for its generation are also broadly available (http://www.sanbi.ac.za/CODES).





A rapidly growing area of genome research is the generation of expressed sequence tags (ESTs) in which large numbers of randomly selected cDNA clones are partially sequenced. The collection of ESTs reflects the level and complexity of gene expression in the sampled tissue. To date, the majority of plant ESTs are from nonwoody plants such as Arabidopsis, Brassica, maize, and rice. Here, we present a large-scale production of ESTs from the wood-forming tissues of two poplars, Populus tremula L. × tremuloides Michx. and Populus trichocarpa ‘Trichobel.’ The 5,692 ESTs analyzed represented a total of 3,719 unique transcripts for the two cDNA libraries. Putative functions could be assigned to 2,245 of these transcripts that corresponded to 820 protein functions. Of specific interest to forest biotechnology are the 4% of ESTs involved in various processes of cell wall formation, such as lignin and cellulose synthesis, 5% similar to developmental regulators and members of known signal transduction pathways, and 2% involved in hormone biosynthesis. An additional 12% of the ESTs showed no significant similarity to any other DNA or protein sequences in existing databases. The absence of these sequences from public databases may indicate a specific role for these proteins in wood formation. The cDNA libraries and the accompanying database are valuable resources for forest research directed toward understanding the genetic control of wood formation and future endeavors to modify wood and fiber properties for industrial use.





Natural killer (NK) cells express C-type lectin-like receptors, encoded in the NK gene complex, that interact with major histocompatibility complex class I and either inhibit or activate functional activity. Human NK cells express heterodimers consisting of CD94 and NKG2 family molecules, whereas murine NK cells express homodimers belonging to the Ly-49 family. The corresponding orthologues for other species, however, have not been described. In this report, we used probes derived from the expressed sequence tag database to clone C57BL/6-derived cDNAs homologous to human NKG2-D and CD94. Among normal tissues, murine NKG2-D and CD94 transcripts are highly expressed only in activated NK cells, including both Ly-49A+ and Ly-49A− subpopulations. Additionally, mNKG2-D is expressed in murine NK cell clones KY-1 and KY-2, whereas mCD94 expression is observed only in KY-1 cells but not KY-2. Last, we have finely mapped the physical location of the Cd94 (centromeric) and Nkg2d (telomeric) genes between Cd69 and the Ly49 cluster in the NK complex. Thus, these data indicate the expanding complexity of the NK complex and the corresponding repertoire of C-type lectin-like receptors on murine NK cells.





The release of vast quantities of DNA sequence data by large-scale genome and expressed sequence tag (EST) projects underlines the necessity for the development of efficient and inexpensive ways to link sequence databases with temporal and spatial expression profiles. Here we demonstrate the power of linking cDNA sequence data (including EST sequences) with transcript profiles revealed by cDNA-AFLP, a highly reproducible differential display method based on restriction enzyme digests and selective amplification under high stringency conditions. We have developed a computer program (GenEST) that predicts the sizes of virtual transcript-derived fragments (TDFs) of in silico-digested cDNA sequences retrieved from databases. The vast majority of the resulting virtual TDFs could be traced back among the thousands of TDFs displayed on cDNA-AFLP gels. Sequencing of the corresponding bands excised from cDNA-AFLP gels revealed no inconsistencies. As a consequence, cDNA sequence databases can be screened very efficiently to identify genes with relevant expression profiles. Conversely, it is possible to move from cDNA-AFLP gels to sequences in the databases. Using the restriction enzyme recognition sites, the primer extensions and the estimated TDF size as identifiers, the DNA sequence(s) corresponding to a TDF with an interesting expression pattern can be identified. In this paper we show examples in both directions by analyzing the plant parasitic nematode Globodera rostochiensis. Various novel pathogenicity factors were identified by combining ESTs from the infective stage juveniles with expression profiles of ∼4000 genes in five developmental stages produced by cDNA-AFLP.
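
The idea of predicting virtual TDF sizes from an in silico digest can be illustrated with a minimal Python sketch. This is not the GenEST program itself: the enzyme pair (EcoRI and TaqI), the function names, and the `adapter_bp` parameter are assumptions chosen for illustration, and the abstract does not state which enzymes or adapter lengths were used. The sketch locates the cut positions of two restriction enzymes on a cDNA and reports every fragment bounded by one cut of each enzyme.

```python
import re

# Illustrative enzyme table (an assumption, not taken from the abstract):
# recognition site and cut offset on the top strand.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "TaqI": ("TCGA", 1),     # T^CGA
}

def cut_positions(seq, enzyme):
    """Return top-strand cut positions of one enzyme, including overlapping sites."""
    site, offset = ENZYMES[enzyme]
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]

def virtual_tdfs(seq, enzyme_a, enzyme_b, adapter_bp=0):
    """Return (start, end, size) for fragments flanked by one cut of each enzyme.

    adapter_bp is a hypothetical constant added to the size to account for
    ligated adapters; set it to match the protocol actually used.
    """
    cuts = sorted(
        [(p, enzyme_a) for p in cut_positions(seq, enzyme_a)]
        + [(p, enzyme_b) for p in cut_positions(seq, enzyme_b)]
    )
    tdfs = []
    # Only adjacent cuts made by different enzymes give an amplifiable fragment.
    for (p1, e1), (p2, e2) in zip(cuts, cuts[1:]):
        if e1 != e2:
            tdfs.append((p1, p2, p2 - p1 + adapter_bp))
    return tdfs

# Toy cDNA with one EcoRI site followed by one TaqI site.
cdna = "AAAA" + "GAATTC" + "C" * 20 + "TCGA" + "AAAA"
print(virtual_tdfs(cdna, "EcoRI", "TaqI"))
```

In a real analysis the predicted sizes would be compared with band positions on the gel, and the bases next to each site would be checked against the selective primer extensions, as the abstract describes.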





The root hair is a specialized cell type involved in water and nutrient uptake in plants. In legumes the root hair is also the primary site of recognition and infection by symbiotic nitrogen-fixing Rhizobium bacteria. We have studied the root hairs of Medicago truncatula, which is emerging as an increasingly important model legume for studies of symbiotic nodulation. However, only 27 genes from M. truncatula were represented in GenBank/EMBL as of October, 1997. We report here the construction of a root-hair-enriched cDNA library and single-pass sequencing of randomly selected clones. Expressed sequence tags (899 total, 603 of which have homology to known genes) were generated and made available on the Internet. We believe that the database and the associated DNA materials will provide a useful resource to the community of scientists studying the biology of roots, root tips, root hairs, and nodulation.





The CCAAT motif is found in the promoters of many eukaryotic genes. In yeast a single complex of three proteins, termed HAP2, HAP3, and HAP5, binds to this sequence, and in mammals the three components of the equivalent complex (called variously NF-Y, CBF, or CP1) are also represented by single genes. Here we report the presence of multiple genes for each of the components of the CCAAT-binding complex, HAP2,3,5, from Arabidopsis. Three independent Arabidopsis HAP subunit 2 (AtHAP2) cDNAs were cloned by functional complementation of a yeast hap2 mutant, and two independent forms each of AtHAP3 and AtHAP5 cDNAs were detected in the expressed sequence tag database. Additional homologs (two of AtHAP3 and one of AtHAP5) have been identified from available Arabidopsis genomic sequences. Northern-blot analysis indicated ubiquitous expression for each AtHAP2 and AtHAP5 cDNA in a range of tissues, whereas expression of each AtHAP3 cDNA was under developmental and/or environmental regulation. The unexpected presence of multiple forms of each HAP homolog in Arabidopsis, compared with the single genes in yeast and vertebrates, suggests that the HAP2,3,5 complex may play diverse roles in gene transcription in higher plants.





A cDNA encoding human gamma-glutamyl hydrolase has been identified by searching an expressed sequence tag data base and using rat gamma-glutamyl hydrolase cDNA as the query sequence. The cDNA encodes a 318-amino acid protein of Mr 35,960. The deduced amino acid sequence of human gamma-glutamyl hydrolase shows 67% identity to that of rat gamma-glutamyl hydrolase. In both rat and human the 24 amino acids preceding the N terminus constitute a structural motif that is analogous to a leader or signal sequence. There are four consensus asparagine glycosylation sites in the human sequence, with three of them conserved in the rat enzyme. Expression of both the human and rat cDNA in Escherichia coli produced antigenically related proteins with enzyme activities characteristic of the native human and rat enzymes, respectively, when methotrexate di- or pentaglutamate were used as substrates. With the latter substrate the rat enzyme cleaved the innermost gamma-glutamyl linkage resulting in the sole production of methotrexate as the pteroyl containing product. The human enzyme differed in that it produced methotrexate tetraglutamate initially, followed by the triglutamate, and then the diglutamate and methotrexate. Hence the rat enzyme is an endopeptidase with methotrexate pentaglutamate as substrate, whereas the human enzyme exhibits exopeptidase activity. Another difference is that the expressed rat enzyme is equally active on methotrexate di- and pentaglutamate whereas the human enzyme has severalfold greater activity on methotrexate pentaglutamate compared with the diglutamate. These properties are consistent with the enzymes derived from human and rat sources.





Molecular and fragment ion data of intact 8- to 43-kDa proteins from electrospray Fourier-transform tandem mass spectrometry are matched against the corresponding data in sequence data bases. Extending the sequence tag concept of Mann and Wilm for matching peptides, a partial amino acid sequence in the unknown is first identified from the mass differences of a series of fragment ions, and the mass position of this sequence is defined from molecular weight and the fragment ion masses. For three studied proteins, a single sequence tag retrieved only the correct protein from the data base; a fourth protein required the input of two sequence tags. However, three of the data base proteins differed by having an extra methionine or by missing an acetyl or heme substitution. The positions of these modifications in the protein examined were greatly restricted by the mass differences of its molecular and fragment ions versus those of the data base. To characterize the primary structure of an unknown represented in the data base, this method is fast and specific and does not require prior enzymatic or chemical degradation.
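
Reading a partial sequence from the mass differences of consecutive fragment ions can be sketched in a few lines of Python. This is a simplified illustration and not the authors' software: the function name, the tolerance, and the reduced residue table are assumptions, and monoisotopic residue masses are used here although work on intact proteins may rely on average masses instead.

```python
# Monoisotopic residue masses in Da for a subset of amino acids.
# Leucine and isoleucine have identical mass and are reported together.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
    "T": 101.04768,
    "L/I": 113.08406,
    "D": 115.02694,
    "E": 129.04259,
    "F": 147.06841,
}

def sequence_tag(fragment_masses, tol=0.01):
    """Infer a tag from differences between consecutive fragment ion masses.

    Returns one entry per mass gap: a residue label, or "?" when no residue
    in the table matches within tol (a hypothetical tolerance in Da).
    The read direction depends on which ion series the masses come from.
    """
    masses = sorted(fragment_masses)
    tag = []
    for low, high in zip(masses, masses[1:]):
        diff = high - low
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(diff - m) <= tol]
        tag.append(matches[0] if matches else "?")
    return tag

# Toy ladder: each step adds one residue mass (G, then A, then V).
ladder = [1000.0, 1057.02146, 1128.05857, 1227.12698]
print(sequence_tag(ladder))
```

The tag alone is not the whole identifier: as the abstract notes, its mass position within the protein, defined from the molecular weight and the fragment ion masses, is also used when searching the database.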





We report the characterization of a maize Wee1 homologue and its expression in developing endosperm. Using a 0.8-kb cDNA from an expressed sequence tag project, we isolated a 1.6-kb cDNA (ZmWee1), which encodes a protein of 403 aa with a calculated molecular size of 45.6 kDa. The deduced amino acid sequence shows 50% identity to the protein kinase domain of human Wee1. Overexpression of ZmWee1 in Schizosaccharomyces pombe inhibited cell division and caused the cells to enlarge significantly. Recombinant ZmWee1 obtained from Escherichia coli is able to inhibit the activity of p13suc1-adsorbed cyclin-dependent kinase from maize. ZmWee1 is encoded by a single gene at a locus on the long arm of chromosome 4. RNA gel blots showed the ZmWee1 transcript is about 2.4 kb in length and that its abundance reaches a maximum 15 days after pollination in endosperm tissue. High levels of expression of ZmWee1 at this stage of endosperm development imply that ZmWee1 plays a role in endoreduplication. Our results show that control of cyclin-dependent kinase activity by Wee1 is conserved among eukaryotes, from fungi to animals and plants.





Protease-activated receptors 1–3 (PAR1, PAR2, and PAR3) are members of a unique G protein-coupled receptor family. They are characterized by a tethered peptide ligand at the extracellular amino terminus that is generated by minor proteolysis. A partial cDNA sequence of a fourth member of this family (PAR4) was identified in an expressed sequence tag database, and the full-length cDNA clone has been isolated from a lymphoma Daudi cell cDNA library. The ORF codes for a seven transmembrane domain protein of 385 amino acids with 33% amino acid sequence identity with PAR1, PAR2, and PAR3. A putative protease cleavage site (Arg-47/Gly-48) was identified within the extracellular amino terminus. COS cells transiently transfected with PAR4 resulted in the formation of intracellular inositol triphosphate when treated with either thrombin or trypsin. A PAR4 mutant in which the Arg-47 was replaced with Ala did not respond to thrombin or trypsin. A hexapeptide (GYPGQV) representing the newly exposed tethered ligand from the amino terminus of PAR4 after proteolysis by thrombin activated COS cells transfected with either wild-type or the mutant PAR4. Northern blot showed that PAR4 mRNA was expressed in a number of human tissues, with high levels being present in lung, pancreas, thyroid, testis, and small intestine. By fluorescence in situ hybridization, the human PAR4 gene was mapped to chromosome 19p12.





Cathepsin B (CTSB) is overexpressed in tumors of the lung, prostate, colon, breast, and stomach. However, evidence of primary genomic alterations in the CTSB gene during tumor initiation or progression has been lacking. We have found a novel amplicon at 8p22–23 that results in CTSB overexpression in esophageal adenocarcinoma. Amplified genomic NotI–HinfI fragments were identified by two-dimensional DNA electrophoresis. Two amplified fragments (D4 and D5) were cloned and yielded unique sequences. Using bacterial artificial chromosome clones containing either D4 or D5, fluorescent in situ hybridization defined a single region of amplification involving chromosome bands 8p22–23. We investigated the candidate cancer-related gene CTSB, and potential coamplified genes from this region including farnesyl-diphosphate farnesyltransferase (FDFT1), arylamine N-acetyltransferase (NAT-1), lipoprotein lipase (LPL), and an uncharacterized expressed sequence tag (D8S503). Southern blot analysis of 66 esophageal adenocarcinomas demonstrated only CTSB and FDFT1 were consistently amplified in eight (12.1%) of the tumors. Neither NAT-1 nor LPL were amplified. Northern blot analysis showed overexpression of CTSB and FDFT1 mRNA in all six of the amplified esophageal adenocarcinomas analyzed. CTSB mRNA overexpression also was present in two of six nonamplified tumors analyzed. However, FDFT1 mRNA overexpression without amplification was not observed. Western blot analysis confirmed CTSB protein overexpression in tumor specimens with CTSB mRNA overexpression compared with either normal controls or tumors without mRNA overexpression. Abundant extracellular expression of CTSB protein was found in 29 of 40 (72.5%) of esophageal adenocarcinoma specimens by using immunohistochemical analysis. The finding of an amplicon at 8p22–23 resulting in CTSB gene amplification and overexpression supports an important role for CTSB in esophageal adenocarcinoma and possibly in other tumors.





Vegetable oils that contain fatty acids with conjugated double bonds, such as tung oil, are valuable drying agents in paints, varnishes, and inks. Although several reaction mechanisms have been proposed, little is known of the biosynthetic origin of conjugated double bonds in plant fatty acids. An expressed sequence tag (EST) approach was undertaken to characterize the enzymatic basis for the formation of the conjugated double bonds of α-eleostearic (18:3Δ9cis,11trans,13trans) and α-parinaric (18:4Δ9cis,11trans,13trans,15cis) acids. Approximately 3,000 ESTs were generated from cDNA libraries prepared from developing seeds of Momordica charantia and Impatiens balsamina, tissues that accumulate large amounts of α-eleostearic and α-parinaric acids, respectively. From ESTs of both species, a class of cDNAs encoding a diverged form of the Δ12-oleic acid desaturase was identified. Expression of full-length cDNAs for the Momordica (MomoFadX) and Impatiens (ImpFadX) enzymes in somatic soybean embryos resulted in the accumulation of α-eleostearic and α-parinaric acids, neither of which is present in untransformed soybean embryos. α-Eleostearic and α-parinaric acids together accounted for as much as 17% (wt/wt) of the total fatty acids of embryos expressing MomoFadX. These results demonstrate the ability to produce fatty acid components of high-value drying oils in transgenic plants. These findings also demonstrate a previously uncharacterized activity for Δ12-oleic acid desaturase-type enzymes that we have termed “conjugase.”





The Drosophila retinal degeneration C (rdgC) gene encodes an unusual protein serine/threonine phosphatase in that it contains at least two EF-hand motifs at its carboxy terminus. By a combination of large-scale sequencing of human retina cDNA clones and searches of expressed sequence tag and genomic DNA databases, we have identified two sequences in mammals [Protein Phosphatase with EF-hands-1 and 2 (PPEF-1 and PPEF-2)] and one in Caenorhabditis elegans (PPEF) that closely resemble rdgC. In the adult, PPEF-2 is expressed specifically in retinal rod photoreceptors and the pineal. In the retina, several isoforms of PPEF-2 are predicted to arise from differential splicing. The isoform that most closely resembles rdgC is localized to rod inner segments. Together with the recently described localization of PPEF-1 transcripts to primary somatosensory neurons and inner ear cells in the developing mouse, these data suggest that the PPEF family of protein serine/threonine phosphatases plays a specific and conserved role in diverse sensory neurons.