9 results for ESTs
in National Center for Biotechnology Information - NCBI
Abstract:
There is no control over the information provided with sequences when they are deposited in the sequence databases. Consequently, mistakes can seed the incorrect annotation of other sequences. Grouping genes into families and applying controlled annotation overcomes the problems of incorrect annotation associated with individual sequences. Two databases (http://www.mendel.ac.uk) were created to apply controlled annotation to plant genes and plant ESTs: Mendel-GFDb is a database of plant protein (gene) families based on gapped-BLAST analysis of all sequences in the SWISS-PROT family of databases. Sequences are aligned (ClustalW) and identical and similar residues shaded. The families are visually curated to ensure that one or more criteria, for example overall relatedness and/or domain similarity, relate all sequences within a family. Sequence families are assigned a ‘Gene Family Number’ and a unified description is developed which best describes the family and its members. If authority exists, the gene family is assigned a ‘Gene Family Name’. This information is placed in Mendel-GFDb. Mendel-ESTS is primarily a database of plant ESTs, which have been compared to Mendel-GFDb, completely sequenced genomes and domain databases. This approach associated ESTs with individual sequences and with the controlled annotation of gene families and protein domains, and this information is placed in Mendel-ESTS. The controlled annotation applied to genes and ESTs provides a basis from which a plant transcription database can be developed.
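As a rough illustration of the grouping step described in this abstract, the sketch below places sequences in the same family whenever they are connected by a chain of pairwise similarity hits (single-linkage grouping). This is an assumption made for illustration only, not the published Mendel-GFDb procedure, which also relies on ClustalW alignment and visual curation. The function name `group_into_families`, the sequence identifiers and the hit list are all hypothetical.

```python
def group_into_families(sequence_ids, hit_pairs):
    """Group sequences into families by single linkage over pairwise hits.

    sequence_ids: iterable of sequence identifiers (hypothetical names).
    hit_pairs: iterable of (id_a, id_b) tuples, e.g. pairs whose BLAST hit
               passed some significance cutoff chosen by the user.
    Returns a sorted list of families, each a sorted list of identifiers.
    """
    sequence_ids = list(sequence_ids)
    parent = {s: s for s in sequence_ids}

    def find(x):
        # Follow parent links to the family representative,
        # shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the two families joined by each similarity hit.
    for a, b in hit_pairs:
        parent[find(a)] = find(b)

    # Collect members under their representative.
    families = {}
    for s in sequence_ids:
        families.setdefault(find(s), []).append(s)
    return sorted(sorted(members) for members in families.values())


# Hypothetical example: P1-P2-P3 are chained by hits, P4-P5 form a second family.
ids = ["P1", "P2", "P3", "P4", "P5"]
hits = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
families = group_into_families(ids, hits)
```

Single linkage tends to merge families through multi-domain proteins, which is one reason a manual curation step such as the one the abstract describes remains useful.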
Abstract:
Vegetable oils that contain fatty acids with conjugated double bonds, such as tung oil, are valuable drying agents in paints, varnishes, and inks. Although several reaction mechanisms have been proposed, little is known of the biosynthetic origin of conjugated double bonds in plant fatty acids. An expressed sequence tag (EST) approach was undertaken to characterize the enzymatic basis for the formation of the conjugated double bonds of α-eleostearic (18:3Δ9cis,11trans,13trans) and α-parinaric (18:4Δ9cis,11trans,13trans,15cis) acids. Approximately 3,000 ESTs were generated from cDNA libraries prepared from developing seeds of Momordica charantia and Impatiens balsamina, tissues that accumulate large amounts of α-eleostearic and α-parinaric acids, respectively. From ESTs of both species, a class of cDNAs encoding a diverged form of the Δ12-oleic acid desaturase was identified. Expression of full-length cDNAs for the Momordica (MomoFadX) and Impatiens (ImpFadX) enzymes in somatic soybean embryos resulted in the accumulation of α-eleostearic and α-parinaric acids, neither of which is present in untransformed soybean embryos. α-Eleostearic and α-parinaric acids together accounted for as much as 17% (wt/wt) of the total fatty acids of embryos expressing MomoFadX. These results demonstrate the ability to produce fatty acid components of high-value drying oils in transgenic plants. These findings also demonstrate a previously uncharacterized activity for Δ12-oleic acid desaturase-type enzymes that we have termed “conjugase.”
Abstract:
A rapidly growing area of genome research is the generation of expressed sequence tags (ESTs) in which large numbers of randomly selected cDNA clones are partially sequenced. The collection of ESTs reflects the level and complexity of gene expression in the sampled tissue. To date, the majority of plant ESTs are from nonwoody plants such as Arabidopsis, Brassica, maize, and rice. Here, we present a large-scale production of ESTs from the wood-forming tissues of two poplars, Populus tremula L. × tremuloides Michx. and Populus trichocarpa ‘Trichobel.’ The 5,692 ESTs analyzed represented a total of 3,719 unique transcripts for the two cDNA libraries. Putative functions could be assigned to 2,245 of these transcripts that corresponded to 820 protein functions. Of specific interest to forest biotechnology are the 4% of ESTs involved in various processes of cell wall formation, such as lignin and cellulose synthesis, 5% similar to developmental regulators and members of known signal transduction pathways, and 2% involved in hormone biosynthesis. An additional 12% of the ESTs showed no significant similarity to any other DNA or protein sequences in existing databases. The absence of these sequences from public databases may indicate a specific role for these proteins in wood formation. The cDNA libraries and the accompanying database are valuable resources for forest research directed toward understanding the genetic control of wood formation and future endeavors to modify wood and fiber properties for industrial use.
Abstract:
Expressed sequence tags (ESTs) are randomly sequenced cDNA clones. Currently, nearly 3 million human and 2 million mouse ESTs provide valuable resources that enable researchers to investigate the products of gene expression. The EST databases have proven to be useful tools for detecting homologous genes, for exon mapping, revealing differential splicing, etc. With the increasing availability of large amounts of poorly characterised eukaryotic (notably human) genomic sequence, ESTs have now become a vital tool for gene identification, sometimes yielding the only unambiguous evidence for the existence of a gene expression product. However, BLAST-based Web servers available to the general user have not kept pace with these developments and do not provide appropriate tools for querying EST databases with large highly spliced genes, often spanning 50 000–100 000 bases or more. Here we describe Gene2EST (http://woody.embl-heidelberg.de/gene2est/), a server that brings together a set of tools enabling efficient retrieval of ESTs matching large DNA queries and their subsequent analysis. RepeatMasker is used to mask dispersed repetitive sequences (such as Alu elements) in the query, BLAST2 for searching EST databases and Artemis for graphical display of the findings. Gene2EST combines these components into a Web resource targeted at the researcher who wishes to study one or a few genes to a high level of detail.
Abstract:
The objective of the AsMamDB database is to facilitate the systematic study of alternatively spliced genes of mammals. Version 1.0 of AsMamDB contains 1563 alternatively spliced genes of human, mouse and rat, each associated with a cluster of nucleotide sequences. The main information provided by AsMamDB includes gene alternative splicing patterns, gene structures, locations in chromosomes, products of genes and tissues where they are expressed. Alternative splicing patterns are represented by multiple alignments of various gene transcripts and by graphs of their topological structures. Gene structures are illustrated by the distributions of exons, introns and various regulatory elements. There are 4204 DNAs, 3977 mRNAs, 8989 CDSs and 126 931 ESTs in the current database. More than 130 000 GenBank entries are covered and 4443 MEDLINE records are linked. DNA, mRNA, exon, intron and relevant regulatory element sequences are provided in FASTA format. More information can be obtained by using the web-based multiple alignment tool Asalign and various category lists. AsMamDB can be accessed at http://166.111.30.65/ASMAMDB.html.
Abstract:
The release of vast quantities of DNA sequence data by large-scale genome and expressed sequence tag (EST) projects underlines the necessity for the development of efficient and inexpensive ways to link sequence databases with temporal and spatial expression profiles. Here we demonstrate the power of linking cDNA sequence data (including EST sequences) with transcript profiles revealed by cDNA-AFLP, a highly reproducible differential display method based on restriction enzyme digests and selective amplification under high stringency conditions. We have developed a computer program (GenEST) that predicts the sizes of virtual transcript-derived fragments (TDFs) of in silico-digested cDNA sequences retrieved from databases. The vast majority of the resulting virtual TDFs could be traced back among the thousands of TDFs displayed on cDNA-AFLP gels. Sequencing of the corresponding bands excised from cDNA-AFLP gels revealed no inconsistencies. As a consequence, cDNA sequence databases can be screened very efficiently to identify genes with relevant expression profiles. The other way round, it is possible to switch from cDNA-AFLP gels to sequences in the databases. Using the restriction enzyme recognition sites, the primer extensions and the estimated TDF size as identifiers, the DNA sequence(s) corresponding to a TDF with an interesting expression pattern can be identified. In this paper we show examples in both directions by analyzing the plant parasitic nematode Globodera rostochiensis. Various novel pathogenicity factors were identified by combining ESTs from the infective stage juveniles with expression profiles of ∼4000 genes in five developmental stages produced by cDNA-AFLP.
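The idea of predicting virtual TDF sizes from an in silico digest can be sketched as follows. This is a minimal illustration under stated assumptions, not the GenEST program itself. It assumes a two-enzyme digest in which only fragments flanked by one site of each enzyme are amplified, it measures length between the starts of the recognition sites rather than the exact cut positions, and it adds a hypothetical `adapter_len` parameter for adapter and primer sequence. The function names and the example sequence are invented for illustration.

```python
def find_sites(seq, site):
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def virtual_tdfs(seq, site_a, site_b, adapter_len=0):
    """Predict virtual transcript-derived fragments for a two-enzyme digest.

    Returns (start, end, size) tuples for each pair of neighbouring
    recognition sites that belong to different enzymes. `size` is the
    distance between the two site starts plus `adapter_len` (an assumed
    constant for adapters/primers; exact cut offsets are ignored).
    """
    seq = seq.upper()
    # Merge the site positions of both enzymes and order them along the cDNA.
    sites = sorted(
        [(p, "A") for p in find_sites(seq, site_a)]
        + [(p, "B") for p in find_sites(seq, site_b)]
    )
    fragments = []
    # Neighbouring sites have no other site between them, so each pair
    # delimits one digestion fragment; keep only mixed-enzyme pairs.
    for (pos1, enz1), (pos2, enz2) in zip(sites, sites[1:]):
        if enz1 != enz2:
            fragments.append((pos1, pos2, pos2 - pos1 + adapter_len))
    return fragments


# Invented cDNA with GAATTC (EcoRI) and TTAA (MseI) recognition sites.
cdna = "AAGAATTCAAAATTAACCCCGAATTCGG"
tdfs = virtual_tdfs(cdna, "GAATTC", "TTAA")
```

The abstract does not name the enzymes used; EcoRI and MseI appear here only because they are a familiar AFLP pair. Matching predicted sizes against gel bands would additionally need a size tolerance and the selective primer extensions, which this sketch leaves out.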
Abstract:
Sequence comparisons of genomes or expressed sequence tags (ESTs) from related organisms provide insight into functional conservation and diversification. We compare the sequences of ESTs from the male accessory gland of Drosophila simulans to their orthologs in its close relative Drosophila melanogaster, and demonstrate rapid divergence of many of these reproductive genes. Nineteen (∼11%) of 176 independent genes identified in the EST screen contain protein-coding regions with an excess of nonsynonymous over synonymous changes, suggesting that their divergence has been accelerated by positive Darwinian selection. Genes that encode putative accessory gland-specific seminal fluid proteins had a significantly elevated level of nonsynonymous substitution relative to nonaccessory gland-specific genes. With the 57 new accessory gland genes reported here, we predict that ∼90% of the male accessory gland genes have been identified. The evolutionary EST approach applied here to identify putative targets of adaptive evolution is readily applicable to other tissues and organisms.
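To make the distinction between nonsynonymous and synonymous changes concrete, the sketch below counts codon differences between two aligned, in-frame coding sequences and classifies each by whether the encoded amino acid changes. It is a deliberately crude count under stated assumptions: it uses the standard genetic code, treats each differing codon as one change, and does not normalise by the number of synonymous and nonsynonymous sites, so it is not the rate comparison used in the study. The function name and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(cds_a, cds_b):
    """Count (synonymous, nonsynonymous) codon differences.

    Both inputs must be gap-free, in-frame coding sequences of equal length
    (an assumption of this sketch; real orthologs need a codon alignment).
    """
    cds_a, cds_b = cds_a.upper(), cds_b.upper()
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must have equal length, a multiple of 3")
    synonymous = nonsynonymous = 0
    for i in range(0, len(cds_a), 3):
        codon_a, codon_b = cds_a[i:i + 3], cds_b[i:i + 3]
        if codon_a == codon_b:
            continue
        # Same amino acid: synonymous change; different: nonsynonymous.
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Invented example: GCT -> GCC keeps alanine, AAA -> AGA changes Lys to Arg.
changes = count_codon_changes("ATGGCTAAA", "ATGGCCAGA")
```

A proper test for positive selection compares substitutions per site, for which established methods and packages exist; the raw counts here only show what is being compared.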
Abstract:
We describe a method to screen pools of DNA from multiple transposon lines for insertions in many genes simultaneously. We use thermal asymmetric interlaced–PCR, a hemispecific PCR amplification protocol that combines nested, insertion-specific primers with degenerate primers, to amplify DNA flanking the transposons. In reconstruction experiments with previously characterized Arabidopsis lines carrying insertions of the maize Dissociation (Ds) transposon, we show that fluorescently labeled, transposon-flanking fragments overlapping ORFs hybridize to cognate expressed sequence tags (ESTs) on a DNA microarray. We further show that insertions can be detected in DNA pools from as many as 100 plants representing different transposon lines and that all of the tested, transposon-disrupted genes whose flanking fragments can be amplified individually also can be detected when amplified from the pool. The ability of a transposon-flanking fragment to hybridize declines rapidly with decreasing homology to the spotted DNA fragment, so that only ESTs with >90% homology to the transposon-disrupted gene exhibit significant cross-hybridization. Because thermal asymmetric interlaced–PCR fragments tend to be short, use of the present method favors recovery of insertions in and near genes. We apply the technique to screening pools of new Ds lines using cDNA microarrays containing ESTs for ≈1,000 stress-induced and -repressed Arabidopsis genes.
Abstract:
The physical map of the 100-Mb Caenorhabditis elegans genome consists of 17,500 cosmids and 3500 yeast artificial chromosomes (YACs). A total of 22.5 Mb has been sequenced, with the remainder expected by 1998. A further 15.5 Mb of unfinished sequence is freely available online: because the areas sequenced so far are relatively gene rich, about half the 13,000 genes can now be scanned. More than a quarter of the genes are represented by expressed sequence tags (ESTs). All information pertaining to the genome is publicly available in the ACeDB data base.