5 results for geometrically growing sequence
in National Center for Biotechnology Information - NCBI
Abstract:
Eight novel families of miniature inverted repeat transposable elements (MITEs) were discovered in the African malaria mosquito, Anopheles gambiae, by using new software designed to rapidly identify MITE-like sequences based on their structural characteristics. Divergent subfamilies have been found in two families. Past mobility was demonstrated by evidence of MITE insertions that resulted in the duplication of specific TA, TAA, or 8-bp targets. Some of these MITEs share the same target duplications and similar terminal sequences with MITEs and other DNA transposons in human and other organisms. MITEs in A. gambiae range from 40 to 1340 copies per genome, much less abundant than MITEs in the yellow fever mosquito, Aedes aegypti. Statistical analyses suggest that most A. gambiae MITEs are in highly AT-rich regions, many of which are closely associated with each other. The analyses of these novel MITEs underscored interesting questions regarding their diversity, origin, evolution, and relationships to the host genomes. The discovery of diverse families of MITEs in A. gambiae has important practical implications in light of current efforts to control malaria by replacing vector mosquitoes with genetically modified refractory mosquitoes. Finally, the systematic approach to rapidly identify novel MITEs should have broad applications for the analysis of the ever-growing sequence databases of a wide range of organisms.
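The abstract does not describe how the software works, so the following is only a minimal sketch of what a structure-based check could look like. It assumes that the "structural characteristics" include terminal inverted repeats (as the name MITE implies) flanked by a duplicated target such as TA or TAA. The function names, the `tir_len` parameter, and the example sequences are hypothetical and are not taken from the authors' program.

```python
# Illustrative sketch only; not the software described in the abstract.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def looks_like_mite(region, tsd="TA", tir_len=4):
    """Check a candidate region of the form TSD + element + TSD.

    Returns True when the region starts and ends with the same target
    site duplication (tsd) and the first tir_len bases of the element
    are the reverse complement of its last tir_len bases.
    tir_len is an assumed, arbitrary cutoff chosen for illustration.
    """
    n = len(tsd)
    if len(region) < 2 * n + 2 * tir_len:
        return False
    if not (region.startswith(tsd) and region.endswith(tsd)):
        return False
    element = region[n:len(region) - n]
    return element[:tir_len] == reverse_complement(element[-tir_len:])


# Hypothetical candidate: TA duplication around an element whose ends
# (CAGT ... ACTG) form an inverted repeat.
candidate = "TA" + "CAGTAAAATTTTACTG" + "TA"
print(looks_like_mite(candidate))
```

A real scan would slide such a check across a genome, allow mismatches in the repeats, and handle the 8-bp targets mentioned in the abstract; none of that is attempted here.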
Abstract:
A rapidly growing area of genome research is the generation of expressed sequence tags (ESTs) in which large numbers of randomly selected cDNA clones are partially sequenced. The collection of ESTs reflects the level and complexity of gene expression in the sampled tissue. To date, the majority of plant ESTs are from nonwoody plants such as Arabidopsis, Brassica, maize, and rice. Here, we present a large-scale production of ESTs from the wood-forming tissues of two poplars, Populus tremula L. × tremuloides Michx. and Populus trichocarpa ‘Trichobel.’ The 5,692 ESTs analyzed represented a total of 3,719 unique transcripts for the two cDNA libraries. Putative functions could be assigned to 2,245 of these transcripts that corresponded to 820 protein functions. Of specific interest to forest biotechnology are the 4% of ESTs involved in various processes of cell wall formation, such as lignin and cellulose synthesis, 5% similar to developmental regulators and members of known signal transduction pathways, and 2% involved in hormone biosynthesis. An additional 12% of the ESTs showed no significant similarity to any other DNA or protein sequences in existing databases. The absence of these sequences from public databases may indicate a specific role for these proteins in wood formation. The cDNA libraries and the accompanying database are valuable resources for forest research directed toward understanding the genetic control of wood formation and future endeavors to modify wood and fiber properties for industrial use.
Abstract:
STACK is a tool for detection and visualisation of expressed transcript variation in the context of developmental and pathological states. The datasystem organises and reconstructs human transcripts from available public data in the context of expression state. The expression state of a transcript can include developmental state, pathological association, site of expression and isoform of expressed transcript. STACK consensus transcripts are reconstructed from clusters that capture and reflect the growing evidence of transcript diversity. The comprehensive capture of transcript variants is achieved by the use of a novel clustering approach that is tolerant of sub-sequence diversity and does not rely on pairwise alignment. This is in contrast with other gene indexing projects. STACK is generated at least four times a year and represents the exhaustive processing of all publicly available human EST data extracted from GenBank. This processed information can be explored through 15 tissue-specific categories, a disease-related category and a whole-body index and is accessible via WWW at http://www.sanbi.ac.za/Dbases.html. STACK represents a broadly applicable resource, as it is the only reconstructed transcript database for which the tools for its generation are also broadly available (http://www.sanbi.ac.za/CODES).
Abstract:
Current evidence on the long-term evolutionary effect of insertion of sequence elements into gene regions is reviewed, restricted to cases where a sequence derived from a past insertion participates in the regulation of expression of a useful gene. Ten such examples in eukaryotes demonstrate that segments of repetitive DNA or mobile elements have been inserted in the past in gene regions, have been preserved, sometimes modified by selection, and now affect control of transcription of the adjacent gene. Included are only examples in which transcription control was modified by the insert. Several cases in which merely transcription initiation occurred in the insert were set aside. Two of the examples involved the long terminal repeats of mammalian endogenous retroviruses. Another two examples were control of transcription by repeated sequence inserts in sea urchin genomes. There are now six published examples in which Alu sequences were inserted long ago into human gene regions, were modified, and now are central in control/enhancement of transcription. The number of published examples of Alu sequences affecting gene control has grown threefold in the last year and is likely to continue growing. Taken together, all of these examples show that the insertion of sequence elements in the genome has been a significant source of regulatory variation in evolution.
Abstract:
Expansins are unusual proteins discovered by virtue of their ability to mediate cell wall extension in plants. We identified cDNA clones for two cucumber expansins on the basis of peptide sequences of proteins purified from cucumber hypocotyls. The expansin cDNAs encode related proteins with signal peptides predicted to direct protein secretion to the cell wall. Northern blot analysis showed moderate transcript abundance in the growing region of the hypocotyl and no detectable transcripts in the nongrowing region. Rice and Arabidopsis expansin cDNAs were identified from collections of anonymous cDNAs (expressed sequence tags). Sequence comparisons indicate at least four distinct expansin cDNAs in rice and at least six in Arabidopsis. Expansins are highly conserved in size and sequence (60-87% amino acid sequence identity and 75-95% similarity between any pairwise comparison), and phylogenetic trees indicate that this multigene family formed before the evolutionary divergence of monocotyledons and dicotyledons. Sequence and motif analyses show no similarities to known functional domains that might account for expansin action on wall extension. A series of highly conserved tryptophans may function in expansin binding to cellulose or other glycans. The high conservation of this multigene family indicates that the mechanism by which expansins promote wall extension tolerates little variation in protein structure.