3 results for Positional Weight Matrices

in National Center for Biotechnology Information - NCBI


Relevance:

80.00%

Abstract:

A database (SpliceDB) of known mammalian splice site sequences has been developed. We extracted 43 337 splice pairs from mammalian divisions of the gene-centered Infogene database, including sites from incomplete or alternatively spliced genes. Known EST sequences supported 22 815 of them. After discarding sequences with putative errors and ambiguous location of splice junctions, the verified dataset includes 22 489 entries. Of these, 98.71% contain canonical GT–AG junctions (22 199 entries) and 0.56% have non-canonical GC–AG splice site pairs. The remainder (0.73%) falls into many small groups (each at most 0.05% of entries). We paid particular attention to non-canonical splice sites, which comprise 3.73% of GenBank annotated splice pairs. EST alignments allowed us to verify only the exonic part of splice sites. To check the conserved dinucleotides, we compared sequences of human non-canonical splice sites with sequences from the high throughput genome sequencing project (HTG). Out of 171 human non-canonical and EST-supported splice pairs, 156 (91.23%) had a clear match in the human HTG. After sequence analysis they can be classified as: 79 GC–AG pairs (of which one was an error that was corrected to GC–AG), 61 errors corrected to GT–AG canonical pairs, six AT–AC pairs (of which two were errors corrected to AT–AC), one case produced from a non-existent intron, seven cases found in HTG that were deposited to GenBank, and finally only two other cases of supported non-canonical splice pairs. The information about verified splice site sequences for canonical and non-canonical sites is presented in SpliceDB with the supporting evidence. We also built weight matrices for the major splice groups, which can be incorporated into gene prediction programs. SpliceDB is available at the computational genomic Web server of the Sanger Centre: http://genomic.sanger.ac.uk/spldb/SpliceDB.html and at http://www.softberry.com/spldb/SpliceDB.html.
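To illustrate what a weight matrix for a splice group might look like, the following is a minimal sketch of a log-odds position weight matrix built from aligned donor-site windows. The abstract does not describe how the SpliceDB matrices were computed, so the pseudocount, the uniform background frequency, the function names `build_pwm` and `score`, and the toy sequences are all assumptions made for this example only.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Return one {base: log2-odds} dict per column of the aligned sites.

    Assumes equal-length sequences over A/C/G/T and a uniform background.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {b: pseudocount for b in BASES}
        for s in sites:
            counts[s[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm

def score(pwm, seq):
    """Sum the per-position log-odds of seq (same length as the matrix)."""
    return sum(col[base] for col, base in zip(pwm, seq))

# Hypothetical toy windows (3 exonic + 6 intronic bases); NOT SpliceDB data.
toy_donor_sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TAGGTGAGT", "CAGGTAAGG"]
pwm = build_pwm(toy_donor_sites)

canonical = score(pwm, "CAGGTAAGT")   # keeps the GT dinucleotide
disrupted = score(pwm, "CAGCCAAGT")   # GT replaced by CC
```

A gene prediction program could compare such scores against a threshold when ranking candidate splice junctions; in this toy case the window that keeps the GT dinucleotide scores higher than the disrupted one.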

Relevance:

80.00%

Abstract:

rSNP_Guide is a novel curated database system for analysis of transcription factor (TF) binding to target sequences in regulatory gene regions altered by mutations. It accumulates experimental data on naturally occurring site variants in regulatory gene regions and on site-directed mutations. The database system also contains web tools for SNP analysis, namely an active applet that applies weight matrices to predict candidate regulatory sites altered by a mutation. The current version of rSNP_Guide is supplemented by six sub-databases: (i) rSNP_DB, on DNA–protein interaction caused by mutation; (ii) SYSTEM, on experimental systems; (iii) rSNP_BIB, on citations to original publications; (iv) SAMPLES, on experimentally identified sequences of known regulatory sites; (v) MATRIX, on weight matrices of known TF sites; (vi) rSNP_Report, on characteristic examples of successful rSNP_Tools implementation. These databases are useful for the analysis of natural SNPs and site-directed mutations. The databases are available through the Web, http://wwwmgs.bionet.nsc.ru/mgs/systems/rsnp/.
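The idea of applying a weight matrix to a mutation can be sketched as follows: score every window that overlaps the variant position on both the reference and the mutant sequence, then compare the best scores. This is only an illustration of the general technique, not the rSNP_Guide applet's actual procedure, which the abstract does not specify. The matrix `toy_pwm`, the sequence, and the function names are hypothetical.

```python
def best_window_score(pwm, seq, snp_pos):
    """Best matrix score among windows of len(pwm) that overlap snp_pos (0-based)."""
    width = len(pwm)
    first = max(0, snp_pos - width + 1)
    last = min(snp_pos, len(seq) - width)
    return max(
        sum(col[base] for col, base in zip(pwm, seq[i:i + width]))
        for i in range(first, last + 1)
    )

def snp_effect(pwm, ref_seq, snp_pos, alt_base):
    """Change in best overlapping score when ref_seq[snp_pos] becomes alt_base."""
    alt_seq = ref_seq[:snp_pos] + alt_base + ref_seq[snp_pos + 1:]
    return best_window_score(pwm, alt_seq, snp_pos) - best_window_score(pwm, ref_seq, snp_pos)

# Hypothetical 4-column log-odds matrix favouring the made-up motif TATA.
toy_pwm = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.5},
    {"A": 1.5, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.5},
    {"A": 1.5, "C": -1.0, "G": -1.0, "T": -1.0},
]

# Hypothetical reference sequence; the variant changes the T at index 5 to C.
delta = snp_effect(toy_pwm, "GGCTATAGGC", snp_pos=5, alt_base="C")
```

A negative `delta` suggests the mutation weakens the best candidate site overlapping that position, while a positive value suggests a site is strengthened or created.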

Relevance:

30.00%

Abstract:

A high-resolution physical and genetic map of a major fruit weight quantitative trait locus (QTL), fw2.2, has been constructed for a region of tomato chromosome 2. Using an F2 nearly isogenic line mapping population (3472 individuals) derived from Lycopersicon esculentum (domesticated tomato) × Lycopersicon pennellii (wild tomato), fw2.2 has been placed near TG91 and TG167, which have an interval distance of 0.13 ± 0.03 centimorgan. The physical distance between TG91 and TG167 was estimated to be ≤ 150 kb by pulsed-field gel electrophoresis of tomato DNA. A physical contig composed of six yeast artificial chromosomes (YACs) and encompassing fw2.2 was isolated. No rearrangements or chimerisms were detected within the YAC contig based on restriction fragment length polymorphism analysis using YAC-end sequences and anchored molecular markers from the high-resolution map. Based on genetic recombination events, fw2.2 could be narrowed down to a region less than 150 kb between molecular markers TG91 and HSF24 and included within two YACs: YAC264 (210 kb) and YAC355 (300 kb). This marks the first time, to our knowledge, that a QTL has been mapped with such precision and delimited to a segment of cloned DNA. The fact that the phenotypic effect of the fw2.2 QTL can be mapped to a small interval suggests that the action of this QTL is likely due to a single gene. The development of the high-resolution genetic map, in combination with the physical YAC contig, suggests that the gene responsible for this QTL and other QTLs in plants can be isolated using a positional cloning strategy. The cloning of fw2.2 will likely lead to a better understanding of the molecular biology of fruit development and to the genetic engineering of fruit size characteristics.