986 results for SHORT EXACT SEQUENCE


Relevance:

30.00%

Publisher:

Abstract:

The National Institute of Standards and Technology (NIST) has compiled and maintained a Short Tandem Repeat DNA Internet Database, commonly referred to as STRBase (http://www.cstl.nist.gov/biotech/strbase/), since 1997. This database is an information resource for the forensic DNA typing community with details on commonly used short tandem repeat (STR) DNA markers. STRBase consolidates and organizes the abundant literature on this subject to facilitate on-going efforts in DNA typing. Observed alleles and annotated sequence for each STR locus are described along with a review of STR analysis technologies. Additionally, commercially available STR multiplex kits are described, published polymerase chain reaction (PCR) primer sequences are reported, and validation studies conducted by a number of forensic laboratories are listed. To supplement the technical information, addresses for scientists and hyperlinks to organizations working in this area are available, along with a comprehensive reference list of over 1300 publications on STRs used for DNA typing purposes.

Relevance:

30.00%

Publisher:

Abstract:

The IMGT/HLA Database (www.ebi.ac.uk/imgt/hla/) specialises in sequences of polymorphic genes of the HLA system, the human major histocompatibility complex (MHC). The HLA complex is located within the 6p21.3 region on the short arm of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and these include the 21 highly polymorphic HLA genes, which influence the outcome of clinical transplantation and confer susceptibility to a wide range of non-infectious diseases. The database contains sequences for all HLA alleles officially recognised by the WHO Nomenclature Committee for Factors of the HLA System and provides users with online tools and facilities for their retrieval and analysis. These include allele reports, alignment tools and detailed descriptions of the source cells. The online IMGT/HLA submission tool allows both new and confirmatory sequences to be submitted directly to the WHO Nomenclature Committee. The latest version (release 1.7.0 July 2000) contains 1220 HLA alleles derived from over 2700 component sequences from the EMBL/GenBank/DDBJ databases. The HLA database provides a model which will be extended to provide specialist databases for polymorphic MHC genes of other species.

Relevance:

30.00%

Publisher:

Abstract:

There is a need for faster and more sensitive algorithms for sequence similarity searching in view of the rapidly increasing amounts of genomic sequence data available. Parallel processing capabilities in the form of the single instruction, multiple data (SIMD) technology are now available in common microprocessors and enable a single microprocessor to perform many operations in parallel. The ParAlign algorithm has been specifically designed to take advantage of this technology. The new algorithm initially exploits parallelism to perform a very rapid computation of the exact optimal ungapped alignment score for all diagonals in the alignment matrix. Then, a novel heuristic is employed to compute an approximate score of a gapped alignment by combining the scores of several diagonals. This approximate score is used to select the most interesting database sequences for a subsequent Smith–Waterman alignment, which is also parallelised. The resulting method represents a substantial improvement compared to existing heuristics. The sensitivity and specificity of ParAlign were found to be as good as those of Smith–Waterman implementations when the same method for computing the statistical significance of the matches was used. In terms of speed, only the significantly less sensitive NCBI BLAST 2 program was found to outperform the new approach. Online searches are available at http://dna.uio.no/search/
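The first stage described in this abstract, an exact optimal ungapped alignment score for every diagonal of the alignment matrix, can be illustrated with a small scalar sketch. This is not the ParAlign implementation: it uses no SIMD, and the function name `ungapped_diagonal_scores` and the simple match/mismatch scoring (instead of a substitution matrix) are assumptions made only for illustration.

```python
def ungapped_diagonal_scores(query, target, match=2, mismatch=-1):
    """Best ungapped local alignment score on each diagonal.

    A diagonal is identified by d = j - i, where i indexes `query`
    and j indexes `target`. The match/mismatch values are illustrative
    assumptions, not the scoring used by ParAlign.
    """
    m, n = len(query), len(target)
    scores = {}
    for diag in range(-(m - 1), n):
        i = max(0, -diag)      # first query position on this diagonal
        j = i + diag           # corresponding target position
        best = running = 0
        while i < m and j < n:
            step = match if query[i] == target[j] else mismatch
            # Kadane-style scan: restart the segment when the score drops below 0.
            running = max(0, running + step)
            best = max(best, running)
            i += 1
            j += 1
        scores[diag] = best
    return scores


if __name__ == "__main__":
    per_diagonal = ungapped_diagonal_scores("ACGT", "TTACGTTT")
    top = max(per_diagonal, key=per_diagonal.get)
    print("best diagonal:", top, "score:", per_diagonal[top])
```

Each diagonal is independent of the others, which is what makes this stage a natural fit for data-parallel hardware; the heuristic that combines several diagonal scores into an approximate gapped score is not sketched here because the abstract does not specify it.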

Relevance:

30.00%

Publisher:

Abstract:

We describe a technique, sequence-tagged microsatellite profiling (STMP), to rapidly generate large numbers of simple sequence repeat (SSR) markers from genomic DNA or cDNA. This technique eliminates the need for library screening to identify SSR-containing clones and provides an ∼25-fold increase in sequencing throughput compared to traditional methods. STMP generates short but characteristic nucleotide sequence tags for fragments that are present within a pool of SSR amplicons. These tags are then ligated together to form concatemers for cloning and sequencing. The analysis of thousands of tags gives rise to a representational profile of the abundance and frequency of SSRs within the DNA pool, from which low copy sequences can be identified. As each tag contains sufficient nucleotide sequence for primer design, their conversion into PCR primers allows the amplification of corresponding full-length fragments from the pool of SSR amplicons. These fragments permit the full characterisation of an SSR locus and provide flanking sequence for the development of a microsatellite marker. Alternatively, sequence tag primers can be used to directly amplify corresponding SSR loci from genomic DNA, thereby reducing the cost of developing a microsatellite marker to the synthesis of just one sequence-specific primer. We demonstrate the utility of STMP by the development of SSR markers in bread wheat.

Relevance:

30.00%

Publisher:

Abstract:

There are at least three short-range gap repressors in the precellular Drosophila embryo: Krüppel, Knirps, and Giant. Krüppel and Knirps contain related repression motifs, PxDLSxH and PxDLSxK, respectively, which mediate interactions with the dCtBP corepressor protein. Here, we present evidence that Giant might also interact with dCtBP. The misexpression of Giant in ventral regions of transgenic embryos results in the selective repression of eve stripe 5. A stripe5-lacZ transgene exhibits an abnormal staining pattern in dCtBP mutants that is consistent with attenuated repression by Giant. The analysis of Gal4-Giant fusion proteins identified a minimal repression domain that contains a sequence motif, VLDLS, which is conserved in at least two other sequence-specific repressors. Removal of this sequence from the native Giant protein does not impair its repression activity in transgenic embryos. We propose that Giant-dCtBP interactions might be indirect and mediated by an unknown bZIP subunit that forms a heteromeric complex with Giant. We also suggest that the VLDLS motif recruits an as yet unidentified corepressor protein.

Relevance:

30.00%

Publisher:

Abstract:

Gene recognition is one of the most important problems in computational molecular biology. Previous attempts to solve this problem were based on statistics, and applications of combinatorial methods for gene recognition were almost unexplored. Recent advances in large-scale cDNA sequencing open a way toward a new approach to gene recognition that uses previously sequenced genes as a clue for recognition of newly sequenced genes. This paper describes a spliced alignment algorithm and software tool that explores all possible exon assemblies in polynomial time and finds the multiexon structure with the best fit to a related protein. Unlike other existing methods, the algorithm successfully recognizes genes even in the case of short exons or exons with unusual codon usage; we also report correct assemblies for genes with more than 10 exons. On a test sample of human genes with known mammalian relatives, the average correlation between the predicted and actual proteins was 99%. The algorithm correctly reconstructed 87% of genes and the rare discrepancies between the predicted and real exon-intron structures were caused either by short (less than 5 amino acids) initial/terminal exons or by alternative splicing. Moreover, the algorithm predicts human genes reasonably well when the homologous protein is nonvertebrate or even prokaryotic. The surprisingly good performance of the method was confirmed by extensive simulations: in particular, with target proteins at 160 accepted point mutations (PAM) (25% similarity), the correlation between the predicted and actual genes was still as high as 95%.

Relevance:

30.00%

Publisher:

Abstract:

Local protein structure prediction efforts have consistently failed to exceed approximately 70% accuracy. We characterize the degeneracy of the mapping from local sequence to local structure responsible for this failure by investigating the extent to which similar sequence segments found in different proteins adopt similar three-dimensional structures. Sequence segments 3-15 residues in length from 154 different protein families are partitioned into neighborhoods containing segments with similar sequences using cluster analysis. The consistency of the sequence-to-structure mapping is assessed by comparing the local structures adopted by sequence segments in the same neighborhood in proteins of known structure. In the 154 families, 45% and 28% of the positions occur in neighborhoods in which one and two local structures predominate, respectively. The sequence patterns that characterize the neighborhoods in the first class probably include virtually all of the short sequence motifs in proteins that consistently occur in a particular local structure. These patterns, many of which occur in transitions between secondary structural elements, are an interesting combination of previously studied and novel motifs. The identification of sequence patterns that consistently occur in one or a small number of local structures in proteins should contribute to the prediction of protein structure from sequence.

Relevance:

30.00%

Publisher:

Abstract:

A strategy of "sequence scanning" is proposed for rapid acquisition of sequence from clones such as bacteriophage P1 clones, cosmids, or yeast artificial chromosomes. The approach makes use of a special vector, called LambdaScan, that reliably yields subclones with inserts in the size range 8-12 kb. A number of subclones, typically 96 or 192, are chosen at random, and the ends of the inserts are sequenced using vector-specific primers. Then long-range spectrum PCR is used to order and orient the clones. This combination of shotgun and directed sequencing results in a high-resolution physical map suitable for the identification of coding regions or for comparison of sequence organization among genomes. Computer simulations indicate that, for a target clone of 100 kb, the scanning of 192 subclones with sequencing reads as short as 350 bp results in an approximate ratio of 1:2:1 of regions of double-stranded sequence, single-stranded sequence, and gaps. Longer sequencing reads tip the ratio strongly toward increased double-stranded sequence.
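The kind of computer simulation mentioned in this abstract can be approximated with a short Monte Carlo sketch. It is a rough model under stated assumptions, not the authors' simulation: the function name `simulate_scan` is hypothetical, insert sizes are drawn uniformly from 8-12 kb, subclone positions are uniform, the target is treated as circular to avoid edge effects, and each subclone contributes one read from its left end on one strand and one read from its right end on the other strand.

```python
import random


def simulate_scan(target_len=100_000, n_subclones=192, read_len=350,
                  insert_range=(8_000, 12_000), seed=0):
    """Estimate fractions of double-stranded, single-stranded, and gap sequence.

    Assumptions (not from the source): uniform insert sizes and positions,
    a circular target, and exactly one end read per strand per subclone.
    """
    rng = random.Random(seed)
    fwd = [False] * target_len   # positions read on one strand
    rev = [False] * target_len   # positions read on the other strand
    for _ in range(n_subclones):
        insert_len = rng.randint(*insert_range)
        start = rng.randrange(target_len)
        end = start + insert_len
        for k in range(read_len):
            fwd[(start + k) % target_len] = True     # read from the left end
            rev[(end - 1 - k) % target_len] = True   # read from the right end
    both = sum(f and r for f, r in zip(fwd, rev))
    single = sum(f != r for f, r in zip(fwd, rev))
    gap = target_len - both - single
    return {
        "double": both / target_len,
        "single": single / target_len,
        "gap": gap / target_len,
    }


if __name__ == "__main__":
    for name, fraction in simulate_scan().items():
        print(f"{name}: {fraction:.3f}")
```

Increasing `read_len` in this sketch raises per-strand coverage, which is the mechanism behind the abstract's remark that longer reads shift the ratio toward double-stranded sequence.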

Relevance:

30.00%

Publisher:

Abstract:

Based on our previous results in transgenic mice, which strongly suggested that separate cell-specific cis-acting elements of the mouse pro-alpha 1(I) collagen promoter control the activity of the gene in different type I collagen-producing cells, we attempted to delineate a short segment in this promoter that could direct high-level expression selectively in osteoblasts. By generating transgenic mice harboring various fragments of the promoter, we identified a 117-bp segment (-1656 to -1540) that is a minimal sequence able to confer high-level expression of a lacZ reporter gene selectively in osteoblasts when cloned upstream of the proximal 220-bp pro-alpha 1(I) promoter. This 220-bp promoter by itself was inactive in transgenic mice and unable to direct osteoblast-specific expression. The 117-bp enhancer segment contained two sequences that appeared to have different functions. The A sequence (-1656 to -1628) was required to obtain expression of the lacZ gene in osteoblasts, whereas the C sequence (-1575 to -1540) was essential to obtain consistent and high-level expression of the lacZ gene in osteoblasts. Gel shift assays showed that the A sequence bound a nuclear protein present only in osteoblastic cells. A mutation in the A segment that abolished the binding of this osteoblast-specific protein also abolished lacZ expression in osteoblasts of transgenic mice.

Relevance:

30.00%

Publisher:

Abstract:

We have identified an amino acid sequence in the Drosophila Transformer (Tra) protein that is capable of directing a heterologous protein to nuclear speckles, regions of the nucleus previously shown to contain high concentrations of spliceosomal small nuclear RNAs and splicing factors. This sequence contains a nucleoplasmin-like bipartite nuclear localization signal (NLS) and a repeating arginine/serine (RS) dipeptide sequence adjacent to a short stretch of basic amino acids. Sequence comparisons from a number of other splicing factors that colocalize to nuclear speckles reveal the presence of one or more copies of this motif. We propose a two-step subnuclear localization mechanism for splicing factors. The first step is transport across the nuclear envelope via the nucleoplasmin-like NLS, while the second step is association with components in the speckled domain via the RS dipeptide sequence.

Relevance:

30.00%

Publisher:

Abstract:

A short interspersed nuclear element, Mg-SINE, was isolated and characterized from the genome of the rice blast fungus, Magnaporthe grisea. Mg-SINE was isolated as an insertion element within Pot2, an inverted-repeat transposon from M. grisea and shows typical features of a mammalian SINE. Mg-SINE is present as a 0.47-kb interspersed sequence at approximately 100 copies per haploid genome in both rice and non-rice isolates of M. grisea, indicating a common evolutionary origin. Secondary structure analysis of Mg-SINE revealed a tRNA-related region at the 5' end which folds into a cloverleaf structure. Genomic fusions resulting in chimeric Mg-SINEs (Ch-SINEs) composed of a sequence homologous to Mg-SINE at the 3' end and an unrelated sequence at its 5' end were also isolated, indicating that this and other DNA rearrangements mediated by these elements may have a major effect on the genomic architecture of this fungus.

Relevance:

30.00%

Publisher:

Abstract:

Natural genes and proteins often contain tandemly repeated sequence motifs that dramatically increase physiological specificity and activity. Given the selective value of such repeats, it is likely that several different mechanisms have been responsible for their generation. One mechanism that has been shown to generate relatively long tandem repeats (in the kilobase range) is rolling circle replication. In this communication, we demonstrate that rolling circle synthesis in a simple enzymatic system can produce tandem repeats of monomers as short as 34 bp. In addition to suggesting possible origins for natural tandem repeats, these observations provide a facile means for constructing libraries of repeated motifs for use in "in vitro evolution" experiments designed to select molecules with defined biological or chemical properties.

Relevance:

30.00%

Publisher:

Abstract:

Thesis (Master's)--University of Washington, 2016-06

Relevance:

30.00%

Publisher:

Abstract:

This article is a short introduction to and review of the cluster-state model of quantum computation, in which coherent quantum information processing is accomplished via a sequence of single-qubit measurements applied to a fixed quantum state known as a cluster state. We also discuss a few novel properties of the model, including a proof that the cluster state cannot occur as the exact ground state of any naturally occurring physical system, and a proof that measurements on any quantum state which is linearly prepared in one dimension can be efficiently simulated on a classical computer, and thus are not candidates for use as a substrate for quantum computation.

Relevance:

30.00%

Publisher:

Abstract:

We use molecular dynamics simulations to compare the conformational structure and dynamics of a 21-base pair RNA sequence initially constructed according to the canonical A-RNA and A'-RNA forms in the presence of counterions and explicit water. Our study aims to add a dynamical perspective to the solid-state structural information that has been derived from X-ray data for these two characteristic forms of RNA. Analysis of the three main structural descriptors commonly used to differentiate between the two forms of RNA (major groove width, inclination, and the number of base pairs in a helical twist) over a 30 ns simulation period reveals a flexible structure in aqueous solution, with fluctuations in the values of these structural parameters encompassing the range between the two crystal forms and more. This suggests that the identification of distinct A-RNA and A'-RNA structures, while relevant in the crystalline form, may not be generally relevant in the context of RNA in the aqueous phase. The apparent structural flexibility observed in our simulations is likely to bear ramifications for the interactions of RNA with biological molecules (e.g. proteins) and non-biological molecules (e.g. non-viral gene delivery vectors). © CSIRO 2009.