102 results for Genomic sequence database


Relevance:

40.00%

Publisher:

Abstract:

A database (SpliceDB) of known mammalian splice site sequences has been developed. We extracted 43 337 splice pairs from mammalian divisions of the gene-centered Infogene database, including sites from incomplete or alternatively spliced genes. Known EST sequences supported 22 815 of them. After discarding sequences with putative errors and ambiguous location of splice junctions the verified dataset includes 22 489 entries. Of these, 98.71% contain canonical GT–AG junctions (22 199 entries) and 0.56% have non-canonical GC–AG splice site pairs. The remainder (0.73%) occurs in a lot of small groups (with a maximum size of 0.05%). We especially studied non-canonical splice sites, which comprise 3.73% of GenBank annotated splice pairs. EST alignments allowed us to verify only the exonic part of splice sites. To check the conservative dinucleotides we compared sequences of human non-canonical splice sites with sequences from the high throughput genome sequencing project (HTG). Out of 171 human non-canonical and EST-supported splice pairs, 156 (91.23%) had a clear match in the human HTG. They can be classified after sequence analysis as: 79 GC–AG pairs (of which one was an error that corrected to GC–AG), 61 errors corrected to GT–AG canonical pairs, six AT–AC pairs (of which two were errors corrected to AT–AC), one case was produced from a non-existent intron, seven cases were found in HTG that were deposited to GenBank and finally there were only two other cases left of supported non-canonical splice pairs. The information about verified splice site sequences for canonical and non-canonical sites is presented in SpliceDB with the supporting evidence. We also built weight matrices for the major splice groups, which can be incorporated into gene prediction programs. SpliceDB is available at the computational genomic Web server of the Sanger Centre: http://genomic.sanger.ac.uk/spldb/SpliceDB.html and at http://www.softberry.com/spldb/SpliceDB.html.
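The classification by terminal dinucleotides described in this abstract can be illustrated with a minimal Python sketch. It is not the SpliceDB pipeline: the function names and the toy intron strings are hypothetical, and the position frequency matrix is only a simple stand-in for the weight matrices the abstract mentions.

```python
from collections import Counter


def classify_splice_pair(intron: str) -> str:
    """Return the donor-acceptor dinucleotide pair of an intron, e.g. 'GT-AG'."""
    intron = intron.upper()
    if len(intron) < 4:
        raise ValueError("intron too short to hold both dinucleotides")
    return f"{intron[:2]}-{intron[-2:]}"


def summarize(introns):
    """Map each pair to (count, percentage of all introns, rounded to 2 places)."""
    counts = Counter(classify_splice_pair(s) for s in introns)
    total = sum(counts.values())
    return {pair: (n, round(100.0 * n / total, 2)) for pair, n in counts.items()}


def position_frequency_matrix(sites):
    """Per-position base frequencies for aligned sites of equal length."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must be aligned to equal length")
    matrix = []
    for i in range(length):
        column = Counter(s[i].upper() for s in sites)
        matrix.append({base: column.get(base, 0) / len(sites) for base in "ACGT"})
    return matrix


# Hypothetical toy introns (assumption), not SpliceDB records.
toy_introns = [
    "GTAAGTCCCTTTCAG",
    "GTGAGTAAATTTTAG",
    "GCAAGTCCTTTGCAG",
    "ATATCCTTTTTCAC",
]
summary = summarize(toy_introns)
```

The same percentage calculation reproduces the figures quoted in the abstract: 22 199 of 22 489 entries is 98.71%, and 156 of 171 pairs is 91.23%.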

Relevance:

40.00%

Publisher:

Abstract:

This report describes an efficient strategy for determining the functions of sequenced genes in microorganisms. A large population of cells is subjected to insertional mutagenesis. The mutagenized population is then divided into representative samples, each of which is subjected to a different selection. DNA is prepared from each sample population after the selection. The polymerase chain reaction is then used to determine retrospectively whether insertions into a particular sequence affected the outcome of any selection. The method is efficient because the insertional mutagenesis and each selection need only to be performed once to enable the functions of thousands of genes to be investigated, rather than once for each gene. We tested this "genetic footprinting" strategy using the model organism Saccharomyces cerevisiae.

Relevance:

30.00%

Publisher:

Abstract:

Whole-genome duplication approximately 10^8 years ago was proposed as an explanation for the many duplicated chromosomal regions in Saccharomyces cerevisiae. Here we have used computer simulations and analytic methods to estimate some parameters describing the evolution of the yeast genome after this duplication event. Computer simulation of a model in which 8% of the original genes were retained in duplicate after genome duplication, and 70–100 reciprocal translocations occurred between chromosomes, produced arrangements of duplicated chromosomal regions very similar to the map of real duplications in yeast. An analytical method produced an independent estimate of 84 map disruptions. These results imply that many smaller duplicated chromosomal regions exist in the yeast genome in addition to the 55 originally reported. We also examined the possibility of determining the original order of chromosomal blocks in the ancestral unduplicated genome, but this cannot be done without information from one or more additional species. If the genome sequence of one other species (such as Kluyveromyces lactis) were known it should be possible to identify 150–200 paired regions covering the whole yeast genome and to reconstruct approximately two-thirds of the original order of blocks of genes in yeast. Rates of interchromosome translocation in yeast and mammals appear similar despite their very different rates of homologous recombination per kilobase.
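The kind of simulation this abstract describes can be sketched in Python as a toy model. This is not the authors' model: the single-chromosome ancestor, the gene count of 1000, the even chance of losing either copy, and all function names are assumptions made for illustration. Only the 8% retention figure and the range of 70 to 100 translocations come from the abstract.

```python
import random


def duplicate_and_lose(n_genes, retain_fraction, rng):
    """Duplicate a one-chromosome ancestor, then delete one copy of most genes.

    Returns the two daughter chromosomes (lists of gene ids, in ancestral
    order) and the set of genes retained in duplicate.
    """
    n_kept = round(n_genes * retain_fraction)
    kept = set(rng.sample(range(n_genes), n_kept))
    copy_a, copy_b = [], []
    for gene in range(n_genes):
        if gene in kept:
            copy_a.append(gene)
            copy_b.append(gene)
        elif rng.random() < 0.5:  # assumption: either copy is equally likely to survive
            copy_a.append(gene)
        else:
            copy_b.append(gene)
    return copy_a, copy_b, kept


def reciprocal_translocation(chromosomes, rng):
    """Swap the tails of two randomly chosen chromosomes, in place."""
    i, j = rng.sample(range(len(chromosomes)), 2)
    a, b = chromosomes[i], chromosomes[j]
    cut_a = rng.randint(0, len(a))
    cut_b = rng.randint(0, len(b))
    chromosomes[i] = a[:cut_a] + b[cut_b:]
    chromosomes[j] = b[:cut_b] + a[cut_a:]


rng = random.Random(0)  # fixed seed so the run is repeatable
copy_a, copy_b, kept = duplicate_and_lose(1000, 0.08, rng)
genome = [copy_a, copy_b]
for _ in range(80):  # a value inside the 70-100 range quoted in the abstract
    reciprocal_translocation(genome, rng)
```

After the translocations, the duplicated genes in `kept` are scattered across both chromosomes, which is the kind of arrangement one would then compare with the observed map of duplicated regions.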

Relevance:

30.00%

Publisher:

Abstract:

Tangier disease is characterized by low serum high density lipoproteins and a biochemical defect in the cellular efflux of lipids to high density lipoproteins. ABC1, a member of the ATP-binding cassette family, recently has been identified as the defective gene in Tangier disease. We report here the organization of the human ABC1 gene and the identification of a mutation in the ABC1 gene from the original Tangier disease kindred. The organization of the human ABC1 gene is similar to that of the mouse ABC1 gene and other related ABC genes. The ABC1 gene contains 49 exons that range in size from 33 to 249 bp and is over 70 kb in length. Sequence analysis of the ABC1 gene revealed that the proband for Tangier disease was homozygous for a deletion of nucleotides 3283 and 3284 (TC) in exon 22. The deletion results in a frameshift mutation and a premature stop codon starting at nucleotide 3375. The product is predicted to encode a nonfunctional protein of 1,084 aa, which is approximately half the size of the full-length ABC1 protein. The loss of a Mnl1 restriction site, which results from the deletion, was used to establish the genotype of the rest of the kindred. In summary, we report on the genomic organization of the human ABC1 gene and identify a frameshift mutation in the ABC1 gene of the index case of Tangier disease. These results will be useful in the future characterization of the structure and function of the ABC1 gene and the analysis of additional ABC1 mutations in patients with Tangier disease.

Relevance:

30.00%

Publisher:

Abstract:

Panhandle PCR amplifies genomic DNA with known 5′ and unknown 3′ sequences from a template with an intrastrand loop schematically shaped like a pan with a handle. We used panhandle PCR to clone MLL genomic breakpoints in two pediatric treatment-related leukemias. The karyotype in a case of treatment-related acute lymphoblastic leukemia showed the t(4;11)(q21;q23). Panhandle PCR amplified the translocation breakpoint at position 2158 in intron 6 in the 5′ MLL breakpoint cluster region (bcr). The karyotype in a case of treatment-related acute myeloid leukemia was normal, but Southern blot analysis showed a single MLL gene rearrangement. Panhandle PCR amplified the breakpoint at position 1493 in MLL intron 6. Screening of somatic cell hybrid and radiation hybrid DNAs by PCR and reverse transcriptase-PCR analysis of the leukemic cells indicated that panhandle PCR identified a fusion of MLL intron 6 with a previously uncharacterized sequence in MLL intron 1, consistent with a partial duplication. In both cases, the breakpoints in the MLL bcr were in Alu repeats, and there were Alu repeats in proximity to the breakpoints in the partner DNAs, suggesting that Alu sequences were relevant to these rearrangements. This study shows that panhandle PCR is an effective method for cloning MLL genomic breakpoints in treatment-related leukemias. Analysis of additional pediatric cases will determine whether breakpoint distribution deviates from the predilection for 3′ distribution in the bcr that has been found in adult cases.

Relevance:

30.00%

Publisher:

Abstract:

A rapidly growing area of genome research is the generation of expressed sequence tags (ESTs) in which large numbers of randomly selected cDNA clones are partially sequenced. The collection of ESTs reflects the level and complexity of gene expression in the sampled tissue. To date, the majority of plant ESTs are from nonwoody plants such as Arabidopsis, Brassica, maize, and rice. Here, we present a large-scale production of ESTs from the wood-forming tissues of two poplars, Populus tremula L. × tremuloides Michx. and Populus trichocarpa ‘Trichobel.’ The 5,692 ESTs analyzed represented a total of 3,719 unique transcripts for the two cDNA libraries. Putative functions could be assigned to 2,245 of these transcripts that corresponded to 820 protein functions. Of specific interest to forest biotechnology are the 4% of ESTs involved in various processes of cell wall formation, such as lignin and cellulose synthesis, 5% similar to developmental regulators and members of known signal transduction pathways, and 2% involved in hormone biosynthesis. An additional 12% of the ESTs showed no significant similarity to any other DNA or protein sequences in existing databases. The absence of these sequences from public databases may indicate a specific role for these proteins in wood formation. The cDNA libraries and the accompanying database are valuable resources for forest research directed toward understanding the genetic control of wood formation and future endeavors to modify wood and fiber properties for industrial use.

Relevance:

30.00%

Publisher:

Abstract:

Fanconi anemia (FA) is a genetically heterogeneous autosomal recessive syndrome associated with chromosomal instability, hypersensitivity to DNA crosslinking agents, and predisposition to malignancy. The gene for FA complementation group A (FAA) recently has been cloned. The cDNA is predicted to encode a polypeptide of 1,455 amino acids, with no homologies to any known protein that might suggest a function for FAA. We have used single-strand conformational polymorphism analysis to screen genomic DNA from a panel of 97 racially and ethnically diverse FA patients from the International Fanconi Anemia Registry for mutations in the FAA gene. A total of 85 variant bands were detected. Forty-five of the variants are probably benign polymorphisms, of which nine are common and can be used for various applications, including mapping studies for other genes in this region of chromosome 16q. Amplification refractory mutation system assays were developed to simplify their detection. Forty variants are likely to be pathogenic mutations. Seventeen of these are microdeletions/microinsertions associated with short direct repeats or homonucleotide tracts, a type of mutation thought to be generated by a mechanism of slipped-strand mispairing during DNA replication. A screening of 350 FA probands from the International Fanconi Anemia Registry for two of these deletions (1115–1118del and 3788–3790del) revealed that they are carried on about 2% and 5% of the FA alleles, respectively. 3788–3790del appears in a variety of ethnic groups and is found on at least two different haplotypes. We suggest that FAA is hypermutable, and that slipped-strand mispairing, a mutational mechanism recognized as important for the generation of germ-line and somatic mutations in a variety of cancer-related genes, including p53, APC, RB1, WT1, and BRCA1, may be a major mechanism for FAA mutagenesis.

Relevance:

30.00%

Publisher:

Abstract:

Prophenoloxidase, a melanin-synthesizing enzyme, is considered to be an important arthropod immune protein. In mosquitoes, prophenoloxidase has been shown to be involved in refractory mechanisms against malaria parasites. In our study we used Anopheles gambiae, the most important human malaria vector, to characterize the first arthropod prophenoloxidase gene at the genomic level. The complete nucleotide sequence, including the immediate 5′ flanking sequence (−855 bp) of the prophenoloxidase 1 gene, was determined. The gene spans 10 kb and is composed of five exons and four introns coding for a 2.5-kb mRNA. In the 5′ flanking sequence, we found several putative regulatory motifs, two of which were identified as ecdysteroid regulatory elements. Electrophoretic mobility gel-shift assays and supershift assays demonstrated that the Aedes aegypti ecdysone receptor/Ultraspiracle nuclear receptor complex, and, seemingly, the endogenous Anopheles gambiae nuclear receptor complex, was able to bind one of the ecdysteroid response elements. Furthermore, 20-hydroxyecdysone stimulation was shown to up-regulate the transcription of the prophenoloxidase 1 gene in an A. gambiae cell line.

Relevance:

30.00%

Publisher:

Abstract:

Cancer cell genomes contain alterations beyond known etiologic events, but their total number has been unknown at even the order of magnitude level. By sampling colorectal premalignant polyp and carcinoma cell genomes through use of the technique inter-(simple sequence repeat) PCR, we have found genomic alterations to be considerably more abundant than expected, with the mean number of genomic events per carcinoma cell totaling approximately 11,000. Colonic polyps early in the tumor progression pathway showed similar numbers of events. These results indicate that, as with certain hereditary cancer syndromes, genomic destabilization is an early step in sporadic tumor development. Together these results support the model of genomic instability being a cause rather than an effect of malignancy, facilitating vastly accelerated somatic cell evolution, with the observed orderly steps of the colon cancer progression pathway reflecting the consequences of natural selection.

Relevance:

30.00%

Publisher:

Abstract:

The genome of the Kaposi sarcoma-associated herpesvirus (KSHV or HHV8) was mapped with cosmid and phage genomic libraries from the BC-1 cell line. Its nucleotide sequence was determined except for a 3-kb region at the right end of the genome that was refractory to cloning. The BC-1 KSHV genome consists of a 140.5-kb-long unique coding region flanked by multiple G+C-rich 801-bp terminal repeat sequences. A genomic duplication that apparently arose in the parental tumor is present in this cell culture-derived strain. At least 81 ORFs, including 66 with homology to herpesvirus saimiri ORFs, and 5 internal repeat regions are present in the long unique region. The virus encodes homologs to complement-binding proteins, three cytokines (two macrophage inflammatory proteins and interleukin 6), dihydrofolate reductase, bcl-2, interferon regulatory factors, interleukin 8 receptor, neural cell adhesion molecule-like adhesin, and a D-type cyclin, as well as viral structural and metabolic proteins. Terminal repeat analysis of virus DNA from a KS lesion suggests a monoclonal expansion of KSHV in the KS tumor.

Relevance:

30.00%

Publisher:

Abstract:

The structures of the genes encoding the α1 and β1 subunits of murine soluble guanylyl cyclase (sGC) were determined. Full-length cDNAs isolated from mouse lungs encoding the α1 (2.5 kb) and β1 (3.3 kb) subunits are presented in this report. The α1 sGC gene is approximately 26.4 kb and contains nine exons, whereas the β1 sGC gene spans 22 kb and consists of 14 exons. The positions of exon/intron boundaries and the sizes of introns for both genes are described. Comparison of mouse genomic organization with the Human Genome Database predicted the exon/intron boundaries of the human genes and revealed that human and mouse α1 and β1 sGC genes have similar structures. Both mouse genes are localized on the third chromosome, band 3E3-F1, and are separated by a fragment that is 2% of the chromosomal length. The 5′ untranscribed regions of α1 and β1 subunit genes were subcloned into luciferase reporter constructs, and the functional analysis of promoter activity was performed in murine neuroblastoma N1E-115 cells. Our results indicate that the 5′ untranscribed regions for both genes possess independent promoter activities and, together with the data on chromosomal localization, suggest independent regulation of both genes.

Relevance:

30.00%

Publisher:

Abstract:

Pairwise sequence comparison methods have been assessed using proteins whose relationships are known reliably from their structures and functions, as described in the scop database [Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia C. (1995) J. Mol. Biol. 247, 536–540]. The evaluation tested the programs blast [Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990). J. Mol. Biol. 215, 403–410], wu-blast2 [Altschul, S. F. & Gish, W. (1996) Methods Enzymol. 266, 460–480], fasta [Pearson, W. R. & Lipman, D. J. (1988) Proc. Natl. Acad. Sci. USA 85, 2444–2448], and ssearch [Smith, T. F. & Waterman, M. S. (1981) J. Mol. Biol. 147, 195–197] and their scoring schemes. The error rate of all algorithms is greatly reduced by using statistical scores to evaluate matches rather than percentage identity or raw scores. The E-value statistical scores of ssearch and fasta are reliable: the number of false positives found in our tests agrees well with the scores reported. However, the P-values reported by blast and wu-blast2 exaggerate significance by orders of magnitude. ssearch, fasta ktup = 1, and wu-blast2 perform best, and they are capable of detecting almost all relationships between proteins whose sequence identities are >30%. For more distantly related proteins, they do much less well; only one-half of the relationships between proteins with 20–30% identity are found. Because many homologs have low sequence similarity, most distant relationships cannot be detected by any pairwise comparison method; however, those which are identified may be used with confidence.
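To make the identity ranges in this abstract concrete, here is a small Python sketch that computes percentage identity for an already aligned pair and assigns it to one of the ranges the abstract discusses. It does not reproduce any of the evaluated programs. The function names are hypothetical, and the choice to ignore gapped columns in the denominator is an assumption, since definitions of percentage identity vary.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are excluded from the denominator
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def identity_range(identity: float) -> str:
    """Label an identity value with the ranges used in the abstract."""
    if identity > 30.0:
        return ">30%"
    if identity >= 20.0:
        return "20-30%"
    return "<20%"


# Hypothetical toy alignment (assumption), not data from the study.
pid = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
label = identity_range(pid)
```

As the abstract points out, percentage identity alone is a weak criterion. Statistical scores such as E-values are the more reliable way to judge a match.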

Relevance:

30.00%

Publisher:

Abstract:

Here we present the successful application of the microarray technology platform to the analysis of DNA polymorphisms. Using the rice genome as a model, we demonstrate the potential of a high-throughput genome analysis method called Diversity Array Technology, DArT™. In the format presented here the technology is assaying for the presence (or amount) of a specific DNA fragment in a representation derived from the total genomic DNA of an organism or a population of organisms. Two different approaches are presented: the first involves contrasting two representations on a single array while the second involves contrasting a representation with a reference DNA fragment common to all elements of the array. The Diversity Panels created using this method allow genetic fingerprinting of any organism or group of organisms belonging to the gene pool from which the panel was developed. Diversity Arrays enable rapid and economical application of a highly parallel, solid-state genotyping technology to any genome or complex genomic mixtures.

Relevance:

30.00%

Publisher:

Abstract:

The transformation-associated recombination (TAR) cloning technique allows selective and accurate isolation of chromosomal regions and genes from complex genomes. The technique is based on in vivo recombination between genomic DNA and a linearized vector containing homologous sequences, or hooks, to the gene of interest. The recombination occurs during transformation of yeast spheroplasts that results in the generation of a yeast artificial chromosome (YAC) containing the gene of interest. To further enhance and refine the TAR cloning technology, we determined the minimal size of a specific hook required for gene isolation utilizing the Tg.AC mouse transgene as a targeted region. For this purpose a set of vectors containing a B1 repeat hook and a Tg.AC-specific hook of variable sizes (from 20 to 800 bp) was constructed and checked for efficiency of transgene isolation by a radial TAR cloning. When vectors with a specific hook that was ≥60 bp were utilized, ∼2% of transformants contained circular YACs with the Tg.AC transgene sequences. Efficiency of cloning dramatically decreased when the TAR vector contained a hook of 40 bp or less. Thus, the minimal length of a unique sequence required for gene isolation by TAR is ∼60 bp. No transgene-positive YAC clones were detected when an ARS element was incorporated into a vector, demonstrating that the absence of a yeast origin of replication in a vector is a prerequisite for efficient gene isolation by TAR cloning.

Relevance:

30.00%

Publisher:

Abstract:

A novel database, under the acronym RISSC (Ribosomal Intergenic Spacer Sequence Collection), has been created. It compiles more than 1600 entries of edited DNA sequence data from the 16S–23S ribosomal spacers present in most prokaryotes and organelles (e.g. mitochondria and chloroplasts) and is accessible through the Internet (http://ulises.umh.es/RISSC), where systematic searches for specific words can be conducted, as well as BLAST-type sequence searches. Additionally, a characteristic feature of this region, the presence/absence and nature of tRNA genes within the spacer, is included in all the entries, even when not previously indicated in the original database. All these combined features could provide a useful documentation tool for studies on evolution, identification, typing and strain characterization, among others.