20 results for Exome sequencing
in National Center for Biotechnology Information - NCBI
Abstract:
The pufferfish Fugu rubripes has a genome ≈7.5 times smaller than that of mammals but with a similar number of genes. Although conserved synteny has been demonstrated between pufferfish and mammals across some regions of the genome, there is some controversy as to what extent Fugu will be a useful model for the human genome, e.g., [Gilley, J., Armes, N. & Fried, M. (1997) Nature (London) 385, 305–306]. We report extensive conservation of synteny between a 1.5-Mb region of human chromosome 11 and <100 kb of the Fugu genome in three overlapping cosmids. Our findings support the idea that the majority of DNA in the region of human chromosome 11p13 is intergenic. Comparative analysis of three unrelated genes with quite different roles, WT1, RCN1, and PAX6, has revealed differences in their structural evolution. Whereas the human WT1 gene can generate 16 protein isoforms via a combination of alternative splicing, RNA editing, and alternative start site usage, our data predict that Fugu WT1 is capable of generating only two isoforms. This raises the question of the extent to which the evolution of WT1 isoforms is related to the evolution of the mammalian genitourinary system. In addition, this region of the Fugu genome shows a much greater overall compaction than usual but with significant noncoding homology observed at the PAX6 locus, implying that comparative genomics has identified regulatory elements associated with this gene.
Abstract:
A loxP-transposon retrofitting strategy for generating large nested deletions from one end of the insert DNA in bacterial artificial chromosomes and P1 artificial chromosomes was described recently [Chatterjee, P. K. & Coren, J. S. (1997) Nucleic Acids Res. 25, 2205–2212]. In this report, we combine this procedure with direct sequencing of nested-deletion templates by using primers located in the transposon end to illustrate its value for position-specific single-nucleotide polymorphism (SNP) discovery from chosen regions of large insert clones. A simple ampicillin sensitivity screen was developed to facilitate identification and recovery of deletion clones free of transduced transposon plasmid. This directed approach requires minimal DNA sequencing, and no in vitro subclone library generation; positionally oriented SNPs are a consequence of the method. The procedure is used to discover new SNPs as well as physically map those identified from random subcloned libraries or sequence databases. The deletion templates, positioned SNPs, and markers are also used to orient large insert clones into a contig. The deletion clone can serve as a ready resource for future functional genomic studies because each carries a mammalian cell-specific antibiotic resistance gene from the transposon. Furthermore, the technique should be especially applicable to the analysis of genomes for which a full genome sequence or radiation hybrid cell lines are unavailable.
Abstract:
Pax proteins are a family of transcription factors with a highly conserved paired domain; many members also contain a paired-type homeodomain and/or an octapeptide. Nine mammalian Pax genes are known and classified into four subgroups: Pax-1/9, Pax-2/5/8, Pax-3/7, and Pax-4/6. Most of these genes are involved in nervous system development. In particular, Pax-6 is a key regulator that controls eye development in vertebrates and Drosophila. Although the Pax-4/6 subgroup seems to be more closely related to Pax-2/5/8 than to Pax-3/7 or Pax-1/9, its evolutionary origin is unknown. We therefore searched for a Pax-6 homolog and related genes in Cnidaria, which is the lowest phylum of animals that possess a nervous system and eyes. A sea nettle (a jellyfish) genomic library was constructed and two pax genes (Pax-A and -B) were isolated and partially sequenced. Surprisingly, unlike most known Pax genes, the paired box in these two genes contains no intron. In addition, the complete cDNA sequences of hydra Pax-A and -B were obtained. Hydra Pax-B contains both the homeodomain and the octapeptide, whereas hydra Pax-A contains neither. DNA binding assays showed that sea nettle Pax-A and -B and hydra Pax-A paired domains bound to a Pax-5/6 site and a Pax-5 site, although hydra Pax-B paired domain bound neither. An alignment of all available paired domain sequences revealed two highly conserved regions, which cover the DNA binding contact positions. Phylogenetic analysis showed that Pax-A and especially Pax-B were more closely related to Pax-2/5/8 and Pax-4/6 than to Pax-1/9 or Pax-3/7 and that the Pax genes can be classified into two supergroups: Pax-A/Pax-B/Pax-2/5/8/4/6 and Pax-1/9/3/7. From this analysis and the gene structure, we propose that modern Pax-4/6 and Pax-2/5/8 genes evolved from an ancestral gene similar to cnidarian Pax-B, having both the homeodomain and the octapeptide.
Abstract:
Multiple-complete-digest mapping is a DNA mapping technique based on complete-restriction-digest fingerprints of a set of clones that provides highly redundant coverage of the mapping target. The maps assembled from these fingerprints order both the clones and the restriction fragments. Maps are coordinated across three enzymes in the examples presented. Starting with yeast artificial chromosome contigs from the 7q31.3 and 7p14 regions of the human genome, we have produced cosmid-based maps spanning more than one million base pairs. Each yeast artificial chromosome is first subcloned into cosmids at a redundancy of ×15–30. Complete-digest fragments are electrophoresed on agarose gels, poststained, and imaged on a fluorescent scanner. Aberrant clones that are not representative of the underlying genome are rejected in the map construction process. Almost every restriction fragment is ordered, allowing selection of minimal tiling paths with clone-to-clone overlaps of only a few thousand base pairs. These maps demonstrate the practicality of applying the experimental and software-based steps in multiple-complete-digest mapping to a target of significant size and complexity. We present evidence that the maps are sufficiently accurate to validate both the clones selected for sequencing and the sequence assemblies obtained once these clones have been sequenced by a “shotgun” method.
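The overlap evidence behind fingerprint-based mapping can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' software: the function name `shared_fragments`, the 1% sizing tolerance, the greedy matching, and the three clones with their fragment sizes are all hypothetical. It handles a single enzyme and does not order fragments, whereas the method described above coordinates maps across three enzymes.

```python
def shared_fragments(sizes_a, sizes_b, rel_tol=0.01):
    """Count fragments of clone A that match a not-yet-used fragment of
    clone B within a relative sizing tolerance (greedy, smallest first)."""
    remaining = sorted(sizes_b)
    shared = 0
    for size in sorted(sizes_a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= rel_tol * max(size, other):
                shared += 1
                del remaining[i]  # each fragment of B may be matched only once
                break
    return shared

# Hypothetical complete-digest fingerprints (fragment sizes in bp).
clone_1 = [5200, 3100, 1800, 950]
clone_2 = [3110, 1790, 950, 7400]
clone_3 = [7420, 2600, 640]

overlap_12 = shared_fragments(clone_1, clone_2)
overlap_23 = shared_fragments(clone_2, clone_3)
overlap_13 = shared_fragments(clone_1, clone_3)
print(overlap_12, overlap_23, overlap_13)
```

In this toy data set, clones 1 and 2 share several fragments and clones 2 and 3 share one, which would suggest the order 1, 2, 3. A real map would need far more redundancy than this before such an inference could be trusted.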
Abstract:
An mAb was raised to the C5 phagosomal antigen in Paramecium multimicronucleatum. To determine its function, the cDNA and genomic DNA encoding C5 were cloned. This antigen consisted of 315 amino acid residues with a predicted molecular weight of 36,594, a value similar to that determined by SDS-PAGE. Sequence comparisons uncovered a low but significant homology with a Schizosaccharomyces pombe protein and the C-terminal half of the β-fructofuranosidase protein of Zymomonas mobilis. Lacking an obvious transmembrane domain or a possible signal sequence at the N terminus, C5 was predicted to be a soluble protein, whereas immunofluorescence data showed that it was present on the membranes of vesicles and digestive vacuoles (DVs). In cells that were minimally permeabilized but with intact DVs, C5 was found to be located on the cytosolic surface of the DV membranes. Immunoblotting of proteins from the purified and KCl-washed DVs showed that C5 was tightly bound to the DV membranes. Cryoelectron microscopy also confirmed that C5 was on the cytosolic surface of the discoidal vesicles, acidosomes, and lysosomes, organelles known to fuse with the membranes of the cytopharynx, the DVs of stages I (DV-I) and II (DV-II), respectively. Although C5 was concentrated more on the mature than on the young DV membranes, the striking observation was that the cytopharyngeal membrane that is derived from the discoidal vesicles was almost devoid of C5. Approximately 80% of the C5 was lost from the discoidal vesicle-derived membrane after this membrane fused with the cytopharyngeal membrane. Microinjection of the mAb to C5 greatly inhibited the fusion of the discoidal vesicles with the cytopharyngeal membrane and thus the incorporation of the discoidal vesicle membranes into the DV membranes. Taken together, these results suggest that C5 is a membrane protein that is involved in binding and/or fusion of the discoidal vesicles with the cytopharyngeal membrane that leads to DV formation.
Abstract:
We report automated DNA sequencing in 16-channel microchips. A microchip prefilled with sieving matrix is aligned on a heating plate affixed to a movable platform. Samples are loaded into sample reservoirs by using an eight-tip pipetting device, and the chip is docked with an array of electrodes in the focal plane of a four-color scanning detection system. Under computer control, high voltage is applied to the appropriate reservoirs in a programmed sequence that injects and separates the DNA samples. An integrated four-color confocal fluorescent detector automatically scans all 16 channels. The system routinely yields more than 450 bases in 15 min in all 16 channels. In the best case using an automated base-calling program, 543 bases have been called at an accuracy of >99%. Separations, including automated chip loading and sample injection, normally are completed in less than 18 min. The advantages of DNA sequencing on capillary electrophoresis chips include uniform signal intensity and tolerance of high DNA template concentration. To understand the fundamentals of these unique features we developed a theoretical treatment of cross-channel chip injection that we call the differential concentration effect. We present experimental evidence consistent with the predictions of the theory.
Abstract:
A de novo sequencing program for proteins is described that uses tandem MS data from electron capture dissociation and collisionally activated dissociation of electrosprayed protein ions. Computer automation is used to convert the fragment ion mass values derived from these spectra into the most probable protein sequence, without distinguishing Leu/Ile. Minimum human input is necessary for the data reduction and interpretation. No extra chemistry is necessary to distinguish N- and C-terminal fragments in the mass spectra, as this is determined from the electron capture dissociation data. With parts-per-million mass accuracy (now available by using higher field Fourier transform MS instruments), the complete sequences of ubiquitin (8.6 kDa) and melittin (2.8 kDa) were predicted correctly by the program. The data available also provided 91% of the cytochrome c (12.4 kDa) sequence (essentially complete except for the tandem MS-resistant region K13–V20 that contains the cyclic heme). Uncorrected mass values from a 6-T instrument still gave 86% of the sequence for ubiquitin, except for distinguishing Gln/Lys. Extensive sequencing of larger proteins should be possible by applying the algorithm to pieces of ≈10-kDa size, such as products of limited proteolysis.
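As a rough illustration of the ladder-reading idea behind such a program, the Python sketch below maps the differences between consecutive fragment masses onto amino acid residues. It is not the authors' algorithm. The function name `read_ladder`, the absolute tolerance in Da (rather than ppm), and the example ladder starting at an arbitrary 1000.0 Da are assumptions made for the example. The residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses in Da; Leu and Ile are isobaric.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L/I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def read_ladder(fragment_masses, tol_da):
    """Return, for each gap between consecutive fragment masses, the list
    of residues whose mass lies within tol_da of that gap ('?' if none)."""
    masses = sorted(fragment_masses)
    calls = []
    for lighter, heavier in zip(masses, masses[1:]):
        delta = heavier - lighter
        hits = [aa for aa, m in RESIDUE_MASS.items() if abs(delta - m) <= tol_da]
        calls.append(hits or ["?"])
    return calls

# Hypothetical ladder: an arbitrary 1000.0 Da offset extended by G, Q, K, L.
ladder = [1000.0]
for residue in ("G", "Q", "K", "L/I"):
    ladder.append(ladder[-1] + RESIDUE_MASS[residue])

tight = read_ladder(ladder, tol_da=0.005)  # high mass accuracy
loose = read_ladder(ladder, tol_da=0.05)   # lower mass accuracy
print(tight)
print(loose)
```

With the tight tolerance, Gln and Lys (0.036 Da apart) are called separately. With the loose one, both gaps return both candidates, which is in line with the abstract's remark that uncorrected 6-T data could not distinguish Gln/Lys. Leu and Ile have identical masses and cannot be separated by this approach at any tolerance.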
Abstract:
Heparin- and heparan sulfate-like glycosaminoglycans (HLGAGs) represent an important class of molecules that interact with and modulate the activity of growth factors, enzymes, and morphogens. Of the many biological functions of this class of molecules, one of the most important is their interaction with antithrombin III (AT-III). AT-III binding to a specific heparin pentasaccharide sequence, containing an unusual 3-O sulfate on an N-sulfated, 6-O sulfated glucosamine, increases 1,000-fold AT-III's ability to inhibit specific proteases in the coagulation cascade. In this manner, HLGAGs play an important biological and pharmacological role in the modulation of blood clotting. Recently, a sequencing methodology was developed to further the study of structure-function relationships of this important class of molecules. This methodology combines a property-encoded nomenclature scheme to handle the large information content (properties) of HLGAGs, with matrix-assisted laser desorption ionization MS and enzymatic and chemical degradation as experimental constraints to rapidly sequence picomole quantities of HLGAG oligosaccharides. Using the above property-encoded nomenclature-matrix-assisted laser desorption ionization approach, we found that the sequence of the decasaccharide used in this study is ΔU2SHNS,6SI2SHNS,6SI2SHNS,6SIHNAc,6SGHNS,3S,6S (±DDD4–7). We confirmed our results by using integral glycan sequencing and one-dimensional proton NMR. Furthermore, we show that this approach is flexible and is able to derive sequence information on an oligosaccharide mixture. Thus, this methodology will both make possible the analysis of other unusual sequences in HLGAGs with important biological activity and provide the basis for the structural analysis of this pharmacologically important group of heparin/heparan sulfates.
Abstract:
The proliferation of various tumors is inhibited by the antagonists of growth hormone-releasing hormone (GHRH) in vitro and in vivo, but the receptors mediating the effects of GHRH antagonists have not been identified so far. Using an approach based on PCR, we detected two major splice variants (SVs) of mRNA for human GHRH receptor (GHRH-R) in human cancer cell lines, including LNCaP prostatic, MiaPaCa-2 pancreatic, MDA-MB-468 breast, OV-1063 ovarian, and H-69 small-cell lung carcinomas. In addition, high-affinity, low-capacity binding sites for GHRH antagonists were found on the membranes of cancer cell lines such as MiaPaCa-2 that are negative for the vasoactive intestinal peptide/pituitary adenylate cyclase-activating polypeptide receptor (VPAC-R) or lines such as LNCaP that are positive for VPAC-R. Sequence analysis of cDNAs revealed that the first three exons in SV1 and SV2 are replaced by a fragment of retained intron 3 having a new putative in-frame start codon. The rest of the coding region of SV1 is identical to that of human pituitary GHRH-R, whereas in SV2 exon 7 is spliced out, resulting in a 1-nt upstream frameshift, which leads to a premature stop codon in exon 8. The intronic sequence may encode a distinct 25-aa fragment of the N-terminal extracellular domain, which could serve as a proposed signal peptide. The continuation of the deduced protein sequence coded by exons 4–13 in SV1 is identical to that of pituitary GHRH-R. SV2 may encode a GHRH-R isoform truncated after the second transmembrane domain. Thus SVs of GHRH-Rs have now been identified in human extrapituitary cells. The findings support the view that distinct receptors are expressed on human cancer cells, which may mediate the antiproliferative effect of GHRH antagonists.
Abstract:
We describe a fluorescence-based directed termination PCR (fluorescent DT–PCR) that allows accurate determination of actual sequence changes without dideoxy DNA sequencing. This is achieved using near infrared dye-labeled primers and performing two PCR reactions under low and unbalanced dNTP concentrations. Visualization of resulting termination fragments is accomplished with a dual dye Li-cor DNA sequencer. As each DT–PCR reaction generates two sets of terminating fragments, a pair of complementary reactions with limiting dATP and dCTP collectively provide information on the entire sequence of a target DNA, allowing an accurate determination of any base change. Blind analysis of 78 mutants of the supF reporter gene using fluorescent DT–PCR not only correctly determined the nature and position of all types of substitution mutations in the supF gene, but also allowed rapid scanning of the signature sequences among identical mutations. The method provides simplicity in the generation of terminating fragments and 100% accuracy in mutation characterization. Fluorescent DT–PCR was successfully used to generate a UV-induced spectrum of mutations in the supF gene following replication on a single plate of human DNA repair-deficient cells. We anticipate that the automated DT–PCR method will serve as a cost-effective alternative to dideoxy sequencing in studies involving large-scale analysis for nucleotide sequence changes.
Abstract:
Sets of RNA ladders can be synthesized by transcription of a bacteriophage-encoded RNA polymerase using 3′-deoxynucleotides as chain terminators. These ladders can be used for sequencing of DNA. Using a nicked form of phage SP6 RNA polymerase in this study substantially enhanced yields of transcriptional sequencing ladders. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) of chain-terminated RNA ladders allowed DNA sequence determination of up to 56 nt. It is also demonstrated that A→G and C→T variations in heterozygous and homozygous samples can be unambiguously identified by the mass spectrometric analysis. As a step towards single-tube sequencing reactions, α-thiotriphosphate nucleotide analogs were used to overcome problems caused by chain terminator-independent, premature termination and by the small mass difference between natural pyrimidine nucleotides.
Abstract:
Edman degradation remains the primary method for determining the sequence of proteins. In this study, accelerator mass spectrometry was used to determine the N-terminal sequence of glutathione S-transferase at the attomole level with zeptomole precision using a tracer of 14C. The transgenic transferase was labeled by growing transformed Escherichia coli on [14C]glucose and purified by microaffinity chromatography. An internal standard of peptides, synthesized on a solid phase to release approximately equal amounts of all known amino acids with each cycle, was found to increase the yield of gas-phase sequencing reactions and subsequent semimicrobore HPLC, as did a lactoglobulin carrier. This method is applicable to the sequencing of proteins from cell culture and illustrates a path to more general methods for determining N-terminal sequences with high sensitivity.
Abstract:
Previously conducted sequence analysis of Arabidopsis thaliana (ecotype Columbia-0) reported an insertion of 270-kb mtDNA into the pericentric region on the short arm of chromosome 2. DNA fiber-based fluorescence in situ hybridization analyses reveal that the mtDNA insert is 618 ± 42 kb, ≈2.3 times greater than that determined by contig assembly and sequencing analysis. Portions of the mitochondrial genome previously believed to be absent were identified within the insert. Sections of the mtDNA are repeated throughout the insert. The cytological data illustrate that DNA contig assembly by using bacterial artificial chromosomes tends to produce a minimal clone path by skipping over duplicated regions, thereby resulting in sequencing errors. We demonstrate that fiber-fluorescence in situ hybridization is a powerful technique to analyze large repetitive regions in the higher eukaryotic genomes and is a valuable complement to ongoing large genome sequencing projects.
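The "≈2.3 times" figure can be checked directly from the two sizes quoted above. The short Python sketch below does this arithmetic. Carrying the ±42 kb through as a simple low/high range is an assumption about how to read that uncertainty, and the variable names are illustrative.

```python
fish_kb = 618.0      # insert size from fiber-FISH, as quoted above
fish_err_kb = 42.0   # quoted uncertainty on the fiber-FISH estimate
contig_kb = 270.0    # insert size from contig assembly and sequencing

ratio = fish_kb / contig_kb
ratio_low = (fish_kb - fish_err_kb) / contig_kb
ratio_high = (fish_kb + fish_err_kb) / contig_kb

print(f"ratio = {ratio:.2f} (range {ratio_low:.2f} to {ratio_high:.2f})")
```

Under that reading, the fiber-FISH estimate exceeds the contig-based size by more than a factor of two even at the low end of the range.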
Abstract:
Matrix-assisted laser desorption/ionization (MALDI) time of flight mass spectrometry was used to detect and order DNA fragments generated by Sanger dideoxy cycle sequencing. This was accomplished by improving the sensitivity and resolution of the MALDI method using a delayed ion extraction technique (DE-MALDI). The cycle sequencing chemistry was optimized to produce as much as 100 fmol of each specific dideoxy terminated fragment, generated from extension of a 13-base primer annealed on 40- and 50-base templates. Analysis of the resultant sequencing mixture by DE-MALDI identified the appropriate termination products. The technique provides a new non-gel-based method to sequence DNA which may ultimately have considerable speed advantages over traditional methodologies.
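To show how a mass-ordered set of termination products can be turned into a base sequence, here is a minimal Python sketch. It assumes that the spacing between adjacent peaks is close to the residue mass of the base added between them, and it calls the nearest standard nucleotide residue mass. The function name `call_bases`, the 3 Da error limit, and the peak list are hypothetical and are not taken from the study.

```python
# Average masses (Da) of nucleotide residues within a DNA chain (standard values).
DNMP_MASS = {"C": 289.18, "T": 304.20, "A": 313.21, "G": 329.21}

def call_bases(peak_masses, max_error_da=3.0):
    """Call one base per gap between mass-ordered peaks by choosing the
    nearest residue mass; emit 'N' when no residue is within max_error_da.
    The base that terminates the lightest fragment is not called here."""
    peaks = sorted(peak_masses)
    calls = []
    for lighter, heavier in zip(peaks, peaks[1:]):
        delta = heavier - lighter
        base, mass = min(DNMP_MASS.items(), key=lambda kv: abs(delta - kv[1]))
        calls.append(base if abs(delta - mass) <= max_error_da else "N")
    return "".join(calls)

# Hypothetical peak list (Da) with small measurement errors on each gap.
peaks = [4300.0, 4613.3, 4902.3, 5231.8, 5535.8]
sequence = call_bases(peaks)
print(sequence)
```

The closest pair of residue masses, A and T, differ by only about 9 Da, so the error limit has to stay well below half that value. This is one way to see why the resolution gained from delayed extraction matters for this application.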