118 results for UCSC genome browser
Abstract:
We present a method for discovering conserved sequence motifs from families of aligned protein sequences. The method has been implemented as a computer program called emotif (http://motif.stanford.edu/emotif). Given an aligned set of protein sequences, emotif generates a set of motifs with a wide range of specificities and sensitivities. emotif also can generate motifs that describe possible subfamilies of a protein superfamily. A disjunction of such motifs often can represent the entire superfamily with high specificity and sensitivity. We have used emotif to generate sets of motifs from all 7,000 protein alignments in the BLOCKS and PRINTS databases. The resulting database, called identify (http://motif.stanford.edu/identify), contains more than 50,000 motifs. For each alignment, the database contains several motifs whose probability of matching a false positive ranges from 10⁻¹⁰ to 10⁻⁵. Highly specific motifs are well suited for searching entire proteomes while generating very few false predictions. identify assigns biological functions to 25–30% of all proteins encoded by the Saccharomyces cerevisiae genome and by several bacterial genomes. In particular, identify assigned functions to 172 proteins of unknown function in the yeast genome.
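The idea of representing a superfamily as a disjunction of subfamily motifs can be illustrated with a minimal sketch. This is not emotif's own code or motif syntax: the motifs below are hypothetical, they are written as ordinary regular expressions, and the function names are assumptions made for illustration. The second function uses only the standard union bound to cap the false-positive probability of the combined motif set.

```python
import re

# Hypothetical subfamily motifs, written as regular expressions.
# Each bracket lists the residues allowed at one position; "." matches any residue.
SUBFAMILY_MOTIFS = [
    "G[ILV].K[ST]",  # hypothetical subfamily A
    "C..C[HR]W",     # hypothetical subfamily B
]


def matches_superfamily(sequence, motifs=SUBFAMILY_MOTIFS):
    """Return True if at least one subfamily motif occurs in the sequence."""
    return any(re.search(motif, sequence) for motif in motifs)


def false_positive_bound(per_motif_probs):
    """Union bound on the false-positive probability of the disjunction."""
    return min(1.0, sum(per_motif_probs))
```

Because a sequence is accepted when any one motif matches, sensitivity can only rise as motifs are added, while the false-positive probability is at most the sum of the individual probabilities. A disjunction of highly specific motifs therefore stays highly specific.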
Abstract:
Following transcription and splicing, each mRNA of a mammalian cell passes into the cytoplasm where its fate is in the hands of a complex network of ribonucleoproteins (mRNPs). The success or failure of a gene to be expressed depends on the performance of this mRNP infrastructure. The entry, gating, processing, and transit of each mRNA through an mRNP network helps determine the composition of a cell's proteome. The machinery that regulates storage, turnover, and translational activation of mRNAs is not well understood, in part, because of the heterogeneous nature of mRNPs. Recently, subsets of cellular mRNAs clustered as members of mRNP complexes have been identified by using antibodies reactive with RNA-binding proteins, including ELAV/Hu, eIF-4E, and poly(A)-binding proteins. Cytoplasmic ELAV/Hu proteins are involved in the stability and translation of early response gene (ERG) transcripts and are expressed predominately in neurons. mRNAs recovered from ELAV/Hu mRNP complexes were found to have similar sequence elements, suggesting a common structural linkage among them. This approach opens the possibility of identifying transcripts physically clustered in vivo that may have similar fates or functions. Moreover, the proteins encoded by physically organized mRNAs may participate in the same biological process or structural outcome, not unlike operons and their polycistronic mRNAs do in prokaryotic organisms. Our goal is to understand the organization and flow of genetic information on an integrative systems level by analyzing the collective properties of proteins and mRNAs associated with mRNPs in vivo.
Abstract:
The genome of the crenarchaeon Sulfolobus solfataricus P2 contains 2,992,245 bp on a single chromosome and encodes 2,977 proteins and many RNAs. One-third of the encoded proteins have no detectable homologs in other sequenced genomes. Moreover, 40% appear to be archaeal-specific, and only 12% and 2.3% are shared exclusively with bacteria and eukarya, respectively. The genome shows a high level of plasticity with 200 diverse insertion sequence elements, many putative nonautonomous mobile elements, and evidence of integrase-mediated insertion events. There are also long clusters of regularly spaced tandem repeats. Different transfer systems are used for the uptake of inorganic and organic solutes, and a wealth of intracellular and extracellular proteases, sugar, and sulfur metabolizing enzymes are encoded, as well as enzymes of the central metabolic pathways and motility proteins. The major metabolic electron carrier is not NADH as in bacteria and eukarya but probably ferredoxin. The essential components required for DNA replication, DNA repair and recombination, the cell cycle, transcriptional initiation and translation, but not DNA folding, show a strong eukaryal character with many archaeal-specific features. The results illustrate major differences between crenarchaea and euryarchaea, especially for their DNA replication mechanism and cell cycle processes and their translational apparatus.
Abstract:
We have modified the infectious reovirus RNA system so as to generate a reovirus reverse genetics system. The system consists of (i) the plus strands of nine wild-type reovirus genome segments; (ii) transcripts of the genetically modified cDNA form of the tenth genome segment; and (iii) a cell line transformed so as to express the protein normally encoded by the tenth genome segment. In the work described here, we have generated a serotype 3 reovirus in which the CAT gene has been cloned into the S2 double-stranded RNA genome segment. The virus is stable, replicates in cells that have been transformed (so as to express the S2 gene product, protein σ2), and expresses high levels of CAT activity. This technology can be extended to members of the orbivirus and rotavirus genera. This technology provides a powerful system for basic studies of double-stranded RNA virus replication; a nonpathogenic viral vector that replicates to high titers and could be used for clinical applications; and a system for providing nonselectable viral variants (the result of mutations, insertions, and deletions) that could be valuable for the construction of viral vaccine strains against human and animal pathogens.
Abstract:
Changes in DNA superhelicity during DNA replication are mediated primarily by the activities of DNA helicases and topoisomerases. If these activities are defective, the progression of the replication fork can be hindered or blocked, which can lead to double-strand breaks, elevated recombination in regions of repeated DNA, and genome instability. Hereditary diseases like Werner's and Bloom's Syndromes are caused by defects in DNA helicases, and these diseases are associated with genome instability and carcinogenesis in humans. Here we report a Saccharomyces cerevisiae gene, MGS1 (Maintenance of Genome Stability 1), which encodes a protein belonging to the AAA+ class of ATPases, and whose central region is similar to Escherichia coli RuvB, a Holliday junction branch migration motor protein. The Mgs1 orthologues are highly conserved in prokaryotes and eukaryotes. The Mgs1 protein possesses DNA-dependent ATPase and single-strand DNA annealing activities. An mgs1 deletion mutant has an elevated rate of mitotic recombination, which causes genome instability. The mgs1 mutation is synergistic with a mutation in top3 (encoding topoisomerase III), and the double mutant exhibits severe growth defects and markedly increased genome instability. In contrast to the mgs1 mutation, a mutation in the sgs1 gene encoding a DNA helicase homologous to the Werner and Bloom helicases suppresses both the growth defect and the increased genome instability of the top3 mutant. Therefore, evolutionarily conserved Mgs1 may play a role together with RecQ family helicases and DNA topoisomerases in maintaining proper DNA topology, which is essential for genome stability.
Abstract:
Gene targeting in mammalian cells has proven invaluable in biotechnology, in studies of gene structure and function, and in understanding chromosome dynamics. It also offers a potential tool for gene-therapeutic applications. Two limitations constrain the current technology: the low rate of homologous recombination in mammalian cells and the high rate of random (nontargeted) integration of the vector DNA. Here we consider possible ways to overcome these limitations within the framework of our present understanding of recombination mechanisms and machinery. Several studies suggest that transient alteration of the levels of recombination proteins, by overexpression or interference with expression, may be able to increase homologous recombination or decrease random integration, and we present a list of candidate genes. We consider potentially beneficial modifications to the vector DNA and discuss the effects of methods of DNA delivery on targeting efficiency. Finally, we present work showing that gene-specific DNA damage can stimulate local homologous recombination, and we discuss recent results with two general methodologies—chimeric nucleases and triplex-forming oligonucleotides—for stimulating recombination in cells.
Abstract:
We have analyzed the developmental molecular programs of the mouse hippocampus, a cortical structure critical for learning and memory, by means of large-scale DNA microarray techniques. Of 11,000 genes and expressed sequence tags examined, 1,926 showed dynamic changes during hippocampal development from embryonic day 16 to postnatal day 30. Gene-cluster analysis was used to group these genes into 16 distinct clusters with striking patterns that appear to correlate with major developmental hallmarks and cellular events. These include genes involved in neuronal proliferation, differentiation, and synapse formation. A complete list of the transcriptional changes has been compiled into a comprehensive gene profile database (http://BrainGenomics.Princeton.edu), which should prove valuable in advancing our understanding of the molecular and genetic programs underlying both the development and the functions of the mammalian brain.
Abstract:
The bronze (bz) locus exhibits the highest rate of recombination of any gene in higher plants. To investigate the possible basis of this high rate of recombination, we have analyzed the physical organization of the region around the bz locus. Two adjacent bacterial artificial chromosome clones, comprising a 240-kb contig centered around the Bz-McC allele, were isolated, and 60 kb of contiguous DNA spanning the two bacterial artificial chromosome clones was sequenced. We find that the bz locus lies in an unusually gene-rich region of the maize genome. Ten genes, at least eight of which are shown to be transcribed, are contained in a 32-kb stretch of DNA that is uninterrupted by retrotransposons. We have isolated nearly full length cDNAs corresponding to the five proximal genes in the cluster. The average intertranscript distance between them is just 1 kb, revealing a surprisingly compact packaging of adjacent genes in this part of the genome. At least 11 small insertions, including several previously described miniature inverted repeat transposable elements, were detected in the introns and 3′ untranslated regions of genes and between genes. The gene-rich region is flanked at the proximal and distal ends by retrotransposon blocks. Thus, the maize genome appears to have scattered regions of high gene density similar to those found in other plants. The unusually high rate of intragenic recombination seen in bz may be related to the very high gene density of the region.
Abstract:
The full sequence of the genome-linked viral protein (VPg) cistron located in the central part of potato virus Y (common strain) genome has been identified. The VPg gene codes for a protein of 188 amino acids, with significant homology to other known potyviral VPg polypeptides. A three-dimensional model structure of VPg is proposed on the basis of similarity of hydrophobic-hydrophilic residue distribution to the sequence of malate dehydrogenase of known crystal structure. The 5' end of the viral RNA can be fitted to interact with the protein through the exposed hydroxyl group of Tyr-64, in agreement with experimental data. The complex favors stereochemically the formation of a phosphodiester bond [5'-(O4-tyrosylphospho)adenylate] typical for representatives of picornavirus-like viruses. The chemical mechanisms of viral RNA binding to VPg are discussed on the basis of the model structure of protein-RNA complex.
Abstract:
The intercistronic region between the maturation and coat-protein genes of RNA phage MS2 contains important regulatory and structural information. The sequence participates in two adjacent stem-loop structures, one of which, the coat-initiator hairpin, controls coat-gene translation and is thus under strong selection pressure. We have removed 19 out of the 23 nucleotides constituting the intercistronic region, thereby destroying the capacity of the phage to build the two hairpins. The deletion lowered coat-protein yield more than 1000-fold, and the titer of the infectious clone carrying the deletion dropped 10 orders of magnitude as compared with the wild type. Two types of revertants were recovered. One had, in two steps, recruited 18 new nucleotides that served to rebuild the two hairpins and the lost Shine-Dalgarno sequence. The other type had deleted an additional six nucleotides, which allowed the reconstruction of the Shine-Dalgarno sequence and the initiator hairpin, albeit by sacrificing the remnants of the other stem-loop. The results visualize the immense genetic repertoire created by what appears to be random RNA recombination. It would seem that in this genetic ensemble every possible new RNA combination is represented.
Abstract:
The complete nucleotide sequence, 5178 bp, of the totivirus Helminthosporium victoriae 190S virus (Hv190SV) double-stranded RNA, was determined. Computer-assisted sequence analysis revealed the presence of two large overlapping ORFs; the 5'-proximal large ORF (ORF1) codes for the coat protein (CP) with a predicted molecular mass of 81 kDa, and the 3'-proximal ORF (ORF2), which is in the -1 frame relative to ORF1, codes for an RNA-dependent RNA polymerase (RDRP). Unlike many other totiviruses, the overlap region between ORF1 and ORF2 lacks known structural information required for translational frameshifting. Using an antiserum to a C-terminal fragment of the RDRP, the product of ORF2 was identified as a minor virion-associated polypeptide of estimated molecular mass of 92 kDa. No CP-RDRP fusion protein with calculated molecular mass of 165 kDa was detected. The predicted start codon of the RDRP ORF (2605-AUG-2607) overlaps with the stop codon (2606-UGA-2608) of the CP ORF, suggesting RDRP is expressed by an internal initiation mechanism. Hv190SV is associated with a debilitating disease of its phytopathogenic fungal host. Knowledge of its genome organization and expression will be valuable for understanding its role in pathogenesis and for potential exploitation in the development of biocontrol measures.
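The codon coordinates quoted above can be checked with a short sketch. It uses only the positions and codons given in the abstract; the function names are assumptions made for illustration. It shows that the overlapping start and stop codons form the tetranucleotide AUGA, and that a codon starting at 2605 lies in the -1 frame relative to one starting at 2606.

```python
def overlap_sequence(first_codon, first_pos, second_codon, second_pos):
    """Merge two overlapping codons, given the 1-based position of each first base."""
    offset = second_pos - first_pos
    shared = len(first_codon) - offset
    # The bases shared by the two codons must agree.
    if first_codon[offset:] != second_codon[:shared]:
        raise ValueError("codons disagree in the overlapping bases")
    return first_codon + second_codon[shared:]


def relative_frame(orf1_codon_pos, orf2_codon_pos):
    """Frame of ORF2 relative to ORF1, reported as -1, 0, or +1."""
    shift = (orf2_codon_pos - orf1_codon_pos) % 3
    return {0: 0, 1: 1, 2: -1}[shift]


# Coordinates from the abstract: RDRP start 2605-AUG-2607, CP stop 2606-UGA-2608.
merged = overlap_sequence("AUG", 2605, "UGA", 2606)
frame = relative_frame(2606, 2605)
print(merged, frame)
```

The CP stop codon starts one base after the RDRP start codon, so the RDRP reading frame is shifted back by one nucleotide, which agrees with the -1 frame stated for ORF2.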
Abstract:
Microarrays containing 1046 human cDNAs of unknown sequence were printed on glass with high-speed robotics. These 1.0-cm² DNA "chips" were used to quantitatively monitor differential expression of the cognate human genes using a highly sensitive two-color hybridization assay. Array elements that displayed differential expression patterns under given experimental conditions were characterized by sequencing. The identification of known and novel heat shock and phorbol ester-regulated genes in human T cells demonstrates the sensitivity of the assay. Parallel gene analysis with microarrays provides a rapid and efficient method for large-scale human gene discovery.
Abstract:
We have developed a system for generation of infectious bursal disease virus (IBDV), a segmented double-stranded RNA virus of the Birnaviridae family, with the use of synthetic transcripts derived from cloned cDNA. Independent full-length cDNA clones were constructed that contained the entire coding and noncoding regions of RNA segments A and B of two distinguishable IBDV strains of serotype I. Segment A encodes all of the structural (VP2, VP4, and VP3) and nonstructural (VP5) proteins, whereas segment B encodes the RNA-dependent RNA polymerase (VP1). Synthetic RNAs of both segments were produced by in vitro transcription of linearized plasmids with T7 RNA polymerase. Transfection of Vero cells with combined plus-sense transcripts of both segments generated infectious virus as early as 36 hr after transfection. The infectivity and specificity of the recovered chimeric virus were ascertained by the appearance of cytopathic effect in chicken embryo cells, by immunofluorescence staining of infected Vero cells with rabbit anti-IBDV serum, and by nucleotide sequence analysis of the recovered virus, respectively. In addition, transfectant viruses containing genetically tagged sequences in either segment A or segment B of IBDV were generated to confirm the feasibility of this system. The development of a reverse genetics system for double-stranded RNA viruses will greatly facilitate studies of the regulation of viral gene expression, pathogenesis, and design of a new generation of live vaccines.