109 results for COMPLETE NUCLEOTIDE-SEQUENCE
in National Center for Biotechnology Information - NCBI
Abstract:
The genome of the Kaposi sarcoma-associated herpesvirus (KSHV or HHV8) was mapped with cosmid and phage genomic libraries from the BC-1 cell line. Its nucleotide sequence was determined except for a 3-kb region at the right end of the genome that was refractory to cloning. The BC-1 KSHV genome consists of a 140.5-kb-long unique coding region flanked by multiple G+C-rich 801-bp terminal repeat sequences. A genomic duplication that apparently arose in the parental tumor is present in this cell culture-derived strain. At least 81 ORFs, including 66 with homology to herpesvirus saimiri ORFs, and 5 internal repeat regions are present in the long unique region. The virus encodes homologs to complement-binding proteins, three cytokines (two macrophage inflammatory proteins and interleukin 6), dihydrofolate reductase, bcl-2, interferon regulatory factors, interleukin 8 receptor, neural cell adhesion molecule-like adhesin, and a D-type cyclin, as well as viral structural and metabolic proteins. Terminal repeat analysis of virus DNA from a KS lesion suggests a monoclonal expansion of KSHV in the KS tumor.
Abstract:
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT.
Abstract:
We present here the complete genome sequence of a common avian clone of Pasteurella multocida, Pm70. The genome of Pm70 is a single circular chromosome 2,257,487 base pairs in length and contains 2,014 predicted coding regions, 6 ribosomal RNA operons, and 57 tRNAs. Genome-scale evolutionary analyses based on pairwise comparisons of 1,197 orthologous sequences between P. multocida, Haemophilus influenzae, and Escherichia coli suggest that P. multocida and H. influenzae diverged ≈270 million years ago and the γ subdivision of the proteobacteria radiated about 680 million years ago. Two previously undescribed open reading frames, accounting for ≈1% of the genome, encode large proteins with homology to the virulence-associated filamentous hemagglutinin of Bordetella pertussis. Consistent with the critical role of iron in the survival of many microbial pathogens, in silico and whole-genome microarray analyses identified more than 50 Pm70 genes with a potential role in iron acquisition and metabolism. Overall, the complete genomic sequence and preliminary functional analyses provide a foundation for future research into the mechanisms of pathogenesis and host specificity of this important multispecies pathogen.
Abstract:
The complete genome sequence of Caulobacter crescentus was determined to be 4,016,942 base pairs in a single circular chromosome encoding 3,767 genes. This organism, which grows in a dilute aquatic environment, coordinates the cell division cycle and multiple cell differentiation events. With the annotated genome sequence, a full description of the genetic network that controls bacterial differentiation, cell growth, and cell cycle progression is within reach. Two-component signal transduction proteins are known to play a significant role in cell cycle progression. Genome analysis revealed that the C. crescentus genome encodes a significantly higher number of these signaling proteins (105) than any bacterial genome sequenced thus far. Another regulatory mechanism involved in cell cycle progression is DNA methylation. The occurrence of the recognition sequence for an essential DNA methylating enzyme that is required for cell cycle regulation is severely limited and shows a bias to intergenic regions. The genome contains multiple clusters of genes encoding proteins essential for survival in a nutrient poor habitat. Included are those involved in chemotaxis, outer membrane channel function, degradation of aromatic ring compounds, and the breakdown of plant-derived carbon sources, in addition to many extracytoplasmic function sigma factors, providing the organism with the ability to respond to a wide range of environmental fluctuations. C. crescentus is, to our knowledge, the first free-living α-class proteobacterium to be sequenced and will serve as a foundation for exploring the biology of this group of bacteria, which includes the obligate endosymbiont and human pathogen Rickettsia prowazekii, the plant pathogen Agrobacterium tumefaciens, and the bovine and human pathogen Brucella abortus.
Abstract:
The 1,852,442-bp sequence of an M1 strain of Streptococcus pyogenes, a Gram-positive pathogen, has been determined and contains 1,752 predicted protein-encoding genes. Approximately one-third of these genes have no identifiable function, with the remainder falling into previously characterized categories of known microbial function. Consistent with the observation that S. pyogenes is responsible for a wider variety of human disease than any other bacterial species, more than 40 putative virulence-associated genes have been identified. Additional genes have been identified that encode proteins likely associated with microbial “molecular mimicry” of host characteristics and involved in rheumatic fever or acute glomerulonephritis. The complete or partial sequence of four different bacteriophage genomes is also present, with each containing genes for one or more previously undiscovered superantigen-like proteins. These prophage-associated genes encode at least six potential virulence factors, emphasizing the importance of bacteriophages in horizontal gene transfer and a possible mechanism for generating new strains with increased pathogenic potential.
Abstract:
Clones encoding pro-phenol oxidase [pro-PO; zymogen of phenol oxidase (monophenol, L-dopa:oxygen oxidoreductase, EC 1.14.18.1)] A1 were isolated from a lambda gt10 library that originated from Drosophila melanogaster strain Oregon-R male adults. The 2294 bp of the cDNA included a 13-bp 5'-noncoding region, a 2070-bp open reading frame encoding 690 amino acids, and a 211-bp 3'-noncoding region. A hydrophobic NH2-terminal sequence for a signal peptide is absent in the protein. Furthermore, there are six potential N-glycosylation sites in the sequence, but no amino sugar was detected in the purified protein by amino acid analysis, indicating the lack of an N-linked sugar chain. The potential copper-binding sites, amino acids 200-248 and 359-414, are highly homologous to the corresponding sites of hemocyanin of the tarantula Eurypelma californicum, the horseshoe crab Limulus polyphemus, and the spiny lobster Panulirus interruptus. On the basis of the phylogenetic tree constructed by the neighbor-joining method, vertebrate tyrosinases and molluscan hemocyanins constitute one family, whereas pro-POs and arthropod hemocyanins group with another family. It seems, therefore, likely that pro-PO originates from a common ancestor with arthropod hemocyanins, independently of the vertebrate and microbial tyrosinases.
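The segment lengths reported for the pro-PO A1 cDNA can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes every codon of the open reading frame corresponds to one amino acid, which the abstract implies but does not spell out.

```python
# Segment lengths in base pairs, taken from the abstract.
FIVE_PRIME_NONCODING = 13
OPEN_READING_FRAME = 2070
THREE_PRIME_NONCODING = 211


def cdna_length(five_prime, orf, three_prime):
    """Total cDNA length as the sum of its three segments."""
    return five_prime + orf + three_prime


def codon_count(orf_bp):
    """Number of codons in an open reading frame of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_bp // 3


if __name__ == "__main__":
    total = cdna_length(FIVE_PRIME_NONCODING, OPEN_READING_FRAME,
                        THREE_PRIME_NONCODING)
    print(total)                            # 2294
    print(codon_count(OPEN_READING_FRAME))  # 690
```

Both values agree with the figures stated in the abstract (2294 bp and 690 amino acids).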
Abstract:
Chromosome I from the yeast Saccharomyces cerevisiae contains a DNA molecule of approximately 231 kbp and is the smallest naturally occurring functional eukaryotic nuclear chromosome so far characterized. The nucleotide sequence of this chromosome has been determined as part of an international collaboration to sequence the entire yeast genome. The chromosome contains 89 open reading frames and 4 tRNA genes. The central 165 kbp of the chromosome resembles other large sequenced regions of the yeast genome in both its high density and distribution of genes. In contrast, the remaining sequences flanking this DNA that comprise the two ends of the chromosome and make up more than 25% of the DNA molecule have a much lower gene density, are largely not transcribed, contain no genes essential for vegetative growth, and contain several apparent pseudogenes and a 15-kbp redundant sequence. These terminally repetitive regions consist of a telomeric repeat called W', flanked by DNA closely related to the yeast FLO1 gene. The low gene density, presence of pseudogenes, and lack of expression are consistent with the idea that these terminal regions represent the yeast equivalent of heterochromatin. The occurrence of such a high proportion of DNA with so little information suggests that its presence gives this chromosome the critical length required for proper function.
Abstract:
Prophenoloxidase, a melanin-synthesizing enzyme, is considered to be an important arthropod immune protein. In mosquitoes, prophenoloxidase has been shown to be involved in refractory mechanisms against malaria parasites. In our study we used Anopheles gambiae, the most important human malaria vector, to characterize the first arthropod prophenoloxidase gene at the genomic level. The complete nucleotide sequence, including the immediate 5′ flanking sequence (−855 bp) of the prophenoloxidase 1 gene, was determined. The gene spans 10 kb and is composed of five exons and four introns coding for a 2.5-kb mRNA. In the 5′ flanking sequence, we found several putative regulatory motifs, two of which were identified as ecdysteroid regulatory elements. Electrophoretic mobility gel-shift assays and supershift assays demonstrated that the Aedes aegypti ecdysone receptor/Ultraspiracle nuclear receptor complex, and, seemingly, the endogenous Anopheles gambiae nuclear receptor complex, was able to bind one of the ecdysteroid response elements. Furthermore, 20-hydroxyecdysone stimulation was shown to up-regulate the transcription of the prophenoloxidase 1 gene in an A. gambiae cell line.
Abstract:
The complete nucleotide sequence, 5178 bp, of the totivirus Helminthosporium victoriae 190S virus (Hv190SV) double-stranded RNA, was determined. Computer-assisted sequence analysis revealed the presence of two large overlapping ORFs; the 5'-proximal large ORF (ORF1) codes for the coat protein (CP) with a predicted molecular mass of 81 kDa, and the 3'-proximal ORF (ORF2), which is in the -1 frame relative to ORF1, codes for an RNA-dependent RNA polymerase (RDRP). Unlike many other totiviruses, the overlap region between ORF1 and ORF2 lacks known structural information required for translational frameshifting. Using an antiserum to a C-terminal fragment of the RDRP, the product of ORF2 was identified as a minor virion-associated polypeptide of estimated molecular mass of 92 kDa. No CP-RDRP fusion protein with calculated molecular mass of 165 kDa was detected. The predicted start codon of the RDRP ORF (2605-AUG-2607) overlaps with the stop codon (2606-UGA-2608) of the CP ORF, suggesting RDRP is expressed by an internal initiation mechanism. Hv190SV is associated with a debilitating disease of its phytopathogenic fungal host. Knowledge of its genome organization and expression will be valuable for understanding its role in pathogenesis and for potential exploitation in the development of biocontrol measures.
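The stated codon coordinates are enough to confirm that ORF2 sits in the -1 frame relative to ORF1. The following is a minimal sketch, assuming 1-based coordinates as written in the abstract; the function names are hypothetical, and the four-nucleotide motif AUGA is simply what the two overlapping coordinate ranges imply.

```python
def frame_of(codon_start):
    """Reading frame (0, 1 or 2) of a codon whose first base is at a 1-based position."""
    return (codon_start - 1) % 3


def relative_frame(reference_start, other_start):
    """Frame of `other` relative to `reference`, reported as -1, 0 or +1."""
    diff = (frame_of(other_start) - frame_of(reference_start)) % 3
    return diff - 3 if diff == 2 else diff


# Coordinates from the abstract: CP stop codon UGA at 2606-2608 (ORF1),
# RDRP start codon AUG at 2605-2607 (ORF2).
ORF1_STOP_START = 2606
ORF2_START = 2605

# The overlapping codons together span positions 2605-2608.
OVERLAP_MOTIF = "AUGA"

if __name__ == "__main__":
    print(relative_frame(ORF1_STOP_START, ORF2_START))  # -1
    print(OVERLAP_MOTIF[0:3], OVERLAP_MOTIF[1:4])       # AUG UGA
```

The two codons share two nucleotides (positions 2606 and 2607), which is why a one-base shift separates the frames.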
Abstract:
In immature T cells the T-cell receptor (TCR) beta-chain gene is rearranged and expressed before the TCR alpha-chain gene. At this stage TCR beta chain can form disulfide-linked heterodimers with the pre-T-cell receptor alpha chain (pTalpha). Using the recently isolated murine pTalpha cDNA as a probe, we have isolated the human pTalpha cDNA. The complete nucleotide sequence predicts a mature protein of 282 aa consisting of an extracellular immunoglobulin-like domain, a connecting peptide, a transmembrane region, and a long cytoplasmic tail. Amino acid sequence comparison of human pTalpha with the mouse pTalpha molecule reveals high sequence homology in the extracellular as well as the transmembrane region. In contrast, the cytoplasmic region differs in amino acid composition and in length from the murine homologue. The human pTalpha gene is expressed in immature but not mature T cells and is located at the p21.2-p12 region of the short arm of chromosome 6.
Abstract:
The nucleotide sequence of the human alpha-albumin gene, including 887 bp of the 5'-flanking region and 1311 bp of the 3'-flanking region (24,454 bp in total), was determined from three overlapping lambda phage clones. The sequence spans 22,256 bp from the cap site to the polyadenylylation site, revealing a gene structure of 15 exons separated by 14 introns. The methionine initiation codon ATG is within exon 1; the termination codon TGA is within exon 14. Exon 15 is entirely untranslated and contains the polyadenylylation signal AATAAA. The deduced polypeptide chain is composed of a 21-amino-acid leader peptide, followed by 578 amino acids of the mature protein. There are seven repetitive DNA elements (Alu and Kpn) in the introns and 3'-flanking region. The sizes of the 15 alpha-albumin exons match closely those of the albumin, alpha-fetoprotein, and vitamin D-binding protein genes. The exons are symmetrically placed within the three domains of the individual proteins, and they share a characteristic codon splitting pattern that is conserved among members of the gene family. The results provide strong evidence that alpha-albumin belongs to, and most likely completes, the serum albumin gene family. Based on structural similarity, alpha-albumin appears to be most closely related to alpha-fetoprotein. The complete structure of this family of four tandemly linked genes provides a well-characterized approximately 200 kb locus in the 4q subcentromeric region of the human genome.
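The lengths quoted for the alpha-albumin gene are internally consistent, as the short check below shows. It is only a sketch: the variable and function names are hypothetical, and the precursor length is simply the sum of the leader and mature-protein lengths given in the abstract.

```python
# Lengths in base pairs, taken from the abstract.
FLANK_5_PRIME = 887
CAP_TO_POLYA = 22256
FLANK_3_PRIME = 1311

# Lengths in amino acids, taken from the abstract.
LEADER_PEPTIDE = 21
MATURE_PROTEIN = 578


def sequenced_length(flank5, transcribed, flank3):
    """Total sequenced region: both flanks plus the cap-to-poly(A) span."""
    return flank5 + transcribed + flank3


def intron_count(exon_count):
    """Introns separating a linear series of exons."""
    return exon_count - 1


if __name__ == "__main__":
    print(sequenced_length(FLANK_5_PRIME, CAP_TO_POLYA, FLANK_3_PRIME))  # 24454
    print(intron_count(15))                                              # 14
    print(LEADER_PEPTIDE + MATURE_PROTEIN)                               # 599
```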
Abstract:
Variations in regulatory regions of developmental control genes have been implicated in the divergence of axial morphologies. To find potentially significant changes in cis-regulatory regions, we compared nucleotide sequences and activities of mammalian Hoxc8 early enhancers. The nucleotide sequence of the early enhancer region is extremely conserved among mammalian clades, with five previously described cis-acting elements, A–E, being invariant. However, a 4-bp deletion within element C of the Hoxc8 early enhancer sequence is observed in baleen whales. When assayed in transgenic mouse embryos, a baleen whale enhancer (unlike other mammalian enhancers) directs expression of the reporter gene to more posterior regions of the neural tube but fails to direct expression to posterior mesoderm. We suggest that regulation of Hoxc8 in baleen whales differs from other mammalian species and may be associated with variation in axial morphology.
Abstract:
We report the properties of the new BseMII restriction and modification enzymes from Bacillus stearothermophilus Isl 15-111, which recognize the 5′-CTCAG sequence, and the nucleotide sequence of the genes encoding them. The restriction endonuclease R.BseMII makes a staggered cut at the tenth base pair downstream of the recognition sequence on the upper strand, producing a two base 3′-protruding end. Magnesium ions and S-adenosyl-l-methionine (AdoMet) are required for cleavage. S-adenosylhomocysteine and sinefungin can replace AdoMet in the cleavage reaction. The BseMII methyltransferase modifies unique adenine residues in both strands of the target sequence 5′-CTCAG-3′/5′-CTGAG-3′. Monomeric R.BseMII in addition to endonucleolytic activity also possesses methyltransferase activity that modifies the A base only within the 5′-CTCAG strand of the target duplex. The deduced amino acid sequence of the restriction endonuclease contains conserved motifs of DNA N6-adenine methylases involved in S-adenosyl-l-methionine binding and catalysis. According to its structure and enzymatic properties, R.BseMII may be regarded as a representative of the type IV restriction endonucleases.
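Because the 5′-CTCAG recognition sequence is asymmetric, a site can lie on either strand of a duplex. The sketch below locates sites on both strands of a DNA string; it is an illustration rather than the authors' method, the function names are hypothetical, and it deliberately does not compute cut positions, since the abstract's description alone does not fix the exact coordinates.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, site="CTCAG"):
    """Return sorted (0-based start, strand) pairs for every occurrence of `site`.

    '+' means the site reads 5'->3' on the given strand; '-' means its
    reverse complement (CTGAG for CTCAG) appears on the given strand.
    Intended for asymmetric sites; a palindromic site would be reported twice.
    """
    seq = seq.upper()
    hits = []
    for motif, strand in ((site, "+"), (reverse_complement(site), "-")):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Made-up test sequence, not taken from the paper.
    print(find_sites("AACTCAGTTTTCTGAGAA"))  # [(2, '+'), (11, '-')]
```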
Abstract:
We describe a technique, sequence-tagged microsatellite profiling (STMP), to rapidly generate large numbers of simple sequence repeat (SSR) markers from genomic DNA or cDNA. This technique eliminates the need for library screening to identify SSR-containing clones and provides an ∼25-fold increase in sequencing throughput compared to traditional methods. STMP generates short but characteristic nucleotide sequence tags for fragments that are present within a pool of SSR amplicons. These tags are then ligated together to form concatemers for cloning and sequencing. The analysis of thousands of tags gives rise to a representational profile of the abundance and frequency of SSRs within the DNA pool, from which low copy sequences can be identified. As each tag contains sufficient nucleotide sequence for primer design, their conversion into PCR primers allows the amplification of corresponding full-length fragments from the pool of SSR amplicons. These fragments permit the full characterisation of an SSR locus and provide flanking sequence for the development of a microsatellite marker. Alternatively, sequence tag primers can be used to directly amplify corresponding SSR loci from genomic DNA, thereby reducing the cost of developing a microsatellite marker to the synthesis of just one sequence-specific primer. We demonstrate the utility of STMP by the development of SSR markers in bread wheat.
Abstract:
Equine rhinovirus 1 (ERhV1) is a respiratory pathogen of horses which has an uncertain taxonomic status. We have determined the nucleotide sequence of the ERhV1 genome except for a small region at the 5' end. The predicted polyprotein was encoded by 6741 nucleotides and possessed a typical picornavirus proteolytic cleavage pattern, including a leader polypeptide. The genomic structure and predicted amino acid sequence of ERhV1 were more similar to those of foot-and-mouth disease viruses (FMDVs), the only members of the aphthovirus genus, than to those of other picornaviruses. Features which were most similar to FMDV included a 16-amino acid 2A protein which was 87.5% identical in sequence to FMDV 2A, a leader (L) protein similar in size to FMDV Lab and the possibility of a truncated L protein similar in size to FMDV Lb, and a 3C protease which recognizes different cleavage sites. However, unlike FMDV, ERhV1 had only one copy of the 3B (VPg) polypeptide. The phylogenetic relationships of the ERhV1 sequence and nucleotide sequences of representative species of the five genera of the family Picornaviridae were examined. Nucleotide sequences coding for the complete polyprotein, the RNA polymerase, and VP1 were analyzed separately. The phylogenetic trees confirmed that ERhV1 was more closely related to FMDV than to other picornaviruses and suggested that ERhV1 may be a member, albeit very distant, of the aphthovirus genus.