10 results for Molecular sequence data
in National Center for Biotechnology Information - NCBI
Abstract:
Molecular and fragment ion data of intact 8- to 43-kDa proteins from electrospray Fourier-transform tandem mass spectrometry are matched against the corresponding data in sequence data bases. Extending the sequence tag concept of Mann and Wilm for matching peptides, a partial amino acid sequence in the unknown is first identified from the mass differences of a series of fragment ions, and the mass position of this sequence is defined from molecular weight and the fragment ion masses. For three studied proteins, a single sequence tag retrieved only the correct protein from the data base; a fourth protein required the input of two sequence tags. However, three of the data base proteins differed by having an extra methionine or by missing an acetyl or heme substitution. The positions of these modifications in the protein examined were greatly restricted by the mass differences of its molecular and fragment ions versus those of the data base. To characterize the primary structure of an unknown represented in the data base, this method is fast and specific and does not require prior enzymatic or chemical degradation.
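The sequence tag step described above (reading a partial sequence from the mass differences of a series of fragment ions) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 0.01 Da tolerance, and the use of monoisotopic residue masses are assumptions made for the example, and the read direction of the tag depends on which ion series the fragments belong to.

```python
# Standard monoisotopic amino acid residue masses in Da.
# Leu and Ile are isobaric and are both reported as "L" here.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residue_for(delta, tol=0.01):
    """Return the residue whose mass is closest to delta, or None if outside tol."""
    best = min(RESIDUE_MASS, key=lambda aa: abs(RESIDUE_MASS[aa] - delta))
    return best if abs(RESIDUE_MASS[best] - delta) <= tol else None


def sequence_tag(fragment_masses, tol=0.01):
    """Find the longest run of consecutive mass differences that match residues.

    Returns (tag, start_mass), where start_mass is the lowest fragment mass
    in the run and so fixes the mass position of the tag.
    """
    masses = sorted(fragment_masses)
    best_tag, best_start = "", None
    tag, start = "", None
    for lo, hi in zip(masses, masses[1:]):
        aa = residue_for(hi - lo, tol)
        if aa is None:
            tag, start = "", None  # run broken; start over
            continue
        if not tag:
            start = lo
        tag += aa
        if len(tag) > len(best_tag):
            best_tag, best_start = tag, start
    return best_tag, best_start


# Hypothetical fragment ladder: three residues followed by an unrelated peak.
ladder = [1000.0]
for aa in "GAV":
    ladder.append(ladder[-1] + RESIDUE_MASS[aa])
ladder.append(1500.0)
print(sequence_tag(ladder))  # tag "GAV" starting at mass 1000.0
```

The tag together with its start mass is what would be searched against a database; at the mass accuracy available for intact proteins a wider tolerance would be needed, and residues of near-identical mass (Q/K) could not be told apart.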
Abstract:
The Plasmodium falciparum Genome Database (http://PlasmoDB.org) integrates sequence information, automated analyses and annotation data emerging from the P. falciparum genome sequencing consortium. To date, raw sequence coverage is available for >90% of the genome, and two chromosomes have been finished and annotated. Data in PlasmoDB are organized by chromosome (1–14), and can be accessed using a variety of tools for graphical and text-based browsing or downloaded in various file formats. The GUS (Genomics Unified Schema) implementation of PlasmoDB provides a multi-species genomic relational database, incorporating data from human and mouse, as well as P. falciparum. The relational schema uses a highly structured format to accommodate diverse data sets related to genomic sequence and gene expression. Tools have been designed to facilitate complex biological queries, including many that are specific to Plasmodium parasites and malaria as a disease. Additional projects seek to integrate genomic information with the rich data sets now becoming available for RNA transcription, protein expression, metabolic pathways, genetic and physical mapping, antigenic and population diversity, and phylogenetic relationships with other apicomplexan parasites. The overall goal of PlasmoDB is to facilitate Internet- and CD-ROM-based access to both finished and unfinished sequence information by the global malaria research community.
Abstract:
The release of vast quantities of DNA sequence data by large-scale genome and expressed sequence tag (EST) projects underlines the necessity for the development of efficient and inexpensive ways to link sequence databases with temporal and spatial expression profiles. Here we demonstrate the power of linking cDNA sequence data (including EST sequences) with transcript profiles revealed by cDNA-AFLP, a highly reproducible differential display method based on restriction enzyme digests and selective amplification under high stringency conditions. We have developed a computer program (GenEST) that predicts the sizes of virtual transcript-derived fragments (TDFs) of in silico-digested cDNA sequences retrieved from databases. The vast majority of the resulting virtual TDFs could be traced back among the thousands of TDFs displayed on cDNA-AFLP gels. Sequencing of the corresponding bands excised from cDNA-AFLP gels revealed no inconsistencies. As a consequence, cDNA sequence databases can be screened very efficiently to identify genes with relevant expression profiles. The other way round, it is possible to switch from cDNA-AFLP gels to sequences in the databases. Using the restriction enzyme recognition sites, the primer extensions and the estimated TDF size as identifiers, the DNA sequence(s) corresponding to a TDF with an interesting expression pattern can be identified. In this paper we show examples in both directions by analyzing the plant parasitic nematode Globodera rostochiensis. Various novel pathogenicity factors were identified by combining ESTs from the infective stage juveniles with expression profiles of ∼4000 genes in five developmental stages produced by cDNA-AFLP.
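The idea of predicting virtual TDF sizes from in silico-digested cDNA sequences can be illustrated with a short sketch. This is not GenEST itself: the function names, the `adapter_len` correction, and the simplification of measuring from the start of one recognition site to the end of the next (rather than from the exact cut positions) are all assumptions made for the example.

```python
def cut_sites(seq, site):
    """Return the 0-based start position of every occurrence of a recognition site."""
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i)
        i = seq.find(site, i + 1)
    return positions


def virtual_tdfs(seq, site_a, site_b, adapter_len=0):
    """List fragments flanked by one site of each enzyme with no site in between.

    Each entry is (start, end, size). Size spans from the start of the first
    recognition site to the end of the second, plus adapter_len, an assumed
    correction for the adapters added before amplification.
    """
    seq = seq.upper()
    sites = sorted(
        [(p, "A") for p in cut_sites(seq, site_a)]
        + [(p, "B") for p in cut_sites(seq, site_b)]
    )
    fragments = []
    for (p1, e1), (p2, e2) in zip(sites, sites[1:]):
        if e1 != e2:  # keep only fragments bounded by two different enzymes
            end = p2 + len(site_a if e2 == "A" else site_b)
            fragments.append((p1, end, end - p1 + adapter_len))
    return fragments


def candidates_for_band(fragments, observed_size, tol=1):
    """Reverse lookup: virtual fragments whose size matches a band on the gel."""
    return [f for f in fragments if abs(f[2] - observed_size) <= tol]


# Hypothetical cDNA with one GAATTC site and one TTAA site.
cdna = "AAGAATTCCCCCCTTAAGG"
frags = virtual_tdfs(cdna, "GAATTC", "TTAA")
print(frags)                           # [(2, 17, 15)]
print(candidates_for_band(frags, 15))  # [(2, 17, 15)]
```

The two functions mirror the two directions described in the abstract: from database sequences to expected band sizes, and from an observed band size back to candidate sequences. A fuller version would also filter on the selective nucleotides of the primer extensions.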
Abstract:
Of the approximately 380 families of angiosperms, representatives of only 10 are known to form symbiotic associations with nitrogen-fixing bacteria in root nodules. The morphologically based classification schemes proposed by taxonomists suggest that many of these 10 families of plants are only distantly related, engendering the hypothesis that the capacity to fix nitrogen evolved independently several, if not many, times. This has in turn influenced attitudes toward the likelihood of transferring genes responsible for symbiotic nitrogen fixation to crop species lacking this ability. Phylogenetic analysis of DNA sequences for the chloroplast gene rbcL indicates, however, that representatives of all 10 families with nitrogen-fixing symbioses occur together, with several families lacking this association, in a single clade. This study therefore indicates that only one lineage of closely related taxa achieved the underlying genetic architecture necessary for symbiotic nitrogen fixation in root nodules.
Abstract:
Insects in the order Plecoptera (stoneflies) use a form of two-dimensional aerodynamic locomotion called surface skimming to move across water surfaces. Because their weight is supported by water, skimmers can achieve effective aerodynamic locomotion even with small wings and weak flight muscles. These mechanical features stimulated the hypothesis that surface skimming may have been an intermediate stage in the evolution of insect flight, which has perhaps been retained in certain modern stoneflies. Here we present a phylogeny of Plecoptera based on nucleotide sequence data from the small subunit rRNA (18S) gene. By mapping locomotor behavior and wing structural data onto the phylogeny, we distinguish between the competing hypotheses that skimming is a retained ancestral trait or, alternatively, a relatively recent loss of flight. Our results show that basal stoneflies are surface skimmers, and that various forms of surface skimming are distributed widely across the plecopteran phylogeny. Stonefly wings show evolutionary trends in the number of cross veins and the thickness of the cuticle of the longitudinal veins that are consistent with elaboration and diversification of flight-related traits. These data support the hypothesis that the first stoneflies were surface skimmers, and that wing structures important for aerial flight have become elaborated and more diverse during the radiation of modern stoneflies.
Abstract:
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT.
Abstract:
Woody Sonchus and five related genera (Babcockia, Taeckholmia, Sventenia, Lactucosonchus, and Prenanthes) of the Macaronesian islands have been regarded as an outstanding example of adaptive radiation in angiosperms. Sequences of the internal transcribed spacer (ITS) region of the nuclear rDNA were used to demonstrate that, despite the extensive morphological and ecological diversity of the plants, the entire alliance in insular Macaronesia has a common origin. The sequence data place Lactucosonchus as sister group to the remainder of the alliance and also indicate that four related genera are in turn sister groups to subg. Dendrosonchus and Taeckholmia. This implies that the woody members of Sonchus were derived from an ancestor similar to allied genera now present on the Canary Islands. It is also evident that the alliance probably occurred in the Canary Islands during the late Miocene or early Pliocene. A rapid radiation of major lineages in the alliance is consistent with an unresolved polytomy near the base and low ITS sequence divergence. Increase of woodiness is concordant with other insular endemics and refutes the relictual nature of woody Sonchus in the Macaronesian islands.
Abstract:
Amino acid sequencing by recombinant DNA technology, although dramatically useful, is subject to base reading errors, is indirect, and is insensitive to posttranslational processing. Mass spectrometry techniques can provide molecular weight data from even relatively large proteins for such cDNA sequences and can serve as a check of an enzyme's purity and sequence integrity. Multiply-charged ions from electrospray ionization can be dissociated to yield structural information by tandem mass spectrometry, providing a second method for gaining additional confidence in primary sequence confirmation. Here, accurate (±1 Da) molecular weight and molecular ion dissociation information for human muscle and brain creatine kinases has been obtained by electrospray ionization coupled with Fourier-transform mass spectrometry to help distinguish which of several published amino acid sequences for both enzymes are correct. The results herein are consistent with one published sequence for each isozyme, and with the heterogeneity indicated by isoelectric focusing being due to 1-Da deamidation changes. This approach appears generally useful for detailed sequence verification of recombinant proteins.
Abstract:
The Mycetozoa include the cellular (dictyostelid), acellular (myxogastrid), and protostelid slime molds. However, available molecular data are in disagreement on both the monophyly and phylogenetic position of the group. Ribosomal RNA trees show the myxogastrid and dictyostelid slime molds as unrelated early branching lineages, but actin and β-tubulin trees place them together as a single coherent (monophyletic) group, closely related to the animal–fungal clade. We have sequenced the elongation factor-1α genes from one member of each division of the Mycetozoa, including Dictyostelium discoideum, for which cDNA sequences were previously available. Phylogenetic analyses of these sequences strongly support a monophyletic Mycetozoa, with the myxogastrid and dictyostelid slime molds most closely related to each other. All phylogenetic methods used also place this coherent Mycetozoan assemblage as emerging among the multicellular eukaryotes, tentatively supported as more closely related to animals + fungi than are green plants. With our data there are now three proteins that consistently support a monophyletic Mycetozoa and at least four that place these taxa within the “crown” of the eukaryote tree. We suggest that ribosomal RNA data should be more closely examined with regard to these questions, and we emphasize the importance of developing multiple sequence data sets.
Abstract:
Molecular, sequence-based environmental surveys of microorganisms have revealed a large degree of previously uncharacterized diversity. However, nearly all studies of the human endogenous bacterial flora have relied on cultivation and biochemical characterization of the resident organisms. We used molecular methods to characterize the breadth of bacterial diversity within the human subgingival crevice by comparing 264 small subunit rDNA sequences from 21 clone libraries created with products amplified directly from subgingival plaque, with sequences obtained from bacteria that were cultivated from the same specimen, as well as with sequences available in public databases. The majority (52.5%) of the directly amplified 16S rRNA sequences were <99% identical to sequences within public databases. In contrast, only 21.4% of the sequences recovered from cultivated bacteria showed this degree of variability. The 16S rDNA sequences recovered by direct amplification were also more deeply divergent; 13.5% of the amplified sequences were more than 5% nonidentical to any known sequence, a level of dissimilarity that is often found between members of different genera. None of the cultivated sequences exhibited this degree of sequence dissimilarity. Finally, direct amplification of 16S rDNA yielded a more diverse view of the subgingival bacterial flora than did cultivation. Our data suggest that a significant proportion of the resident human bacterial flora remain poorly characterized, even within this well studied and familiar microbial environment.
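The two identity thresholds quoted above (<99% identical to any database sequence, and more than 5% nonidentical) can be applied to aligned sequences with a sketch like the one below. The abstract does not say how identity was computed, so the gap handling (columns with a gap in either sequence are skipped), the function names, and the class labels are assumptions made for illustration.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns, skipping columns with a gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def novelty_class(query, references):
    """Classify a query by its best identity to any reference sequence.

    Returns (label, best_identity) using the thresholds quoted in the abstract:
    "divergent" for more than 5% nonidentical, "novel" for <99% identical,
    and "known" otherwise.
    """
    best = max(percent_identity(query, ref) for ref in references)
    if best < 95.0:
        label = "divergent"
    elif best < 99.0:
        label = "novel"
    else:
        label = "known"
    return label, best


# Hypothetical 100-column alignment: the reference differs at two positions.
query = "A" * 100
reference = "C" * 2 + "A" * 98
print(novelty_class(query, [reference]))  # ('novel', 98.0)
```

Tallying the labels over a clone library would give proportions comparable in form to the 52.5% and 13.5% figures reported, although the actual values depend on the alignment and reference set used.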