66 results for Whole Genome Sequences


Relevance: 30.00%

Abstract:

While genome sequencing projects are advancing rapidly, EST sequencing and analysis remains a primary research tool for the identification and categorization of gene sequences in a wide variety of species and an important resource for annotation of genomic sequence. The TIGR Gene Indices (http://www.tigr.org/tdb/tgi.shtml) are a collection of species-specific databases that use a highly refined protocol to analyze EST sequences in an attempt to identify the genes represented by that data and to provide additional information regarding those genes. Gene Indices are constructed by first clustering, then assembling EST and annotated gene sequences from GenBank for the targeted species. This process produces a set of unique, high-fidelity virtual transcripts, or Tentative Consensus (TC) sequences. The TC sequences can be used to provide putative genes with functional annotation, to link the transcripts to mapping and genomic sequence data, to provide links between orthologous and paralogous genes and as a resource for comparative sequence analysis.
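The "cluster first, then assemble" step can be illustrated with a toy sketch. This is not the TIGR protocol: it simply links ESTs that share a minimum number of k-mers and groups them by single linkage. The function names and the `k` and `min_shared` parameters are assumptions chosen for illustration, and assembly of each cluster into a consensus is left out.

```python
from itertools import combinations

def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def cluster_ests(ests, k=8, min_shared=3):
    """Single-linkage clustering: link two ESTs sharing >= min_shared k-mers."""
    parent = {name: name for name in ests}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    profiles = {name: kmers(seq, k) for name, seq in ests.items()}
    for a, b in combinations(ests, 2):
        if len(profiles[a] & profiles[b]) >= min_shared:
            parent[find(a)] = find(b)

    clusters = {}
    for name in ests:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())
```

Each returned cluster would then be handed to an assembler to produce one consensus sequence; a real pipeline would also need to handle reverse-complemented reads and sequencing errors, which this sketch ignores.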

Relevance: 30.00%

Abstract:

VIDA is a new virus database that organizes open reading frames (ORFs) from partial and complete genomic sequences from animal viruses. Currently VIDA includes all sequences from GenBank for Herpesviridae, Coronaviridae and Arteriviridae. The ORFs are organized into homologous protein families, which are identified on the basis of sequence similarity relationships. Conserved sequence regions of potential functional importance are identified and can be retrieved as sequence alignments. We use a controlled taxonomical and functional classification for all the proteins and protein families in the database. When available, protein structures that are related to the families have also been included. The database is available for online search and sequence information retrieval at http://www.biochem.ucl.ac.uk/bsm/virus_database/VIDA.html.
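As a rough illustration of what extracting ORFs from a genomic sequence involves, the sketch below scans all six reading frames for ATG-to-stop stretches. It is a minimal example under simple assumptions (standard start and stop codons, uppercase ACGT input) and is not the procedure used to build VIDA; `find_orfs` and `min_codons` are hypothetical names.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_orfs(seq, min_codons=3):
    """Return (strand, frame, orf) for ATG...stop ORFs in all six frames."""
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # open a candidate ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, s[start:i + 3]))
                    start = None                   # close it at the stop codon
    return orfs
```

The length threshold counts the stop codon as one of the codons; ORFs that run off the end of a partial sequence without a stop are not reported here.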

Relevance: 30.00%

Abstract:

The Medicago Genome Initiative (MGI) is a database of EST sequences of the model legume Medicago truncatula. The database is available to the public and has resulted from a collaborative research effort between the Samuel Roberts Noble Foundation and the National Center for Genome Resources to investigate the genome of M. truncatula. MGI is part of the greater integrated Medicago functional genomics program at the Noble Foundation (http://www.noble.org), which is taking a global approach in studying the genetic and biochemical events associated with the growth, development and environmental interactions of this model legume. Our approach will include: large-scale EST sequencing, gene expression profiling, the generation of M. truncatula activation-tagged and promoter trap insertion mutants, high-throughput metabolic profiling, and proteome studies. These multidisciplinary information pools will be interfaced with one another to provide scientists with an integrated, holistic set of tools to address fundamental questions pertaining to legume biology. The public interface to the MGI database can be accessed at http://www.ncgr.org/research/mgi.

Relevance: 30.00%

Abstract:

The Homeodomain Resource is an annotated collection of non-redundant protein sequences, three-dimensional structures and genomic information for the homeodomain protein family. Release 3.0 contains 795 full-length homeodomain-containing sequences, 32 experimentally-derived structures and 143 homeobox loci implicated in human genetic disorders. Entries are fully hyperlinked to facilitate easy retrieval of the original records from source databases. A simple search engine with a graphical user interface is provided to query the component databases and assemble customized data sets. A new feature for this release is the addition of DNA recognition sites for all human homeodomain proteins described in the literature. The Homeodomain Resource is freely available through the World Wide Web at http://genome.nhgri.nih.gov/homeodomain.

Relevance: 30.00%

Abstract:

High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public database. The large volume, high degree of fragmentation and lack of gene structure annotations prevent efficient and effective searches of HTG and EST data for protein sequence homologies by standard search methods. Here, we briefly describe three newly developed resources that should make discovery of interesting genes in these sequence classes easier in the future, especially to biologists not having access to a powerful local bioinformatics environment. trEST and trGEN are regularly regenerated databases of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Hits is a web-based data retrieval and analysis system providing access to precomputed matches between protein sequences (including sequences from trEST and trGEN) and patterns and profiles from Prosite and Pfam. The three resources can be accessed via the Hits home page (http://hits.isb-sib.ch).
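To give a sense of what a match between a protein sequence and a Prosite pattern means in practice, the sketch below converts a pattern written in PROSITE syntax into a Python regular expression. It covers only the common elements (`x`, `[...]`, `{...}`, repeat counts, `<` and `>` anchors) and is an illustration, not the matching engine behind Hits; `prosite_to_regex` is a hypothetical helper name.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regex string."""
    pattern = pattern.rstrip(".")                  # patterns end with a period
    parts = []
    for element in pattern.split("-"):
        anchored_start = element.startswith("<")   # must match at N-terminus
        anchored_end = element.endswith(">")       # must match at C-terminus
        element = element.strip("<>")
        # Separate the residue part from an optional (n) or (n,m) repeat.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.groups()
        if core == "x":
            core = "."                             # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"         # any residue except these
        if lo:
            core += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(("^" if anchored_start else "")
                     + core
                     + ("$" if anchored_end else ""))
    return "".join(parts)
```

For example, a zinc-finger-like pattern such as `C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H.` becomes a regex that can be applied with `re.search` to each predicted protein. Profile matches (Pfam, Prosite profiles) need a scoring model and cannot be expressed this way.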

Relevance: 30.00%

Abstract:

The product of the herpes simplex virus type 1 UL28 gene is essential for cleavage of concatemeric viral DNA into genome-length units and packaging of this DNA into viral procapsids. To address the role of UL28 in this process, purified UL28 protein was assayed for the ability to recognize conserved herpesvirus DNA packaging sequences. We report that DNA fragments containing the pac1 DNA packaging motif can be induced by heat treatment to adopt novel DNA conformations that migrate faster than the corresponding duplex in nondenaturing gels. Surprisingly, these novel DNA structures are high-affinity substrates for UL28 protein binding, whereas double-stranded DNA of identical sequence composition is not recognized by UL28 protein. We demonstrate that only one strand of the pac1 motif is responsible for the formation of novel DNA structures that are bound tightly and specifically by UL28 protein. To determine the relevance of the observed UL28 protein–pac1 interaction to the cleavage and packaging process, we have analyzed the binding affinity of UL28 protein for pac1 mutants previously shown to be deficient in cleavage and packaging in vivo. Each of the pac1 mutants exhibited a decrease in DNA binding by UL28 protein that correlated directly with the reported reduction in cleavage and packaging efficiency, thereby supporting a role for the UL28 protein–pac1 interaction in vivo. These data therefore suggest that the formation of novel DNA structures by the pac1 motif confers added specificity on recognition of DNA packaging sequences by the UL28-encoded component of the herpesvirus cleavage and packaging machinery.

Relevance: 30.00%

Abstract:

The evolution of novelty in tightly integrated biological systems, such as hormones and their receptors, seems to challenge the theory of natural selection: it has not been clear how a new function for any one part (such as a ligand) can be selected for unless the other members of the system (e.g., a receptor) are already present. Here I show—based on identification and phylogenetic analysis of steroid receptors in basal vertebrates and reconstruction of the sequences and functional attributes of ancestral proteins—that the first steroid receptor was an estrogen receptor, followed by a progesterone receptor. Genome mapping and phylogenetic analyses indicate that the full complement of mammalian steroid receptors evolved from these ancient receptors by two large-scale genome expansions, one before the advent of jawed vertebrates and one after. Specific regulation of physiological processes by androgens and corticoids are relatively recent innovations that emerged after these duplications. These findings support a model of ligand exploitation in which the terminal ligand in a biosynthetic pathway is the first for which a receptor evolves; selection for this hormone also selects for the synthesis of intermediates despite the absence of receptors, and duplicated receptors then evolve affinity for these substances. In this way, novel hormone-receptor pairs are created, and an integrated system of increasing complexity elaborated. This model suggests that ligands for some “orphan” receptors may be found among intermediates in the synthesis of ligands for phylogenetically related receptors.

Relevance: 30.00%

Abstract:

The present paper summarizes future needs in information and tools, technology, infrastructure, training, funding, and bioinformatics, to provide the genomic knowledge and tools for breeding and biotechnological goals in maize. The National Corn Genome Initiative (NCGI) has developed through actions taken by the National Corn Growers Association (NCGA) and participation in a planning process by institutions, companies, and organizations. At the web address for the NCGI, http://www.inverizon.com/ncgi, are detailed analyses of goals and costs, impact and value, and strategy and approaches. The NCGI has also produced an informative and perceptive video suitable for public groups or schools, about agricultural contributions to life and the place of maize in these contributions. High potential can be expected from cross-application of knowledge obtained in maize and other cereals. Development of information and tools for all crops, whether monocots or dicots, will be gained through an initiative, and each crop will be positioned to advance with cost-effective parallels, especially for expressed sequences, markers, and physical mapping.

Relevance: 30.00%

Abstract:

We present a method for discovering conserved sequence motifs from families of aligned protein sequences. The method has been implemented as a computer program called emotif (http://motif.stanford.edu/emotif). Given an aligned set of protein sequences, emotif generates a set of motifs with a wide range of specificities and sensitivities. emotif also can generate motifs that describe possible subfamilies of a protein superfamily. A disjunction of such motifs often can represent the entire superfamily with high specificity and sensitivity. We have used emotif to generate sets of motifs from all 7,000 protein alignments in the blocks and prints databases. The resulting database, called identify (http://motif.stanford.edu/identify), contains more than 50,000 motifs. For each alignment, the database contains several motifs with probabilities of matching a false positive that range from 10^-10 to 10^-5. Highly specific motifs are well suited for searching entire proteomes, while generating very few false predictions. identify assigns biological functions to 25–30% of all proteins encoded by the Saccharomyces cerevisiae genome and by several bacterial genomes. In particular, identify assigned functions to 172 proteins of unknown function in the yeast genome.
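The idea of deriving a motif from an alignment can be sketched with a much simpler rule than the one emotif uses: keep a column as a residue group when it contains few distinct residues, and replace it with a wildcard otherwise. The function name and the `max_group` threshold are assumptions for illustration; emotif itself chooses among many candidate motifs to trade specificity against sensitivity, which this sketch does not attempt.

```python
def column_motif(alignment, max_group=3):
    """Build a regex-style motif from equal-length aligned sequences."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    motif = []
    for column in zip(*alignment):
        residues = sorted(set(column))
        if "-" in residues or len(residues) > max_group:
            motif.append(".")                 # gap or too variable: wildcard
        elif len(residues) == 1:
            motif.append(residues[0])         # fully conserved position
        else:
            motif.append("[" + "".join(residues) + "]")
    return "".join(motif)
```

Lowering `max_group` gives a more permissive motif (more wildcards, higher sensitivity); raising it gives larger residue groups at variable positions. The result can be used directly with Python's `re` module to scan other sequences.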

Relevance: 30.00%

Abstract:

Simple phylogenetic tests were applied to a large data set of nucleotide sequences from two nuclear genes and a region of the mitochondrial genome of Trypanosoma cruzi, the agent of Chagas' disease. Incongruent gene genealogies manifest genetic exchange among distantly related lineages of T. cruzi. Two widely distributed isoenzyme types of T. cruzi are hybrids, their genetic composition being the likely result of genetic exchange between two distantly related lineages. The data show that the reference strain for the T. cruzi genome project (CL Brener) is a hybrid. Well-supported gene genealogies show that mitochondrial and nuclear gene sequences from T. cruzi cluster, respectively, in three or four distinct clades that do not fully correspond to the two previously defined major lineages of T. cruzi. There is clear genetic differentiation among the major groups of sequences, but genetic diversity within each major group is low. We estimate that the major extant lineages of T. cruzi have diverged during the Miocene or early Pliocene (3–16 million years ago).

Relevance: 30.00%

Abstract:

We have developed a system for generation of infectious bursal disease virus (IBDV), a segmented double-stranded RNA virus of the Birnaviridae family, with the use of synthetic transcripts derived from cloned cDNA. Independent full-length cDNA clones were constructed that contained the entire coding and noncoding regions of RNA segments A and B of two distinguishable IBDV strains of serotype I. Segment A encodes all of the structural (VP2, VP4, and VP3) and nonstructural (VP5) proteins, whereas segment B encodes the RNA-dependent RNA polymerase (VP1). Synthetic RNAs of both segments were produced by in vitro transcription of linearized plasmids with T7 RNA polymerase. Transfection of Vero cells with combined plus-sense transcripts of both segments generated infectious virus as early as 36 hr after transfection. The infectivity and specificity of the recovered chimeric virus were ascertained by the appearance of cytopathic effect in chicken embryo cells, by immunofluorescence staining of infected Vero cells with rabbit anti-IBDV serum, and by nucleotide sequence analysis of the recovered virus, respectively. In addition, transfectant viruses containing genetically tagged sequences in either segment A or segment B of IBDV were generated to confirm the feasibility of this system. The development of a reverse genetics system for double-stranded RNA viruses will greatly facilitate studies of the regulation of viral gene expression, pathogenesis, and design of a new generation of live vaccines.

Relevance: 30.00%

Abstract:

Microsatellites are tandem repeat sequences abundant in the genomes of higher eukaryotes and hitherto considered as "junk DNA." Analysis of a human genome representative data base (2.84 Mb) reveals a distinct juxtaposition of A-rich microsatellites and retroposons and suggests their coevolution. The analysis implies that most microsatellites were generated by a 3'-extension of retrotranscripts, similar to mRNA polyadenylylation, and that they serve in turn as "retroposition navigators," directing the retroposons via homology-driven integration into defined sites. Thus, they became instrumental in the preservation and extension of primordial genomic patterns. A role is assigned to these reiterating A-rich loci in the higher-order organization of the chromatin. The disease-associated triplet repeats are mostly found in coding regions and do not show an association with retroposons, constituting a unique set within the family of microsatellite sequences.
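For readers unfamiliar with how tandem repeats are located in a sequence, here is a minimal sketch that finds perfect repeats of short units with a back-referencing regular expression. It is not the analysis performed in the study above; the function name and the `max_unit` and `min_repeats` defaults are assumptions, and imperfect repeats are not detected.

```python
import re

def find_microsatellites(seq, max_unit=6, min_repeats=4):
    """Return (start, unit, copies) for perfect tandem repeats of short units."""
    # The lazy quantifier makes the regex try the shortest repeat unit first,
    # so a run of A's is reported with unit "A" rather than "AA".
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits
```

Filtering the hits for units made up mostly of `A` would isolate the A-rich class discussed in the abstract; the positions could then be compared with annotated retroposon coordinates.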

Relevance: 30.00%

Abstract:

Human endogenous retroviruses (HERVs) are very likely footprints of ancient germ-cell infections. HERV sequences encompass about 1% of the human genome. HERVs have retained the potential of other retroelements to retrotranspose and thus to change genomic structure and function. The genomes of almost all HERV families are highly defective. Recent progress has allowed the identification of the biologically most active family, HTDV/HERV-K, which codes for viral proteins and particles and is highly expressed in germ-cell tumors. The demonstrable and potential roles of HTDV/HERV-K as well as of other human elements in disease and in maintaining genome plasticity are illustrated.

Relevance: 30.00%

Abstract:

A DNA sequence, TPE1, representing the internal domain of a Ty1-copia retroelement, was isolated from genomic DNA of Pinus elliottii Engelm. var. elliottii (slash pine). Genomic Southern analysis showed that this sequence, carrying partial reverse transcriptase and integrase gene sequences, is highly amplified within the genome of slash pine and part of a dispersed element >4.8 kbp. Fluorescent in situ hybridization to metaphase chromosomes shows that the element is relatively uniformly dispersed over all 12 chromosome pairs and is highly abundant in the genome. It is largely excluded from centromeric regions and intercalary chromosomal sites representing the 18S-5.8S-25S rRNA genes. Southern hybridization with specific DNA probes for the reverse transcriptase gene shows that TPE1 represents a large subgroup of heterogeneous Ty1-copia retrotransposons in Pinus species. Because no TPE1 transcription could be detected, it is most likely an inactive element, at least in needle tissue. Further evidence for inactivity was found in recombinant reverse transcriptase and integrase sequences. The distribution of TPE1 within different gymnosperms that contain Ty1-copia group retrotransposons, as shown by a PCR assay, was investigated by Southern hybridization. The TPE1 family is highly amplified and conserved in all Pinus species analyzed, showing a similar genomic organization in the three- and five-needle pine species investigated. It is also present in spruce, bald cypress (swamp cypress), and in ginkgo but in fewer copies and a different genomic organization.

Relevance: 30.00%

Abstract:

An integrated map of the genome of the tubercle bacillus, Mycobacterium tuberculosis, was constructed by using a twin-pronged approach. Pulsed-field gel electrophoretic analysis enabled cleavage sites for Asn I and Dra I to be positioned on the 4.4-Mb circular chromosome, while, in parallel, clones from two cosmid libraries were ordered into contigs by means of fingerprinting and hybridization mapping. The resultant contig map was readily correlated with the physical map of the genome via the landmarked restriction sites. Over 165 genes and markers were localized on the integrated map, thus enabling comparisons with the leprosy bacillus, Mycobacterium leprae, to be undertaken. Mycobacterial genomes appear to have evolved as mosaic structures since extended segments with conserved gene order and organization are interspersed with different flanking regions. Repetitive sequences and insertion elements are highly abundant in M. tuberculosis, but the distribution of IS6110 is apparently nonrandom.