11 results for Distributed Databases
in National Center for Biotechnology Information - NCBI
Abstract:
To “control” a system is to make it behave (hopefully) according to our “wishes,” in a way compatible with safety and ethics, at the least possible cost. The systems considered here are distributed, i.e., governed (modeled) by partial differential equations (PDEs) of evolution. Our “wish” is to drive the system in a given time, by an adequate choice of the controls, from a given initial state to a given final state, which is the target. If this can be achieved (respectively, if we can reach any “neighborhood” of the target), the system, with the controls at our disposal, is exactly (respectively, approximately) controllable. A very general (and fuzzy) idea is that the more “unstable” (chaotic, turbulent) a system is, the “simpler,” or the “cheaper,” it is to achieve exact or approximate controllability. When the PDEs are the Navier–Stokes equations, this idea leads to conjectures, which are presented and explained. Recent results, reported in this expository paper, essentially prove the conjectures in two space dimensions. In three space dimensions, a large number of new questions arise; some new results support (without proving) the conjectures, such as generic controllability and cases where the cost of control decreases as the instability increases. Short comments are made on models arising in climatology, thermoelasticity, non-Newtonian fluids, and molecular chemistry. The Introduction of the paper and the first part of all sections are not technical. Many open questions are mentioned in the text.
Abstract:
Multiple copies of the hexamer TGCATG have been shown to regulate fibronectin pre-mRNA alternative splicing. GCATG repeats also are clustered near the regulated calcitonin-specific 3′ splice site in the rat calcitonin/CGRP gene. Specific mutagenesis of these repeats in calcitonin/CGRP pre-mRNA resulted in the loss of calcitonin-specific splicing, suggesting that the native repeats act to enhance alternative exon inclusion. Mutation of subsets of these elements implies that alternative splicing requires a minimum of two repeats, and that the combination of one intronic and one exonic repeat is necessary for optimal cell-specific splicing. However, multimerized intronic repeats inhibited calcitonin-specific splicing in both the wild-type context and in a transcript lacking endogenous repeats. These results suggest that both the number and distribution of repeats may be important features for the regulation of tissue-specific alternative splicing. Further, RNA containing a single repeat bound cell-specific protein complexes, but tissue-specific differences in protein binding were not detected by using multimerized repeats. Together, these data support a novel model for alternative splicing regulation that requires the cell-specific recognition of multiple, distributed sequence elements.
Abstract:
The positions of ≈4,800 individual miniature inverted-repeat transposable element (MITE)-like repeats from four families were mapped on the Caenorhabditis elegans chromosomes. These families represent 1–2% of the total sequence of the organism. The four MITE families (Cele1, Cele2, Cele14, and Cele42) displayed distinct chromosomal distribution profiles. For example, the Cele14 MITEs were observed clustering near the ends of the autosomes. In contrast, the Cele2 MITEs displayed an even distribution through the central autosome domains, with no evidence for clustering at the ends. Both the number of elements and the distribution patterns of each family were conserved on all five C. elegans autosomes. The distribution profiles indicate chromosomal polarity and suggest that the current genetic and physical maps of chromosomes II, III, and X are inverted with respect to the other chromosomes. The degree of conservation of both the number and distribution of these elements on the five autosomes suggests a role in defining specific chromosomal domains.
Abstract:
Hyaluronan (HA), a large glycosaminoglycan abundant in the extracellular matrix, is important in cell migration during embryonic development, cellular proliferation, and differentiation and has a structural role in connective tissues. The turnover of HA requires endoglycosidic breakdown by lysosomal hyaluronidase, and a congenital deficiency of hyaluronidase has been thought to be incompatible with life. However, a patient with a deficiency of serum hyaluronidase, now designated as mucopolysaccharidosis IX, was recently described. This patient had a surprisingly mild clinical phenotype, including notable periarticular soft tissue masses, mild short stature, an absence of neurological or visceral involvement, and histological and ultrastructural evidence of a lysosomal storage disease. To determine the molecular basis of mucopolysaccharidosis IX, we analyzed two candidate genes tandemly distributed on human chromosome 3p21.3 and encoding proteins with homology to a sperm protein with hyaluronidase activity. These genes, HYAL1 and HYAL2, encode two distinct lysosomal hyaluronidases with different substrate specificities. We identified two mutations in the HYAL1 alleles of the patient, a 1412G → A mutation that introduces a nonconservative amino acid substitution (Glu268Lys) in a putative active site residue and a complex intragenic rearrangement, 1361del37ins14, that results in a premature termination codon. We further show that these two hyaluronidase genes, as well as a third recently discovered adjacent hyaluronidase gene, HYAL3, have markedly different tissue expression patterns, consistent with differing roles in HA metabolism. These data provide an explanation for the unexpectedly mild phenotype in mucopolysaccharidosis IX and predict the existence of other hyaluronidase deficiency disorders.
Abstract:
Expressed sequence tags (ESTs) are randomly sequenced cDNA clones. Currently, nearly 3 million human and 2 million mouse ESTs provide valuable resources that enable researchers to investigate the products of gene expression. The EST databases have proven to be useful tools for detecting homologous genes, for exon mapping, revealing differential splicing, etc. With the increasing availability of large amounts of poorly characterised eukaryotic (notably human) genomic sequence, ESTs have now become a vital tool for gene identification, sometimes yielding the only unambiguous evidence for the existence of a gene expression product. However, BLAST-based Web servers available to the general user have not kept pace with these developments and do not provide appropriate tools for querying EST databases with large highly spliced genes, often spanning 50 000–100 000 bases or more. Here we describe Gene2EST (http://woody.embl-heidelberg.de/gene2est/), a server that brings together a set of tools enabling efficient retrieval of ESTs matching large DNA queries and their subsequent analysis. RepeatMasker is used to mask dispersed repetitive sequences (such as Alu elements) in the query, BLAST2 for searching EST databases and Artemis for graphical display of the findings. Gene2EST combines these components into a Web resource targeted at the researcher who wishes to study one or a few genes to a high level of detail.
Abstract:
The ARKdb genome databases provide comprehensive public repositories for genome mapping data from farmed species and other animals (http://www.thearkdb.org) providing a resource similar in function to that offered by GDB or MGD for human or mouse genome mapping data, respectively. Because we have attempted to build a generic mapping database, the system has wide utility, particularly for those species for which development of a specific resource would be prohibitive. The ARKdb genome database model has been implemented for 10 species to date. These are pig, chicken, sheep, cattle, horse, deer, tilapia, cat, turkey and salmon. Access to the ARKdb databases is effected via the World Wide Web using the ARKdb browser and Anubis map viewer. The information stored includes details of loci, maps, experimental methods and the source references. Links to other information sources such as PubMed and EMBL/GenBank are provided. Responsibility for data entry and curation is shared amongst scientists active in genome research in the species of interest. Mirror sites in the United States are maintained in addition to the central genome server at Roslin.
Abstract:
There is no control over the information provided with sequences when they are deposited in the sequence databases. Consequently, mistakes can seed the incorrect annotation of other sequences. Grouping genes into families and applying controlled annotation overcomes the problems of incorrect annotation associated with individual sequences. Two databases (http://www.mendel.ac.uk) were created to apply controlled annotation to plant genes and plant ESTs. Mendel-GFDb is a database of plant protein (gene) families based on gapped-BLAST analysis of all sequences in the SWISS-PROT family of databases. Sequences are aligned (ClustalW), and identical and similar residues are shaded. The families are visually curated to ensure that one or more criteria, for example overall relatedness and/or domain similarity, relate all sequences within a family. Sequence families are assigned a ‘Gene Family Number’, and a unified description is developed that best describes the family and its members. If authority exists, the gene family is assigned a ‘Gene Family Name’. This information is placed in Mendel-GFDb. Mendel-ESTS is primarily a database of plant ESTs, which have been compared to Mendel-GFDb, completely sequenced genomes and domain databases. This approach associates ESTs with individual sequences and with the controlled annotation of gene families and protein domains, and the resulting information is placed in Mendel-ESTS. The controlled annotation applied to genes and ESTs provides a basis from which a plant transcription database can be developed.
Abstract:
High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public database. The large volume, high degree of fragmentation and lack of gene structure annotations prevent efficient and effective searches of HTG and EST data for protein sequence homologies by standard search methods. Here, we briefly describe three newly developed resources that should make discovery of interesting genes in these sequence classes easier in the future, especially to biologists not having access to a powerful local bioinformatics environment. trEST and trGEN are regularly regenerated databases of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Hits is a web-based data retrieval and analysis system providing access to precomputed matches between protein sequences (including sequences from trEST and trGEN) and patterns and profiles from Prosite and Pfam. The three resources can be accessed via the Hits home page (http://hits.isb-sib.ch).
Abstract:
The corticotropin-releasing factor (CRF) family of neuropeptides includes the mammalian peptides CRF, urocortin, and urocortin II, as well as piscine urotensin I and frog sauvagine. The mammalian peptides signal through two G protein-coupled receptor types to modulate endocrine, autonomic, and behavioral responses to stress, as well as a range of peripheral (cardiovascular, gastrointestinal, and immune) activities. The three previously known ligands are differentially distributed anatomically and have distinct specificities for the two major receptor types. Here we describe the characterization of an additional CRF-related peptide, urocortin III, in the human and mouse. In searching the public human genome databases we found a partial expressed sequence tagged (EST) clone with significant sequence identity to mammalian and fish urocortin-related peptides. By using primers based on the human EST sequence, a full-length human clone was isolated from genomic DNA that encodes a protein that includes a predicted putative 38-aa peptide structurally related to other known family members. With a human probe, we then cloned the mouse ortholog from a genomic library. Human and mouse urocortin III share 90% identity in the 38-aa putative mature peptide. In the peptide coding region, both human and mouse urocortin III are 76% identical to pufferfish urocortin-related peptide and more distantly related to urocortin II, CRF, and urocortin from other mammalian species. Mouse urocortin III mRNA expression is found in areas of the brain including the hypothalamus, amygdala, and brainstem, but is not evident in the cerebellum, pituitary, or cerebral cortex; it is also expressed peripherally in small intestine and skin. Urocortin III is selective for type 2 CRF receptors and thus represents another potential endogenous ligand for these receptors.
Abstract:
The Internet has created new opportunities for librarians to present literature search results to clinicians. In order to take full advantage of these opportunities, libraries need to create locally maintained bibliographic databases. A simple method of creating a local bibliographic database and publishing it on the Web is described. The method uses off-the-shelf software and requires minimal programming. A hedge search strategy for outcome studies of clinical process interventions is created, and Ovid is used to search MEDLINE. The search results are saved and imported into EndNote libraries. The citations are modified, exported to a Microsoft Access database, and published on the Web. Clinicians can use a Web browser to search the database. The bibliographic database contains 13,803 MEDLINE citations of outcome studies. Most searches take between four and ten seconds and retrieve between ten and 100 citations. The entire cost of the software is under $900. Locally maintained bibliographic databases can be created easily and inexpensively. They significantly extend the evidence-based health care services that libraries can offer to clinicians.
Abstract:
Structurally neighboring residues are categorized according to their separation in the primary sequence as proximal (1–4 positions apart) and otherwise distal, which in turn is divided into near (5–20 positions), far (21–50 positions), very far (>50 positions), and interchain (from different chains of the same structure). These categories describe the linear distance histogram (LDH) for three-dimensional neighboring residue types. Among the main results are the following: (i) Nearest-neighbor hydrophobic residues tend to be increasingly distally separated in the linear sequence, thus most often connecting distinct secondary structure units. (ii) The LDHs of oppositely charged nearest-neighbors emphasize proximal positions with a subsidiary maximum for very far positions. (iii) Cysteine-cysteine structural interactions rarely involve proximal positions. (iv) The greatest numbers of interchain specific nearest-neighbors in protein structures are composed of oppositely charged residues. (v) The largest fraction of side-chain neighboring residues from beta-strands involves near positions, emphasizing associations between consecutive strands. (vi) Exposed residue pairs are predominantly located in proximal linear positions, while buried residue pairs principally correspond to far or very far distal positions. The results are principally invariant to protein sizes, amino acid usages, linear distance normalizations, and over- and underrepresentations among nearest-neighbor types. Interpretations and hypotheses concerning the LDHs, particularly those of hydrophobic and charged pairings, are discussed with respect to protein stability and functionality. The pronounced occurrence of oppositely charged interchain contacts is consistent with many observations on protein complexes where multichain stabilization is facilitated by electrostatic interactions.
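The separation categories defined in this abstract can be illustrated with a short sketch. It assumes residue pairs have already been identified as structural neighbors and only bins them by sequence separation; the names `ldh_category` and `ldh_histogram`, and the tuple layout of the input, are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical input layout: (chain_i, position_i, chain_j, position_j)
ResiduePair = Tuple[str, int, str, int]


def ldh_category(pos_i: int, pos_j: int,
                 chain_i: str = "A", chain_j: str = "A") -> str:
    """Bin one structurally neighboring pair by its sequence separation."""
    if chain_i != chain_j:
        return "interchain"          # residues from different chains
    sep = abs(pos_i - pos_j)
    if sep == 0:
        raise ValueError("a residue cannot be paired with itself")
    if sep <= 4:
        return "proximal"            # 1-4 positions apart
    if sep <= 20:
        return "near"                # 5-20 positions apart
    if sep <= 50:
        return "far"                 # 21-50 positions apart
    return "very far"                # more than 50 positions apart


def ldh_histogram(pairs: Iterable[ResiduePair]) -> Counter:
    """Count neighboring pairs per category (a raw, unnormalized LDH)."""
    return Counter(
        ldh_category(pos_i, pos_j, chain_i, chain_j)
        for chain_i, pos_i, chain_j, pos_j in pairs
    )


if __name__ == "__main__":
    # Made-up pairs, for illustration only.
    example_pairs = [
        ("A", 10, "A", 12),
        ("A", 10, "A", 25),
        ("A", 3, "A", 40),
        ("A", 7, "A", 120),
        ("A", 15, "B", 15),
    ]
    print(ldh_histogram(example_pairs))
```

The counts here are raw. The abstract mentions linear distance normalizations, which this sketch does not attempt.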