992 results for Database application
Abstract:
The present data set provides a tab-separated text file compressed in a zip archive. The file includes metadata for each TaraOceans V9 rDNA metabarcode, with the following fields: md5sum = unique identifier; lineage = taxonomic path associated with the metabarcode; pid = % identity to the closest reference barcode from V9_PR2; sequence = nucleotide sequence of the metabarcode; refs = identity of the best-hit reference sequence(s); TARA_xxx = number of occurrences of this barcode in each of the 334 samples; totab = total abundance of the barcode; cid = identifier of the OTU to which the barcode belongs; and taxogroup = high-level taxonomic assignment of this barcode. The file also includes three categories of functional annotations: (1) Chloroplast: yes, presence of permanent chloroplast; no, absence of permanent chloroplast; NA, undetermined. (2) Symbiont (small partner): parasite, the species is a parasite; commensal, the species is a commensal; mutualist, the species is a mutualist symbiont, most often a microalgal taxon involved in photosymbiosis; no, the species is not involved in a symbiosis as small partner; NA, undetermined. (3) Symbiont (host): photo, the host species relies on a mutualistic microalgal photosymbiont to survive (obligatory photosymbiosis); photo_falc, same as photo, but facultative relationship; photo_klep, the host species maintains chloroplasts from microalgal prey to survive; photo_klep_falc, same as photo_klep, but facultative; Nfix, the host species must interact with a mutualistic symbiont providing N2 fixation to survive; Nfix_falc, same as Nfix, but facultative; no, the species is not involved in any mutualistic symbioses; NA, undetermined.
For example, the collodarian/Brandtodinium symbiosis is annotated: Chloroplast, "no"; Symbiont (small), "no"; Symbiont (host), "photo", for the collodarian host; and: Chloroplast, "yes"; Symbiont (small), "mutualist"; Symbiont (host), "no", for the dinoflagellate microalgal endosymbiont. The annotation columns take the following values: chloroplast = "yes", "no" or "NA"; symbiont.small = "parasite", "commensal", "mutualist", "no" or "NA"; symbiont.host = "photo", "photo_falc", "photo_klep", "Nfix", "no" or "NA"; benef = "Nfix", "no" or "NA"; trophism = "Metazoa", "heterotroph", "NA", "photosymbiosis" or "phototroph", according to the previous fields.
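As a rough illustration of how such a tab-separated table could be read, the following Python sketch parses a tiny in-memory excerpt and selects photosymbiotic hosts. The column names come from the description above; the two sample rows, the sample column names TARA_004 and TARA_007, and the helper names read_barcodes and photosymbiotic_hosts are all hypothetical and are not part of the data set.

```python
import csv
import io

# Hypothetical two-row excerpt: every value below is invented for illustration.
SAMPLE_TSV = (
    "md5sum\tlineage\tpid\tTARA_004\tTARA_007\ttotab\t"
    "chloroplast\tsymbiont.small\tsymbiont.host\n"
    "aaa111\tEukaryota|Rhizaria|Collodaria\t98.5\t10\t5\t15\tno\tno\tphoto\n"
    "bbb222\tEukaryota|Alveolata|Dinophyceae\t100.0\t3\t0\t3\tyes\tmutualist\tno\n"
)


def read_barcodes(handle):
    """Read the tab-separated table and convert the numeric columns."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        # Per-sample occurrence counts live in the TARA_xxx columns.
        sample_cols = [name for name in row if name.startswith("TARA_")]
        row["sample_counts"] = {name: int(row[name]) for name in sample_cols}
        row["pid"] = float(row["pid"])
        row["totab"] = int(row["totab"])
        rows.append(row)
    return rows


def photosymbiotic_hosts(rows):
    """Keep barcodes whose host annotation is photo, photo_falc, photo_klep, ..."""
    return [r for r in rows if r["symbiont.host"].startswith("photo")]


rows = read_barcodes(io.StringIO(SAMPLE_TSV))
hosts = photosymbiotic_hosts(rows)
for r in hosts:
    print(r["md5sum"], r["lineage"], r["totab"])
```

For the real file, the same reader could be pointed at the extracted text file opened with newline="" instead of the in-memory string.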
Abstract:
This is a 20-year-long database of GPS data collected by geodetic surveys carried out over the seismically and volcanically active eastern Sicily, for a total of more than 6300 measurements. Data have been converted into the international ASCII compressed RINEX standard so that they can be imported and processed by any GPS analysis software. The database is provided with explorer software for navigating the dataset by spatial (GIS) and temporal queries.
Abstract:
Narcolepsy with cataplexy is a rare disease with an estimated prevalence of 0.02% in European populations. Narcolepsy shares many features of rare disorders, in particular the lack of awareness of the disease with serious consequences for healthcare supply. Similar to other rare diseases, only a few European countries have registered narcolepsy cases in databases of the International Classification of Diseases or in registries of the European health authorities. A promising approach to identify disease-specific adverse health effects and needs in healthcare delivery in the field of rare diseases is to establish a distributed expert network. A first and important step is to create a database that allows collection, storage and dissemination of data on narcolepsy in a comprehensive and systematic way. Here, the first prospective web-based European narcolepsy database hosted by the European Narcolepsy Network is introduced. The database structure, standardization of data acquisition and quality control procedures are described, and an overview is provided of the first 1079 patients from 18 European specialized centres. Owing to its standardization, this continuously growing data pool promises better insight into many unsolved aspects of narcolepsy and related disorders, including clear phenotype characterization of subtypes of narcolepsy, more precise epidemiological data and knowledge on the natural history of narcolepsy, expectations about treatment effects, and identification of post-marketing medication side-effects. It will also contribute to improving clinical trial designs and provide facilities to further develop phase III trials.
Abstract:
The central composite rotatable design (CCRD) was used to design an experimental program to model the effects of inlet pressure, feed density, and length and diameter of the inner vortex finder (IVF) on the operational performance of a 150-mm three-product cyclone. The ranges of values of the variables used in the design were: inlet pressure: 80-130 kPa; feed density: 30-60%; length of IVF below the OVF: 50-585 mm; diameter of IVF: 35-50 mm. A total of 30 tests were conducted, which is 51 less than that required for a three-level full factorial design. Because the model allows confident performance prediction by interpolation over the range of data in the database, it was used to construct response surface graphs to describe the effects of the variables on the performance of the three-product cyclone. To obtain a simple yet realistic model, it was refitted using only the variable terms that are significant at a confidence level of 90% or greater. Considering the selected operating variables, the resultant model is significant and predicts the experimental data well. (c) 2005 Elsevier B.V. All rights reserved.
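The run count quoted above can be reproduced with a short Python sketch of a generic central composite rotatable design in coded units. This is a textbook construction, not the authors' procedure: the function name ccrd_points is hypothetical, and the choice of 6 centre runs is an assumption inferred from 30 total tests minus 16 factorial and 8 axial points.

```python
from itertools import product


def ccrd_points(k, n_center):
    """Return (design, alpha) for a CCRD with k factors in coded units."""
    # Rotatability condition for a full two-level factorial core.
    alpha = (2 ** k) ** 0.25
    # 2^k factorial (corner) points at coded levels -1 and +1.
    factorial = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=k)]
    # 2k axial (star) points at distance alpha along each factor axis.
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(tuple(point))
    # Replicated centre points (count assumed, see lead-in).
    center = [tuple([0.0] * k)] * n_center
    return factorial + axial + center, alpha


# Four factors: inlet pressure, feed density, IVF length, IVF diameter.
design, alpha = ccrd_points(k=4, n_center=6)
full_factorial_runs = 3 ** 4  # three-level full factorial for comparison
print(len(design), alpha, full_factorial_runs - len(design))
```

With four factors this gives 16 + 8 + 6 = 30 runs, against 81 for the three-level full factorial, which matches the saving of 51 tests stated in the abstract.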
Abstract:
Wind tunnel measurements of drop size distributions from Micronair AU4000 and AU5000 rotary atomizers were collected to develop a database for model use. The measurements varied tank mix, flow rate, air speed, and blade angle conditions, which were correlated by multiple regressions (average R² = 0.995 for AU4000 and 0.988 for AU5000). This database replaces an outdated set of rotary atomizer data measured in the 1980s by the USDA Forest Service and fills in a gap in data measured in the 1990s by the Spray Drift Task Force. Since current USDA Forest Service spray projects rely on rotary atomizers, the creation of the database (and its multiple regression interpolation) satisfies a need seen for ten years.
Abstract:
Derivational morphology proposes meaningful connections between words and is largely unrepresented in lexical databases. This thesis presents a project to enrich a lexical database with morphological links and to evaluate their contribution to disambiguation. A lexical database with sense distinctions was required. WordNet was chosen because of its free availability and widespread use. Its suitability was assessed through critical evaluation with respect to specifications and criticisms, using a transparent, extensible model. The identification of serious shortcomings suggested a portable enrichment methodology, applicable to alternative resources. Although 40% of the most frequent words are prepositions, they have been largely ignored by computational linguists, so addition of prepositions was also required. The preferred approach to morphological enrichment was to infer relations from phenomena discovered algorithmically. Both existing databases and existing algorithms can capture regular morphological relations, but cannot capture exceptions correctly; neither provides any semantic information. Some morphological analysis algorithms are subject to the fallacy that morphological analysis can be performed simply by segmentation. Morphological rules, grounded in observation and etymology, govern associations between and attachment of suffixes and contribute to defining the meaning of morphological relationships. Specifying character substitutions circumvents the segmentation fallacy. Morphological rules are prone to undergeneration, minimised through a variable lexical validity requirement, and overgeneration, minimised by rule reformulation and restricting monosyllabic output. Rules take into account the morphology of ancestor languages through co-occurrences of morphological patterns. Multiple rules applicable to an input suffix need their precedence established.
The resistance of prefixations to segmentation has been addressed by identifying linking vowel exceptions and irregular prefixes. The automatic affix discovery algorithm applies heuristics to identify meaningful affixes and is combined with morphological rules into a hybrid model, fed only with empirical data, collected without supervision. Further algorithms apply the rules optimally to automatically pre-identified suffixes and break words into their component morphemes. To handle exceptions, stoplists were created in response to initial errors and fed back into the model through iterative development, leading to 100% precision, contestable only on lexicographic criteria. Stoplist length is minimised by special treatment of monosyllables and reformulation of rules. 96% of words and phrases are analysed. 218,802 directed derivational links have been encoded in the lexicon rather than the wordnet component of the model because the lexicon provides the optimal clustering of word senses. Both links and analyser are portable to an alternative lexicon. The evaluation uses the extended gloss overlaps disambiguation algorithm. The enriched model outperformed WordNet in terms of recall without loss of precision. Failure of all experiments to outperform disambiguation by frequency reflects on WordNet sense distinctions.
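The idea of suffix rules expressed as character substitutions, checked against a lexicon, can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the three rules, the four-word lexicon and the function name derive_base are hypothetical, list order stands in for rule precedence, and a plain membership test stands in for the lexical validity requirement.

```python
# Hypothetical rules as (suffix to remove, characters to substitute).
# List order stands in for rule precedence.
RULES = [("iness", "y"), ("ation", "ate"), ("ness", "")]

# Hypothetical toy lexicon used as the lexical validity check.
LEXICON = {"happy", "create", "kind", "nation"}


def derive_base(word, rules, lexicon):
    """Return (base, rule) for the first rule whose output is a known word."""
    for suffix, replacement in rules:
        if word.endswith(suffix) and len(word) > len(suffix):
            # Substitute characters rather than merely cutting the suffix off.
            candidate = word[: -len(suffix)] + replacement
            # Reject outputs that are not attested words (limits overgeneration).
            if candidate in lexicon:
                return candidate, (suffix, replacement)
    return None


for word in ("happiness", "creation", "kindness", "nation"):
    print(word, derive_base(word, RULES, LEXICON))
```

Pure segmentation of "happiness" at "ness" would leave the non-word "happi", whereas the substitution rule yields "happy"; "nation" matches the "ation" pattern but is rejected because "nate" is not in the lexicon.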
Abstract:
The continuing threat of infectious disease and future pandemics, coupled with the continuous increase of drug-resistant pathogens, makes the discovery of new and better vaccines imperative. For effective vaccine development, antigen discovery and validation is a prerequisite. The compilation of information concerning pathogens, virulence factors and antigenic epitopes has resulted in many useful databases. However, most such immunological databases focus almost exclusively on antigens where epitopes are known and ignore those for which epitope information is unavailable. We have compiled more than 500 antigens into the AntigenDB database, making use of the literature and other immunological resources. These antigens come from 44 important pathogenic species. In AntigenDB, a database entry contains information regarding the sequence, structure, origin, etc. of an antigen with additional information such as B and T-cell epitopes, MHC binding, function, gene expression and post-translational modifications, where available. AntigenDB also provides links to major internal and external databases. We shall update AntigenDB on a rolling basis, regularly adding antigens from other organisms and extra data analysis tools. AntigenDB is available freely at http://www.imtech.res.in/raghava/antigendb and its mirror site http://www.bic.uams.edu/raghava/antigendb.
Abstract:
Computer-Based Learning systems of one sort or another have been in existence for almost 20 years, but they have yet to achieve real credibility within Commerce, Industry or Education. A variety of reasons could be postulated for this, typically cost, complexity, inefficiency, inflexibility and tedium. Obviously different systems deserve different levels and types of criticism, but it still remains true that Computer-Based Learning (CBL) is falling significantly short of its potential. Experience of a small, but highly successful CBL system within a large, geographically distributed industry (the National Coal Board) prompted an investigation into currently available packages, the original intention being to purchase the most suitable software and run it on existing computer hardware, alongside existing software systems. It became apparent that none of the available CBL packages were suitable, and a decision was taken to develop an in-house Computer-Assisted Instruction system according to the following criteria: cheap to run; easy to author course material; easy to use; requires no computing knowledge to use (as either an author or student); efficient in the use of computer resources; has a comprehensive range of facilities at all levels. This thesis describes the initial investigation, resultant observations and the design, development and implementation of the SCHOOL system. One of the principal characteristics of SCHOOL is that it uses a hierarchical database structure for the storage of course material, thereby providing inherently a great deal of the power, flexibility and efficiency originally required. Trials using the SCHOOL system on IBM 303X series equipment are also detailed, along with proposed and current development work on what is essentially an operational CBL system within a large-scale Industrial environment.
Abstract:
The user interface of a database system normally includes a query language based on a particular data model. Typically data models provide primitives to define, manipulate and query databases. Often these primitives are designed to form self-contained query languages. This thesis describes a prototype implementation of a system which allows users to specify queries against the database in a query language whose primitives are not those provided by the actual model on which the database system is based, but those provided by a different data model. The implementation chosen is the Functional Query Language Front End (FQLFE). This uses the Daplex functional data model and query language. Using FQLFE, users can specify the underlying database (based on the relational model) in terms of Daplex. Queries against this specified view can then be made in Daplex. FQLFE transforms these queries into the query language (Quel) of the underlying target database system (Ingres). The automation of part of the Daplex function definition phase is also described and its implementation discussed.
Abstract:
An implementation of a Lexical Functional Grammar (LFG) natural language front-end to a database is presented, and its capabilities demonstrated by reference to a set of queries used in the Chat-80 system. The potential of LFG for such applications is explored. Other grammars previously used for this purpose are briefly reviewed and contrasted with LFG. The basic LFG formalism is fully described, both as to its syntax and semantics, and the deficiencies of the latter for database access application shown. Other current LFG implementations are reviewed and contrasted with the LFG implementation developed here specifically for database access. The implementation described here allows a natural language interface to a specific Prolog database to be produced from a set of grammar rule and lexical specifications in an LFG-like notation. In addition to this the interface system uses a simple database description to compile metadata about the database for later use in planning the execution of queries. Extensions to LFG's semantic component are shown to be necessary to produce a satisfactory functional analysis and semantic output for querying a database. A diverse set of natural language constructs are analysed using LFG and the derivation of Prolog queries from the F-structure output of LFG is illustrated. The functional description produced from LFG is proposed as sufficient for resolving many problems of quantification and attachment.
Abstract:
This thesis presents a new approach to designing large organizational databases. The approach emphasizes the need for a holistic approach to the design process. The development of the proposed approach was based on a comprehensive examination of the issues of relevance to the design and utilization of databases. Such issues include conceptual modelling, organization theory, and semantic theory. The conceptual modelling approach presented in this thesis is developed over three design stages, or model perspectives. In the semantic perspective, concept definitions were developed based on established semantic principles. Such definitions rely on meaning - provided by intension and extension - to determine intrinsic conceptual definitions. A tool, called meaning-based classification (MBC), is devised to classify concepts based on meaning. Concept classes are then integrated using concept definitions and a set of semantic relations which rely on concept content and form. In the application perspective, relationships are semantically defined according to the application environment. Relationship definitions include explicit relationship properties and constraints. The organization perspective introduces a new set of relations specifically developed to maintain conformity of conceptual abstractions with the nature of information abstractions implied by user requirements throughout the organization. Such relations are based on the stratification of work hierarchies, defined elsewhere in the thesis. Finally, an example of an application of the proposed approach is presented to illustrate the applicability and practicality of the modelling approach.