893 results for Database query languages
Abstract:
This paper addresses the problem of multilingual digital libraries. The motivation for such a digital library comes from the diversity of languages of Internet users as well as the diversity of content authors, from e-book authors to writers of courseware. The basic definitions of such a system, the specifications of its functionality and the identification of the items it holds are discussed. The impact of multilingualism on each of these aspects is presented. A case study of a multilingual digital library, the Maxwell System at PUC-Rio, is described in the last sections. Its main characteristics are described and the current status of its digital library is shown.
Abstract:
Selectome (http://selectome.unil.ch/) is a database of positive selection, based on a branch-site likelihood test. This model estimates the number of nonsynonymous substitutions (dN) and synonymous substitutions (dS) to evaluate the variation in selective pressure (dN/dS ratio) over branches and over sites. Since the original release of Selectome, we have benchmarked and implemented a thorough quality control procedure on multiple sequence alignments, aiming to provide minimum false-positive results. We have also improved the computational efficiency of the branch-site test implementation, allowing larger data sets and more frequent updates. Release 6 of Selectome includes all gene trees from Ensembl for Primates and Glires, as well as a large set of vertebrate gene trees. A total of 6810 gene trees have some evidence of positive selection. Finally, the web interface has been improved to be more responsive and to facilitate searches and browsing.
Abstract:
BACKGROUND: Many clinical studies are ultimately not fully published in peer-reviewed journals. Underreporting of clinical research is wasteful and can result in biased estimates of treatment effect or harm, leading to recommendations that are inappropriate or even dangerous. METHODS: We assembled a cohort of clinical studies approved 2000-2002 by the Research Ethics Committee of the University of Freiburg, Germany. Published full articles were searched in electronic databases and investigators contacted. Data on study characteristics were extracted from protocols and corresponding publications. We characterized the cohort, quantified its publication outcome and compared protocols and publications for selected aspects. RESULTS: Of 917 approved studies, 807 were started and 110 were not, either locally or as a whole. Of the started studies, 576 (71%) were completed according to protocol, 128 (16%) discontinued and 42 (5%) are still ongoing; for 61 (8%) there was no information about their course. We identified 782 full publications corresponding to 419 of the 807 initiated studies; the publication proportion was 52% (95% CI: 0.48-0.55). Study design was not significantly associated with subsequent publication. Multicentre status, international collaboration, large sample size and commercial or non-commercial funding were positively associated with subsequent publication. Commercial funding was mentioned in 203 (48%) protocols and in 205 (49%) of the publications. In most published studies (339; 81%) this information corresponded between protocol and publication. Most studies were published in English (367; 88%); some in German (25; 6%) or both languages (27; 6%). The local investigators were listed as (co-)authors in the publications corresponding to 259 (62%) studies. CONCLUSION: Half of the clinical research conducted at a large German university medical centre remains unpublished; future research is built on an incomplete database. 
Research resources are likely wasted as neither health care professionals nor patients nor policy makers can use the results when making decisions.
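The publication proportion reported above (419 of 807 initiated studies) can be checked with a short calculation. This is a minimal sketch that assumes a normal-approximation (Wald) interval; the abstract does not say which interval method the authors used, and the function name `wald_interval` is hypothetical.

```python
import math


def wald_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (proportion, lower, upper) using the normal (Wald) approximation.

    Assumption: the abstract does not name its interval method; Wald is
    used here only because it is the simplest one to reproduce.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width


# 419 of the 807 initiated studies had at least one full publication.
p, lower, upper = wald_interval(419, 807)
print(f"{p:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

Under this assumption the result rounds to the same figures as the abstract, a proportion of 0.52 with bounds of 0.48 and 0.55.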
Abstract:
The broad aim of biomedical science in the postgenomic era is to link genomic and phenotype information to allow deeper understanding of the processes leading from genomic changes to altered phenotype and disease. The EuroPhenome project (http://www.EuroPhenome.org) is a comprehensive resource for raw and annotated high-throughput phenotyping data arising from projects such as EUMODIC. EUMODIC is gathering data from the EMPReSSslim pipeline (http://www.empress.har.mrc.ac.uk/) which is performed on inbred mouse strains and knock-out lines arising from the EUCOMM project. The EuroPhenome interface allows the user to access the data via the phenotype or genotype. It also allows the user to access the data in a variety of ways, including graphical display, statistical analysis and access to the raw data via web services. The raw phenotyping data captured in EuroPhenome is annotated by an annotation pipeline which automatically identifies statistically different mutants from the appropriate baseline and assigns ontology terms for that specific test. Mutant phenotypes can be quickly identified using two EuroPhenome tools: PhenoMap, a graphical representation of statistically relevant phenotypes, and mining for a mutant using ontology terms. To assist with data definition and cross-database comparisons, phenotype data is annotated using combinations of terms from biological ontologies.
Abstract:
Information about the genomic coordinates and the sequence of experimentally identified transcription factor binding sites is found scattered under a variety of diverse formats. The availability of standard collections of such high-quality data is important to design, evaluate and improve novel computational approaches to identify binding motifs on promoter sequences from related genes. ABS (http://genome.imim.es/datasets/abs2005/index.html) is a public database of known binding sites identified in promoters of orthologous vertebrate genes that have been manually curated from bibliography. We have annotated 650 experimental binding sites from 68 transcription factors and 100 orthologous target genes in human, mouse, rat or chicken genome sequences. Computational predictions and promoter alignment information are also provided for each entry. A simple and easy-to-use web interface facilitates data retrieval allowing different views of the information. In addition, the release 1.0 of ABS includes a customizable generator of artificial datasets based on the known sites contained in the collection and an evaluation tool to aid during the training and the assessment of motif-finding programs.
Abstract:
Searching for matches between large collections of short (14-30 nucleotides) words and sequence databases comprising full genomes or transcriptomes is a common task in biological sequence analysis. We investigated the performance of simple indexing strategies for handling such tasks and developed two programs, fetchGWI and tagger, that index either the database or the query set. Either strategy outperforms megablast for searches with more than 10,000 probes. FetchGWI is shown to be a versatile tool for rapidly searching multiple genomes, whose performance is limited in most cases by the speed of access to the filesystem. We have made publicly available a Web interface for searching the human, mouse, and several other genomes and transcriptomes with oligonucleotide queries.
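The idea of indexing the query set rather than the database can be illustrated with a small sketch. It is not the fetchGWI or tagger implementation: it assumes all probes have the same length, reports exact forward-strand matches only, and the names `index_queries` and `scan` are hypothetical.

```python
from collections import defaultdict


def index_queries(queries: dict[str, str]) -> dict[str, list[str]]:
    """Map each query word to the names of the probes that have that sequence."""
    index = defaultdict(list)
    for name, word in queries.items():
        index[word.upper()].append(name)
    return index


def scan(sequence: str, index: dict[str, list[str]], word_length: int) -> list[tuple[str, int]]:
    """Slide a fixed-length window over the sequence and look each window up in the index.

    Returns (probe name, 0-based position) pairs for exact matches.
    Assumption: every probe in the index has exactly `word_length` bases.
    """
    sequence = sequence.upper()
    hits = []
    for pos in range(len(sequence) - word_length + 1):
        # .get() avoids inserting empty entries into the defaultdict.
        for name in index.get(sequence[pos:pos + word_length], ()):
            hits.append((name, pos))
    return hits


# Toy data, invented for illustration only.
probes = {"probe_a": "ACGTACGTACGTAC", "probe_b": "TTTTGGGGCCCCAA"}
genome = "GGACGTACGTACGTACTTTTGGGGCCCCAAGG"
print(scan(genome, index_queries(probes), word_length=14))
```

Each window costs one hash lookup regardless of how many probes are indexed, which gives some intuition for why such strategies pay off for large probe sets.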
Abstract:
For well over 100 years, the Working Stress Design (WSD) approach has been the traditional basis for geotechnical design with regard to settlements or failure conditions. However, considerable effort has been put forth over the past couple of decades in relation to the adoption of the Load and Resistance Factor Design (LRFD) approach into geotechnical design. With the goal of producing engineered designs with consistent levels of reliability, the Federal Highway Administration (FHWA) issued a policy memorandum on June 28, 2000, requiring all new bridges initiated after October 1, 2007, to be designed according to the LRFD approach. Likewise, regionally calibrated LRFD resistance factors were permitted by the American Association of State Highway and Transportation Officials (AASHTO) to improve the economy of bridge foundation elements. Thus, projects TR-573, TR-583 and TR-584 were undertaken by a research team at Iowa State University’s Bridge Engineering Center with the goal of developing resistance factors for pile design using available pile static load test data. To accomplish this goal, the available data were first analyzed for reliability and then placed in a newly designed relational database management system termed PIle LOad Tests (PILOT), to which this first volume of the final report for project TR-573 is dedicated. PILOT is an amalgamated, electronic source of information consisting of both static and dynamic data for pile load tests conducted in the State of Iowa. The database, which includes historical data on pile load tests dating back to 1966, is intended for use in the establishment of LRFD resistance factors for design and construction control of driven pile foundations in Iowa. 
Although a considerable amount of geotechnical and pile load test data is available in literature as well as in various State Department of Transportation files, PILOT is one of the first regional databases to be exclusively used in the development of LRFD resistance factors for the design and construction control of driven pile foundations. Currently providing an electronically organized assimilation of geotechnical and pile load test data for 274 piles of various types (e.g., steel H-shaped, timber, pipe, Monotube, and concrete), PILOT (http://srg.cce.iastate.edu/lrfd/) is on par with such familiar national databases used in the calibration of LRFD resistance factors for pile foundations as the FHWA’s Deep Foundation Load Test Database. By narrowing geographical boundaries while maintaining a high number of pile load tests, PILOT exemplifies a model for effective regional LRFD calibration procedures.
Abstract:
Three pavement design software packages were compared with regard to how they determine design input parameters and how those parameters influence pavement thickness. StreetPave designs the concrete pavement thickness based on the PCA method and the equivalent asphalt pavement thickness. The WinPAS software designs both concrete and asphalt pavements following the AASHTO 1993 design method. The APAI software designs asphalt pavements based on pre-mechanistic/empirical AASHTO methodology. First, the following four critical design input parameters were identified: traffic, subgrade strength, reliability, and design life. A sensitivity analysis of these four design input parameters was performed using the three pavement design software packages to identify which input parameters require the most attention during pavement design. Based on the current pavement design procedures and sensitivity analysis results, a prototype pavement design and sensitivity analysis (PD&SA) software package was developed to retrieve the pavement thickness design value for a given condition and allow a user to perform a pavement design sensitivity analysis. The prototype PD&SA software is a computer program that stores pavement design results in a database, designed so that the user can input design data from a variety of design programs and query design results for given conditions. The prototype was developed to demonstrate the concept of retrieving pavement design results from the database for a design sensitivity analysis. This final report does not include the prototype software, which will be validated and tested during the next phase.
Abstract:
In the context of recent attempts to redefine the 'skin notation' concept, a position paper summarizing an international workshop on the topic stated that the skin notation should be a hazard indicator related to the degree of toxicity and the potential for transdermal exposure of a chemical. Within the framework of developing a web-based tool integrating this concept, we constructed a database of 7101 agents for which a percutaneous permeation constant can be estimated (using molecular weight and octanol-water partition constant), and for which at least one of the following toxicity indices could be retrieved: inhalation occupational exposure limit (n=644), oral lethal dose 50 (LD50, n=6708), cutaneous LD50 (n=1801), oral no observed adverse effect level (NOAEL, n=1600), and cutaneous NOAEL (n=187). Data sources included the Registry of Toxic Effects of Chemical Substances (RTECS, MDL Information Systems, Inc.), PHYSPROP (Syracuse Research Corp.) and safety cards from the International Programme on Chemical Safety (IPCS). A hazard index, which corresponds to the product of exposure duration and skin surface exposed that would yield an internal dose equal to a toxic reference dose, was calculated. This presentation provides a descriptive summary of the database, correlations between toxicity indices, and an example of how the web tool will help industrial hygienists decide on the possibility of a dermal risk using the hazard index.
Abstract:
This issue of the Catalan Journal of Linguistics was conceived with the aim of promoting comparative studies of the languages spoken in the Iberian Peninsula. The importance of comparison in linguistics dates back to the Neogrammarians in the nineteenth century and their interest in discovering the common roots of most of the languages spoken in Europe. To reach that objective, the comparison of phonological patterns was crucial to retrieve the common Indo-European origins.
Abstract:
The Quaternary Active Faults Database of Iberia (QAFI) is an initiative led by the Institute of Geology and Mines of Spain (IGME) for building a public repository of scientific data regarding faults having documented activity during the last 2.59 Ma (Quaternary). QAFI also addresses a need to transfer geologic knowledge to practitioners of seismic hazard and risk in Iberia by identifying and characterizing seismogenic fault-sources. QAFI is populated by the information freely provided by more than 40 Earth science researchers, storing to date a total of 262 records. In this article we describe the development and evolution of the database, as well as its internal architecture. Additionally, a first global analysis of the data is provided with a special focus on length and slip-rate fault parameters. Finally, the database completeness and the internal consistency of the data are discussed. Even though QAFI v.2.0 is the most current resource for calculating fault-related seismic hazard in Iberia, the database is still incomplete and requires further review.
Abstract:
BACKGROUND: Several European HIV observational databases have, over the last decade, accumulated a substantial number of resistance test results and developed large sample repositories. There is a need to link these efforts together. We here describe the development of a novel tool that binds these databases together in a distributed fashion, in which control and data remain with the cohorts, rather than through classic data mergers. METHODS: As proof-of-concept we entered two basic queries into the tool: available resistance tests and available samples. We asked for patients still alive after 1998-01-01, and between 180 and 195 cm of height, and how many samples or resistance tests there would be available for these patients. The queries were uploaded with the tool to a central web server from which each participating cohort downloaded the queries with the tool and ran them against their database. The numbers gathered were then submitted back to the server and we could accumulate the number of available samples and resistance tests. RESULTS: We obtained the following results from the cohorts on available samples/resistance tests: EuResist: not available/11,194; EuroSIDA: 20,716/1,992; ICONA: 3,751/500; Rega: 302/302; SHCS: 53,783/1,485. In total, 78,552 samples and 15,473 resistance tests were available amongst these five cohorts.
Once these data items have been identified, it is trivial to generate lists of relevant samples that would be useful for ultra deep sequencing in addition to the already available resistance tests. Soon the tool will include small analysis packages that allow each cohort to pull a report on their cohort profile and also survey emerging resistance trends in their own cohort. CONCLUSIONS: We plan on providing this tool to all cohorts within the Collaborative HIV and Anti-HIV Drug Resistance Network (CHAIN) and will provide the tool free of charge to others for any non-commercial use. The tool has the potential to ease collaborations, in particular in projects that require data to speed up the identification of novel resistance mutations, by increasing the number of observations across multiple cohorts instead of waiting for single cohorts or studies to reach the critical number needed to address such issues.
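The central accumulation step described in the methods can be sketched as follows. This is an illustrative sketch only: it covers the server-side summation of the counts each cohort sends back, not the tool's actual protocol or the queries run locally, and the names `CohortReply` and `accumulate` are hypothetical. The counts are the per-cohort figures reported in the results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CohortReply:
    """Aggregate counts a cohort submits back; no patient-level data leaves the cohort."""
    cohort: str
    samples: Optional[int]           # None when the cohort reports "not available"
    resistance_tests: Optional[int]


def accumulate(replies: list[CohortReply]) -> tuple[int, int]:
    """Sum the reported counts across cohorts, skipping unavailable values."""
    total_samples = sum(r.samples for r in replies if r.samples is not None)
    total_tests = sum(r.resistance_tests for r in replies if r.resistance_tests is not None)
    return total_samples, total_tests


# Per-cohort counts as reported in the abstract (samples, resistance tests).
replies = [
    CohortReply("EuResist", None, 11194),
    CohortReply("EuroSIDA", 20716, 1992),
    CohortReply("ICONA", 3751, 500),
    CohortReply("Rega", 302, 302),
    CohortReply("SHCS", 53783, 1485),
]
print(accumulate(replies))  # (78552, 15473)
```

The sums match the totals stated in the abstract, 78,552 samples and 15,473 resistance tests. Because only aggregate counts are exchanged, each cohort keeps control of its own records.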