884 resultados para Incomplete relational database
Resumo:
The World Wide Web (WWW) is useful for distributing scientific data. Most existing web data resources organize their information either in structured flat files or relational databases with basic retrieval capabilities. For databases with one or a few simple relations, these approaches are successful, but they can be cumbersome when there is a data model involving multiple relations between complex data. We believe that knowledge-based resources offer a solution in these cases. Knowledge bases have explicit declarations of the concepts in the domain, along with the relations between them. They are usually organized hierarchically, and provide a global data model with a controlled vocabulary, We have created the OWEB architecture for building online scientific data resources using knowledge bases. OWEB provides a shell for structuring data, providing secure and shared access, and creating computational modules for processing and displaying data. In this paper, we describe the translation of the online immunological database MHCPEP into an OWEB system called MHCWeb. This effort involved building a conceptual model for the data, creating a controlled terminology for the legal values for different types of data, and then translating the original data into the new structure. The 0 WEB environment allows for flexible access to the data by both users and computer programs.
Resumo:
Dissertação para obtenção do Grau de Mestre em Engenharia Informática
Resumo:
Nowadays, the vulgarization of information and communication technologies has reached to a level that the majority of people spend a lot of time using software to do regular tasks, ranging from games and ordinary time and weather utilities to some more sophisticated ones, like retail or banking applications. This new way of life is supported by the Internet or by specific applications that changed the image people had about using information and communication technologies. All over the world, the first cycle of studies of educational systems also has been addressed with the justification that this encourages the development of children. Taking this into consideration, we design and develop a visual explorer system for relational databases that can be used by everyone, from “7 to 77”, in an intuitive and easy way, getting immediate results – a new database querying experience. Thus, in this paper we will expose the main characteristics and features of this visual database explorer, showing how it works and how it can be used to execute the most current data manipulation operations over a database.
Resumo:
BACKGROUND: Soft tissue sarcomas of the trunk wall (STS-TW) are usually studied together with soft tissue sarcomas of other locations. We report a study on STS-TW forming part of the French Sarcoma Group database. PATIENTS AND METHODS: Three hundred and forty-three adults were included. We carried out univariate and multivariate analysis for overall survival (OS), metastasis-free survival (MFS) and local recurrence-free survival (LRFS). RESULTS: Tumor locations were as follows: thoracic wall, 82.5%; abdominal wall, 12.3% and pelvic wall, 5.2%. Median tumor size was 6.0 cm. The most frequent tumor types were unclassified sarcoma (27.7%) and myogenic sarcoma (19.2%). A total of 44.6% of cases were grade 3. In all, 21.9% of patients had a previous medical history of radiotherapy (PHR). Median follow-up was 7.6 years. The 5-year OS, MFS and LRFS rates were 60.4%, 68.9% and 58.4%, respectively. Multivariate analysis retained PHR and grade for predicting LRFS and PHR, size and grade as prognostic factors of MFS. Factors influencing OS were age, size, PHR, depth, grade and surgical margins. The predictive factors of incomplete response were PHR, size and T3. CONCLUSIONS: Our results suggest similar classical prognostic factors as compared with sarcomas of other locations. However, a separate analysis of STS-TW revealed a significant poor prognosis subgroup of patients with PHR.
Resumo:
The Quaternary Active Faults Database of Iberia (QAFI) is an initiative lead by the Institute of Geology and Mines of Spain (IGME) for building a public repository of scientific data regarding faults having documented activity during the last 2.59 Ma (Quaternary). QAFI also addresses a need to transfer geologic knowledge to practitioners of seismic hazard and risk in Iberia by identifying and characterizing seismogenic fault-sources. QAFI is populated by the information freely provided by more than 40 Earth science researchers, storing to date a total of 262 records. In this article we describe the development and evolution of the database, as well as its internal architecture. Aditionally, a first global analysis of the data is provided with a special focus on length and slip-rate fault parameters. Finally, the database completeness and the internal consistency of the data are discussed. Even though QAFI v.2.0 is the most current resource for calculating fault-related seismic hazard in Iberia, the database is still incomplete and requires further review.
Resumo:
Polyphenols are a major class of bioactive phytochemicals whose consumption may play a role in the prevention of a number of chronic diseases such as cardiovascular diseases, type II diabetes and cancers. Phenol-Explorer, launched in 2009, is the only freely available web-based database on the content of polyphenols in food and their in vivo metabolism and pharmacokinetics. Here we report the third release of the database (Phenol-Explorer 3.0), which adds data on the effects of food processing on polyphenol contents in foods. Data on >100 foods, covering 161 polyphenols or groups of polyphenols before and after processing, were collected from 129 peer-reviewed publications and entered into new tables linked to the existing relational design. The effect of processing on polyphenol content is expressed in the form of retention factor coefficients, or the proportion of a given polyphenol retained after processing, adjusted for change in water content. The result is the first database on the effects of food processing on polyphenol content and, following the model initially defined for Phenol-Explorer, all data may be traced back to original sources. The new update will allow polyphenol scientists to more accurately estimate polyphenol exposure from dietary surveys. Database URL: http://www.phenol-explorer.eu
Resumo:
Classical relational databases lack proper ways to manage certain real-world situations including imprecise or uncertain data. Fuzzy databases overcome this limitation by allowing each entry in the table to be a fuzzy set where each element of the corresponding domain is assigned a membership degree from the real interval [0…1]. But this fuzzy mechanism becomes inappropriate in modelling scenarios where data might be incomparable. Therefore, we become interested in further generalization of fuzzy database into L-fuzzy database. In such a database, the characteristic function for a fuzzy set maps to an arbitrary complete Brouwerian lattice L. From the query language perspectives, the language of fuzzy database, FSQL extends the regular Structured Query Language (SQL) by adding fuzzy specific constructions. In addition to that, L-fuzzy query language LFSQL introduces appropriate linguistic operations to define and manipulate inexact data in an L-fuzzy database. This research mainly focuses on defining the semantics of LFSQL. However, it requires an abstract algebraic theory which can be used to prove all the properties of, and operations on, L-fuzzy relations. In our study, we show that the theory of arrow categories forms a suitable framework for that. Therefore, we define the semantics of LFSQL in the abstract notion of an arrow category. In addition, we implement the operations of L-fuzzy relations in Haskell and develop a parser that translates algebraic expressions into our implementation.
Resumo:
Presentation at the 1997 Dagstuhl Seminar "Evaluation of Multimedia Information Retrieval", Norbert Fuhr, Keith van Rijsbergen, Alan F. Smeaton (eds.), Dagstuhl Seminar Report 175, 14.04. - 18.04.97 (9716). - Abstract: This presentation will introduce ESCHER, a database editor which supports visualization in non-standard applications in engineering, science, tourism and the entertainment industry. It was originally based on the extended nested relational data model and is currently extended to include object-relational properties like inheritance, object types, integrity constraints and methods. It serves as a research platform into areas such as multimedia and visual information systems, QBE-like queries, computer-supported concurrent work (CSCW) and novel storage techniques. In its role as a Visual Information System, a database editor must support browsing and navigation. ESCHER provides this access to data by means of so called fingers. They generalize the cursor paradigm in graphical and text editors. On the graphical display, a finger is reflected by a colored area which corresponds to the object a finger is currently pointing at. In a table more than one finger may point to objects, one of which is the active finger and is used for navigating through the table. The talk will mostly concentrate on giving examples for this type of navigation and will discuss some of the architectural needs for fast object traversal and display. ESCHER is available as public domain software from our ftp site in Kassel. The portable C source can be easily compiled for any machine running UNIX and OSF/Motif, in particular our working environments IBM RS/6000 and Intel-based LINUX systems. A porting to Tcl/Tk is under way.
Resumo:
Some examples from the book. Connolly, T. M. and C. E. Begg (2005). Database systems : a practical approach to design, implementation, and management. Harlow, Essex, England ; New York, Addison-Wesley.
Resumo:
L'objectiu d'aquest article és presentar l'estructura de la base de dades relacional que inclou tota la informació sintictica continguda en el Diccionario Critico Etimológico Castellano e Hispánico de J. Corominas i J. A. Pascual. Tot i que aquest diccionari conté un ampli ventall d'informacions històriques de cadascun dels temes, aquestes no es mostren de forma estructurada, per la qual cosa ha estat necessari estudiar i classificar tots aquells elements relacionats amb aspectes sintàctics. És a partir d'aquest estudi previ que s'han elaborat els diferents camps de la base de dades, els quals s'agrupen en cinc blocs temàtics: informació lemàtica; gramatical; sintàctica; altres aspectes relacionats; i observacions o comentaris rellevants fets per l'investigador. Aquesta base de dades no només reprodueix els continguts del diccionari, sinó que inclou diferents camps interpretatius. Per aquesta raó, Syntax. dbf representa una eina de treball fonamental per a tots aquells investigadors interessats en la sintaxi diacrònica de l'espanyol
Resumo:
With the constant grow of enterprises and the need to share information across departments and business areas becomes more critical, companies are turning to integration to provide a method for interconnecting heterogeneous, distributed and autonomous systems. Whether the sales application needs to interface with the inventory application, the procurement application connect to an auction site, it seems that any application can be made better by integrating it with other applications. Integration between applications can face several troublesome due the fact that applications may not have been designed and implemented having integration in mind. Regarding to integration issues, two tier software systems, composed by the database tier and by the “front-end” tier (interface), have shown some limitations. As a solution to overcome the two tier limitations, three tier systems were proposed in the literature. Thus, by adding a middle-tier (referred as middleware) between the database tier and the “front-end” tier (or simply referred application), three main benefits emerge. The first benefit is related with the fact that the division of software systems in three tiers enables increased integration capabilities with other systems. The second benefit is related with the fact that any modifications to the individual tiers may be carried out without necessarily affecting the other tiers and integrated systems and the third benefit, consequence of the others, is related with less maintenance tasks in software system and in all integrated systems. Concerning software development in three tiers, this dissertation focus on two emerging technologies, Semantic Web and Service Oriented Architecture, combined with middleware. These two technologies blended with middleware, which resulted in the development of Swoat framework (Service and Semantic Web Oriented ArchiTecture), lead to the following four synergic advantages: (1) allow the creation of loosely-coupled systems, decoupling the database from “front-end” tiers, therefore reducing maintenance; (2) the database schema is transparent to “front-end” tiers which are aware of the information model (or domain model) that describes what data is accessible; (3) integration with other heterogeneous systems is allowed by providing services provided by the middleware; (4) the service request by the “frontend” tier focus on ‘what’ data and not on ‘where’ and ‘how’ related issues, reducing this way the application development time by developers.
Resumo:
Princeton WordNet (WN.Pr) lexical database has motivated efficient compilations of bulky relational lexicons since its inception in the 1980's. The EuroWordNet project, the first multilingual initiative built upon WN.Pr, opened up ways of building individual wordnets, and interrelating them by means of the so-called Inter-Lingual-Index, an unstructured list of the WN.Pr synsets. Other important initiative, relying on a slightly different method of building multilingual wordnets, is the MultiWordNet project, where the key strategy is building language specific wordnets keeping as much as possible of the semantic relations available in the WN.Pr. This paper, in particular, stresses that the additional advantage of using WN.Pr lexical database as a resource for building wordnets for other languages is to explore possibilities of implementing an automatic procedure to map the WN.Pr conceptual relations as hyponymy, co-hyponymy, troponymy, meronymy, cause, and entailment onto the lexical database of the wordnet under construction, a viable possibility, for those are language-independent relations that hold between lexicalized concepts, not between lexical units. Accordingly, combining methods from both initiatives, this paper presents the ongoing implementation of the WN.Br lexical database and the aforementioned automation procedure illustrated with a sample of the automatic encoding of the hyponymy and co-hyponymy relations.
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
One of the most demanding needs in cloud computing and big data is that of having scalable and highly available databases. One of the ways to attend these needs is to leverage the scalable replication techniques developed in the last decade. These techniques allow increasing both the availability and scalability of databases. Many replication protocols have been proposed during the last decade. The main research challenge was how to scale under the eager replication model, the one that provides consistency across replicas. This thesis provides an in depth study of three eager database replication systems based on relational systems: Middle-R, C-JDBC and MySQL Cluster and three systems based on In-Memory Data Grids: JBoss Data Grid, Oracle Coherence and Terracotta Ehcache. Thesis explore these systems based on their architecture, replication protocols, fault tolerance and various other functionalities. It also provides experimental analysis of these systems using state-of-the art benchmarks: TPC-C and TPC-W (for relational systems) and Yahoo! Cloud Serving Benchmark (In- Memory Data Grids). Thesis also discusses three Graph Databases, Neo4j, Titan and Sparksee based on their architecture and transactional capabilities and highlights the weaker transactional consistencies provided by these systems. It discusses an implementation of snapshot isolation in Neo4j graph database to provide stronger isolation guarantees for transactions.
Resumo:
In order to support the structural genomic initiatives, both by rapidly classifying newly determined structures and by suggesting suitable targets for structure determination, we have recently developed several new protocols for classifying structures in the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath). These aim to increase the speed of classification of new structures using fast algorithms for structure comparison (GRATH) and to improve the sensitivity in recognising distant structural relatives by incorporating sequence information from relatives in the genomes (DomainFinder). In order to ensure the integrity of the database given the expected increase in data, the CATH Protein Family Database (CATH-PFDB), which currently includes 25 320 structural domains and a further 160 000 sequence relatives has now been installed in a relational ORACLE database. This was essential for developing more rigorous validation procedures and for allowing efficient querying of the database, particularly for genome analysis. The associated Dictionary of Homologous Superfamilies [Bray,J.E., Todd,A.E., Pearl,F.M.G., Thornton,J.M. and Orengo,C.A. (2000) Protein Eng., 13, 153–165], which provides multiple structural alignments and functional information to assist in assigning new relatives, has also been expanded recently and now includes information for 903 homologous superfamilies. In order to improve coverage of known structures, preliminary classification levels are now provided for new structures at interim stages in the classification protocol. Since a large proportion of new structures can be rapidly classified using profile-based sequence analysis [e.g. PSI-BLAST: Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389–3402], this provides preliminary classification for easily recognisable homologues, which in the latest release of CATH (version 1.7) represented nearly three-quarters of the non-identical structures.