982 resultados para Relational data


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Background The use of the knowledge produced by sciences to promote human health is the main goal of translational medicine. To make it feasible we need computational methods to handle the large amount of information that arises from bench to bedside and to deal with its heterogeneity. A computational challenge that must be faced is to promote the integration of clinical, socio-demographic and biological data. In this effort, ontologies play an essential role as a powerful artifact for knowledge representation. Chado is a modular ontology-oriented database model that gained popularity due to its robustness and flexibility as a generic platform to store biological data; however it lacks supporting representation of clinical and socio-demographic information. Results We have implemented an extension of Chado – the Clinical Module - to allow the representation of this kind of information. Our approach consists of a framework for data integration through the use of a common reference ontology. The design of this framework has four levels: data level, to store the data; semantic level, to integrate and standardize the data by the use of ontologies; application level, to manage clinical databases, ontologies and data integration process; and web interface level, to allow interaction between the user and the system. The clinical module was built based on the Entity-Attribute-Value (EAV) model. We also proposed a methodology to migrate data from legacy clinical databases to the integrative framework. A Chado instance was initialized using a relational database management system. The Clinical Module was implemented and the framework was loaded using data from a factual clinical research database. Clinical and demographic data as well as biomaterial data were obtained from patients with tumors of head and neck. We implemented the IPTrans tool that is a complete environment for data migration, which comprises: the construction of a model to describe the legacy clinical data, based on an ontology; the Extraction, Transformation and Load (ETL) process to extract the data from the source clinical database and load it in the Clinical Module of Chado; the development of a web tool and a Bridge Layer to adapt the web tool to Chado, as well as other applications. Conclusions Open-source computational solutions currently available for translational science does not have a model to represent biomolecular information and also are not integrated with the existing bioinformatics tools. On the other hand, existing genomic data models do not represent clinical patient data. A framework was developed to support translational research by integrating biomolecular information coming from different “omics” technologies with patient’s clinical and socio-demographic data. This framework should present some features: flexibility, compression and robustness. The experiments accomplished from a use case demonstrated that the proposed system meets requirements of flexibility and robustness, leading to the desired integration. The Clinical Module can be accessed in http://dcm.ffclrp.usp.br/caib/pg=iptrans webcite.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Obiettivo della tesi è analizzare e testare i principali approcci di Machine Learning applicabili in contesti semantici, partendo da algoritmi di Statistical Relational Learning, quali Relational Probability Trees, Relational Bayesian Classifiers e Relational Dependency Networks, per poi passare ad approcci basati su fattorizzazione tensori, in particolare CANDECOMP/PARAFAC, Tucker e RESCAL.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper showcases the field- and lab-documentation system developed for Kinneret Regional Project, an international archaeological expedition to the Northwestern shore of the Sea of Galilee (Israel) under the auspices of the University of Bern, the University of Helsinki, Leiden University and Wofford College. The core of the data management system is a fully relational, server-based database framework, which also includes time-based and static GIS services, stratigraphic analysis tools and fully indexed document/digital image archives. Data collection in the field is based on mobile, hand-held devices equipped with a custom-tailored stand-alone application. Comprehensive three-dimensional documentation of all finds and findings is achieved by means of total stations and/or high-precision GPS devices. All archaeological information retrieved in the field – including tachymetric data – is synched with the core system on the fly and thus immediately available for further processing in the field lab (within the local network) or for post-excavation analysis at remote institutions (via the WWW). Besides a short demonstration of the main functionalities, the paper also presents some of the key technologies used and illustrates usability aspects of the system’s individual components.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Work on distributed data management commenced shortly after the introduction of the relational model in the mid-1970's. 1970's and 1980's were very active periods for the development of distributed relational database technology, and claims were made that in the following ten years centralized databases will be an “antique curiosity” and most organizations will move toward distributed database managers [1]. That prediction has certainly become true, and all commercial DBMSs today are distributed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Managing large medical image collections is an increasingly demanding important issue in many hospitals and other medical settings. A huge amount of this information is daily generated, which requires robust and agile systems. In this paper we present a distributed multi-agent system capable of managing very large medical image datasets. In this approach, agents extract low-level information from images and store them in a data structure implemented in a relational database. The data structure can also store semantic information related to images and particular regions. A distinctive aspect of our work is that a single image can be divided so that the resultant sub-images can be stored and managed separately by different agents to improve performance in data accessing and processing. The system also offers the possibility of applying some region-based operations and filters on images, facilitating image classification. These operations can be performed directly on data structures in the database.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

El cálculo de relaciones binarias fue creado por De Morgan en 1860 para ser posteriormente desarrollado en gran medida por Peirce y Schröder. Tarski, Givant, Freyd y Scedrov demostraron que las álgebras relacionales son capaces de formalizar la lógica de primer orden, la lógica de orden superior así como la teoría de conjuntos. A partir de los resultados matemáticos de Tarski y Freyd, esta tesis desarrolla semánticas denotacionales y operacionales para la programación lógica con restricciones usando el álgebra relacional como base. La idea principal es la utilización del concepto de semántica ejecutable, semánticas cuya característica principal es el que la ejecución es posible utilizando el razonamiento estándar del universo semántico, este caso, razonamiento ecuacional. En el caso de este trabajo, se muestra que las álgebras relacionales distributivas con un operador de punto fijo capturan toda la teoría y metateoría estándar de la programación lógica con restricciones incluyendo los árboles utilizados en la búsqueda de demostraciones. La mayor parte de técnicas de optimización de programas, evaluación parcial e interpretación abstracta pueden ser llevadas a cabo utilizando las semánticas aquí presentadas. La demostración de la corrección de la implementación resulta extremadamente sencilla. En la primera parte de la tesis, un programa lógico con restricciones es traducido a un conjunto de términos relacionales. La interpretación estándar en la teoría de conjuntos de dichas relaciones coincide con la semántica estándar para CLP. Las consultas contra el programa traducido son llevadas a cabo mediante la reescritura de relaciones. Para concluir la primera parte, se demuestra la corrección y equivalencia operacional de esta nueva semántica, así como se define un algoritmo de unificación mediante la reescritura de relaciones. La segunda parte de la tesis desarrolla una semántica para la programación lógica con restricciones usando la teoría de alegorías—versión categórica del álgebra de relaciones—de Freyd. Para ello, se definen dos nuevos conceptos de Categoría Regular de Lawvere y _-Alegoría, en las cuales es posible interpretar un programa lógico. La ventaja fundamental que el enfoque categórico aporta es la definición de una máquina categórica que mejora e sistema de reescritura presentado en la primera parte. Gracias al uso de relaciones tabulares, la máquina modela la ejecución eficiente sin salir de un marco estrictamente formal. Utilizando la reescritura de diagramas, se define un algoritmo para el cálculo de pullbacks en Categorías Regulares de Lawvere. Los dominios de las tabulaciones aportan información sobre la utilización de memoria y variable libres, mientras que el estado compartido queda capturado por los diagramas. La especificación de la máquina induce la derivación formal de un juego de instrucciones eficiente. El marco categórico aporta otras importantes ventajas, como la posibilidad de incorporar tipos de datos algebraicos, funciones y otras extensiones a Prolog, a la vez que se conserva el carácter 100% declarativo de nuestra semántica. ABSTRACT The calculus of binary relations was introduced by De Morgan in 1860, to be greatly developed by Peirce and Schröder, as well as many others in the twentieth century. Using different formulations of relational structures, Tarski, Givant, Freyd, and Scedrov have shown how relation algebras can provide a variable-free way of formalizing first order logic, higher order logic and set theory, among other formal systems. Building on those mathematical results, we develop denotational and operational semantics for Constraint Logic Programming using relation algebra. The idea of executable semantics plays a fundamental role in this work, both as a philosophical and technical foundation. We call a semantics executable when program execution can be carried out using the regular theory and tools that define the semantic universe. Throughout this work, the use of pure algebraic reasoning is the basis of denotational and operational results, eliminating all the classical non-equational meta-theory associated to traditional semantics for Logic Programming. All algebraic reasoning, including execution, is performed in an algebraic way, to the point we could state that the denotational semantics of a CLP program is directly executable. Techniques like optimization, partial evaluation and abstract interpretation find a natural place in our algebraic models. Other properties, like correctness of the implementation or program transformation are easy to check, as they are carried out using instances of the general equational theory. In the first part of the work, we translate Constraint Logic Programs to binary relations in a modified version of the distributive relation algebras used by Tarski. Execution is carried out by a rewriting system. We prove adequacy and operational equivalence of the semantics. In the second part of the work, the relation algebraic approach is improved by using allegory theory, a categorical version of the algebra of relations developed by Freyd and Scedrov. The use of allegories lifts the semantics to typed relations, which capture the number of logical variables used by a predicate or program state in a declarative way. A logic program is interpreted in a _-allegory, which is in turn generated from a new notion of Regular Lawvere Category. As in the untyped case, program translation coincides with program interpretation. Thus, we develop a categorical machine directly from the semantics. The machine is based on relation composition, with a pullback calculation algorithm at its core. The algorithm is defined with the help of a notion of diagram rewriting. In this operational interpretation, types represent information about memory allocation and the execution mechanism is more efficient, thanks to the faithful representation of shared state by categorical projections. We finish the work by illustrating how the categorical semantics allows the incorporation into Prolog of constructs typical of Functional Programming, like abstract data types, and strict and lazy functions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

La seguridad verificada es una metodología para demostrar propiedades de seguridad de los sistemas informáticos que se destaca por las altas garantías de corrección que provee. Los sistemas informáticos se modelan como programas probabilísticos y para probar que verifican una determinada propiedad de seguridad se utilizan técnicas rigurosas basadas en modelos matemáticos de los programas. En particular, la seguridad verificada promueve el uso de demostradores de teoremas interactivos o automáticos para construir demostraciones completamente formales cuya corrección es certificada mecánicamente (por ordenador). La seguridad verificada demostró ser una técnica muy efectiva para razonar sobre diversas nociones de seguridad en el área de criptografía. Sin embargo, no ha podido cubrir un importante conjunto de nociones de seguridad “aproximada”. La característica distintiva de estas nociones de seguridad es que se expresan como una condición de “similitud” entre las distribuciones de salida de dos programas probabilísticos y esta similitud se cuantifica usando alguna noción de distancia entre distribuciones de probabilidad. Este conjunto incluye destacadas nociones de seguridad de diversas áreas como la minería de datos privados, el análisis de flujo de información y la criptografía. Ejemplos representativos de estas nociones de seguridad son la indiferenciabilidad, que permite reemplazar un componente idealizado de un sistema por una implementación concreta (sin alterar significativamente sus propiedades de seguridad), o la privacidad diferencial, una noción de privacidad que ha recibido mucha atención en los últimos años y tiene como objetivo evitar la publicación datos confidenciales en la minería de datos. La falta de técnicas rigurosas que permitan verificar formalmente este tipo de propiedades constituye un notable problema abierto que tiene que ser abordado. En esta tesis introducimos varias lógicas de programa quantitativas para razonar sobre esta clase de propiedades de seguridad. Nuestra principal contribución teórica es una versión quantitativa de una lógica de Hoare relacional para programas probabilísticos. Las pruebas de correción de estas lógicas son completamente formalizadas en el asistente de pruebas Coq. Desarrollamos, además, una herramienta para razonar sobre propiedades de programas a través de estas lógicas extendiendo CertiCrypt, un framework para verificar pruebas de criptografía en Coq. Confirmamos la efectividad y aplicabilidad de nuestra metodología construyendo pruebas certificadas por ordendor de varios sistemas cuyo análisis estaba fuera del alcance de la seguridad verificada. Esto incluye, entre otros, una meta-construcción para diseñar funciones de hash “seguras” sobre curvas elípticas y algoritmos diferencialmente privados para varios problemas de optimización combinatoria de la literatura reciente. ABSTRACT The verified security methodology is an emerging approach to build high assurance proofs about security properties of computer systems. Computer systems are modeled as probabilistic programs and one relies on rigorous program semantics techniques to prove that they comply with a given security goal. In particular, it advocates the use of interactive theorem provers or automated provers to build fully formal machine-checked versions of these security proofs. The verified security methodology has proved successful in modeling and reasoning about several standard security notions in the area of cryptography. However, it has fallen short of covering an important class of approximate, quantitative security notions. The distinguishing characteristic of this class of security notions is that they are stated as a “similarity” condition between the output distributions of two probabilistic programs, and this similarity is quantified using some notion of distance between probability distributions. This class comprises prominent security notions from multiple areas such as private data analysis, information flow analysis and cryptography. These include, for instance, indifferentiability, which enables securely replacing an idealized component of system with a concrete implementation, and differential privacy, a notion of privacy-preserving data mining that has received a great deal of attention in the last few years. The lack of rigorous techniques for verifying these properties is thus an important problem that needs to be addressed. In this dissertation we introduce several quantitative program logics to reason about this class of security notions. Our main theoretical contribution is, in particular, a quantitative variant of a full-fledged relational Hoare logic for probabilistic programs. The soundness of these logics is fully formalized in the Coq proof-assistant and tool support is also available through an extension of CertiCrypt, a framework to verify cryptographic proofs in Coq. We validate the applicability of our approach by building fully machine-checked proofs for several systems that were out of the reach of the verified security methodology. These comprise, among others, a construction to build “safe” hash functions into elliptic curves and differentially private algorithms for several combinatorial optimization problems from the recent literature.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Desde el inicio de los tiempos el ser humano ha tenido la necesidad de comprender y analizar todo lo que nos rodea, para ello se ha valido de diferentes herramientas como las pinturas rupestres, la biblioteca de Alejandría, bastas colecciones de libros y actualmente una enorme cantidad de información informatizada. Todo esto siempre se ha almacenado, según la tecnología de la época lo permitía, con la esperanza de que fuera útil mediante su consulta y análisis. En la actualidad continúa ocurriendo lo mismo. Hasta hace unos años se ha realizado el análisis de información manualmente o mediante bases de datos relacionales. Ahora ha llegado el momento de una nueva tecnología, Big Data, con la cual se puede realizar el análisis de extensas cantidades de datos de todo tipo en tiempos relativamente pequeños. A lo largo de este libro, se estudiarán las características y ventajas de Big Data, además de realizar un estudio de la plataforma Hadoop. Esta es una plataforma basada en Java y puede realizar el análisis de grandes cantidades de datos de diferentes formatos y procedencias. Durante la lectura de estas páginas se irá dotando al lector de los conocimientos previos necesarios para su mejor comprensión, así como de ubicarle temporalmente en el desarrollo de este concepto, de su uso, las previsiones y la evolución y desarrollo que se prevé tenga en los próximos años. ABSTRACT. Since the beginning of time, human being was in need of understanding and analyzing everything around him. In order to do that, he used different media as cave paintings, Alexandria library, big amount of book collections and nowadays massive amount of computerized information. All this information was stored, depending on the age and technology capability, with the expectation of being useful though it consulting and analysis. Nowadays they keep doing the same. In the last years, they have been processing the information manually or using relational databases. Now it is time for a new technology, Big Data, which is able to analyze huge amount of data in a, relatively, small time. Along this book, characteristics and advantages of Big Data will be detailed, so as an introduction to Hadoop platform. This platform is based on Java and can perform the analysis of massive amount of data in different formats and coming from different sources. During this reading, the reader will be provided with the prior knowledge needed to it understanding, so as the temporal location, uses, forecast, evolution and growth in the next years.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, the authors introduce a novel mechanism for data management in a middleware for smart home control, where a relational database and semantic ontology storage are used at the same time in a Data Warehouse. An annotation system has been designed for instructing the storage format and location, registering new ontology concepts and most importantly, guaranteeing the Data Consistency between the two storage methods. For easing the data persistence process, the Data Access Object (DAO) pattern is applied and optimized to enhance the Data Consistency assurance. Finally, this novel mechanism provides an easy manner for the development of applications and their integration with BATMP. Finally, an application named "Parameter Monitoring Service" is given as an example for assessing the feasibility of the system.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The HIV Reverse Transcriptase and Protease Sequence Database is an on-line relational database that catalogs evolutionary and drug-related sequence variation in the human immunodeficiency virus (HIV) reverse transcriptase (RT) and protease enzymes, the molecular targets of anti-HIV therapy (http://hivdb.stanford.edu). The database contains a compilation of nearly all published HIV RT and protease sequences, including submissions from International Collaboration databases and sequences published in journal articles. Sequences are linked to data about the source of the sequence sample and the antiretroviral drug treatment history of the individual from whom the isolate was obtained. During the past year 3500 sequences have been added and the data model has been expanded to include drug susceptibility data on sequenced isolates. Database content has also been integrated with didactic text and the output of two sequence analysis programs.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Plasmodium falciparum Genome Database (http://PlasmoDB.org) integrates sequence information, automated analyses and annotation data emerging from the P.falciparum genome sequencing consortium. To date, raw sequence coverage is available for >90% of the genome, and two chromosomes have been finished and annotated. Data in PlasmoDB are organized by chromosome (1–14), and can be accessed using a variety of tools for graphical and text-based browsing or downloaded in various file formats. The GUS (Genomics Unified Schema) implementation of PlasmoDB provides a multi-species genomic relational database, incorporating data from human and mouse, as well as P.falciparum. The relational schema uses a highly structured format to accommodate diverse data sets related to genomic sequence and gene expression. Tools have been designed to facilitate complex biological queries, including many that are specific to Plasmodium parasites and malaria as a disease. Additional projects seek to integrate genomic information with the rich data sets now becoming available for RNA transcription, protein expression, metabolic pathways, genetic and physical mapping, antigenic and population diversity, and phylogenetic relationships with other apicomplexan parasites. The overall goal of PlasmoDB is to facilitate Internet- and CD-ROM-based access to both finished and unfinished sequence information by the global malaria research community.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Drawing inferences from past experiences enables adaptive behavior in future situations. Inference has been shown to depend on hippocampal processes. Usually, inference is considered a deliberate and effortful mental act which happens during retrieval, and requires the focus of our awareness. Recent fMRI studies hint at the possibility that some forms of hippocampus-dependent inference can also occur during encoding and possibly also outside of awareness. Here, we sought to further explore the feasibility of hippocampal implicit inference, and specifically address the temporal evolution of implicit inference using intracranial EEG. Presurgical epilepsy patients with hippocampal depth electrodes viewed a sequence of word pairs, and judged the semantic fit between two words in each pair. Some of the word pairs entailed a common word (e.g.,‘winter - red’, ‘red - cat’) such that an indirect relation was established in following word pairs (e.g, ‘winter - cat’). The behavioral results suggested that drawing inference implicitly from past experience is feasible because indirect relations seemed to foster ‘fit’ judgments while the absence of indirect relations fostered 'do not fit' judgments, even though the participants were unaware of the indirect relations. A event-related potential (ERP) difference emerging 400 ms post-stimulus was evident in the hippocampus during encoding, suggesting that indirect relations were already established automatically during encoding of the overlapping word pairs. Further ERP differences emerged later post-stimulus (1500 ms), were modulated by the participants' responses and were evident during encoding and test. Furthermore, response-locked ERP effects were evident at test. These ERP effects could hence be a correlate of the interaction of implicit memory with decision-making. Together, the data map out a time-course in which the hippocampus automatically integrates memories from discrete but related episodes to implicitly influence future decision making.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The data structure of an information system can significantly impact the ability of end users to efficiently and effectively retrieve the information they need. This research develops a methodology for evaluating, ex ante, the relative desirability of alternative data structures for end user queries. This research theorizes that the data structure that yields the lowest weighted average complexity for a representative sample of information requests is the most desirable data structure for end user queries. The theory was tested in an experiment that compared queries from two different relational database schemas. As theorized, end users querying the data structure associated with the less complex queries performed better Complexity was measured using three different Halstead metrics. Each of the three metrics provided excellent predictions of end user performance. This research supplies strong evidence that organizations can use complexity metrics to evaluate, ex ante, the desirability of alternate data structures. Organizations can use these evaluations to enhance the efficient and effective retrieval of information by creating data structures that minimize end user query complexity.