935 resultados para Data access


Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper we study query answering and rewriting in ontologybased data access. Specifically, we present an algorithm for computing a perfect rewriting of unions of conjunctive queries posed over ontologies expressed in the description logic ELHIO, which covers the OWL 2 QL and OWL 2 EL profiles. The novelty of our algorithm is the use of a set of ABox dependencies, which are compiled into a so-called EBox, to limit the expansion of the rewriting. So far, EBoxes have only been used in query rewriting in the case of DL-Lite, which is less expressive than ELHIO. We have extensively evaluated our new query rewriting technique, and in this paper we discuss the tradeoff between the reduction of the size of the rewriting and the computational cost of our approach.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Query rewriting is one of the fundamental steps in ontologybased data access (OBDA) approaches. It takes as inputs an ontology and a query written according to that ontology, and produces as an output a set of queries that should be evaluated to account for the inferences that should be considered for that query and ontology. Different query rewriting systems give support to different ontology languages with varying expressiveness, and the rewritten queries obtained as an output do also vary in expressiveness. This heterogeneity has traditionally made it difficult to compare different approaches, and the area lacks in general commonly agreed benchmarks that could be used not only for such comparisons but also for improving OBDA support. In this paper we compile data, dimensions and measurements that have been used to evaluate some of the most recent systems, we analyse and characterise these assets, and provide a unified set of them that could be used as a starting point towards a more systematic benchmarking process for such systems. Finally, we apply this initial benchmark with some of the most relevant OBDA approaches in the state of the art.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Ontology-based data access (OBDA) systems use ontologies to provide views over relational databases. Most of these systems work with ontologies implemented in description logic families of reduced expressiveness, what allows applying efficient query rewriting techniques for query answering. In this paper we describe a set of optimisations that are applicable with one of the most expressive families used in this context (ELHIO¬). Our resulting system exhibits a behaviour that is comparable to the one shown by systems that handle less expressive logics.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

El presente trabajo se ha centrado en la investigación de soluciones para automatizar la tarea del enriquecimiento de fuentes de datos sobre redes de sensores con descripciones lingüísticas, con el fin de facilitar la posterior generación de textos en lenguaje natural. El uso de descripciones en lenguaje natural facilita el acceso a los datos a una mayor diversidad de usuarios y, como consecuencia, permite aprovechar mejor las inversiones en redes de sensores. En el trabajo se ha considerado el uso de bases de datos abiertas para abordar la necesidad de disponer de un gran volumen y diversidad de conocimiento geográfico. Se ha analizado también el enriquecimiento de datos dentro de enfoques metodológicos de curación de datos y métodos de generación de lenguaje natural. Como resultado del trabajo, se ha planteado un método general basado en una estrategia de generación y prueba que incluye una forma de representación y uso del conocimiento heurístico con varias etapas de razonamiento para la construcción de descripciones lingüísticas de enriquecimiento de datos. En la evaluación de la propuesta general se han manejado tres escenarios, dos de ellos para generación de referencias geográficas sobre redes de sensores complejas de dimensión real y otro para la generación de referencias temporales. Los resultados de la evaluación han mostrado la validez práctica de la propuesta general exhibiendo mejoras de rendimiento respecto a otros enfoques. Además, el análisis de los resultados ha permitido identificar y cuantificar el impacto previsible de diversas líneas de mejora en bases de datos abiertas. ABSTRACT This work has focused on the search for solutions to automate the task of enrichment sensor-network-based data sources with textual descriptions, so as to facilitate the generation of natural language texts. Using natural language descriptions facilitates data access to a wider range of users and, therefore, allows better leveraging investments in sensor networks. In this work we have considered the use of open databases to address the need for a large volume and diversity of geographical knowledge. We have also analyzed data enrichment in methodological approaches and data curation methods of natural language generation. As a result, it has raised a general method based on a strategy of generating and testing that includes a representation using heuristic knowledge with several stages of reasoning for the construction of linguistic descriptions of data enrichment. In assessing the overall proposal three scenarios have been addressed, two of them in the environmental domain with complex sensor networks and another real dimension in the time domain. The evaluation results have shown the validity and practicality of our proposal, showing performance improvements over other approaches. Furthermore, the analysis of the results has allowed identifying and quantifying the expected impact of various lines of improvement in open databases.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

- Mobile telecommunications markets are an important part of the European Commission’s strategy for the completion of the European Union Digital Single. The use of mobile telecommunications – particularly mobile data access – is growing and becoming an increasingly important input for the economy. - The EU currently does not have a unified mobile telecommunications market. The EU compares favourably to the United States in terms of prices and connection speed, but lags behind in terms of coverage of high-speed 4G wireless connections. -Europe’s long-term goal should be to make data access easier by increasing highspeed wireless coverage while keeping prices down for users. An increase in cross-border competition could help to achieve that goal. - The Commission has two important levers to help stimulate cross-border supply:(a) ensuring competition in intra-country mobile markets in order to provide an incentive for operators to expand into other jurisdictions, and (b) reducing mobile operators’ costs of expansion into multiple EU countries. The further development of policies on international roaming and radio spectrum management will be central to this effort.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The Continuous Plankton Recorder (CPR) survey, operated by the Sir Alister Hardy Foundation for Ocean Science (SAHFOS), is the largest plankton monitoring programme in the world and has spanned > 70 yr. The dataset contains information from -200 000 samples, with over 2.3 million records of individual taxa. Here we outline the evolution of the CPR database through changes in technology, and how this has increased data access. Recent high-impact publications and the expanded role of CPR data in marine management demonstrate the usefulness of the dataset. We argue that solely supplying data to the research community is not sufficient in the current research climate; to promote wider use, additional tools need to be developed to provide visual representation and summary statistics. We outline 2 software visualisation tools, SAHFOS WinCPR and the digital CPR Atlas, which provide access to CPR data for both researchers and non-plankton specialists. We also describe future directions of the database, data policy and the development of visualisation tools. We believe that the approach at SAHFOS to increase data accessibility and provide new visualisation tools has enhanced awareness of the data and led to the financial security of the organisation; it also provides a good model of how long-term monitoring programmes can evolve to help secure their future.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The purpose of this study was to develop, explicate, and validate a comprehensive model in order to more effectively assess community injury prevention needs, plan and target efforts, identify potential interventions, and provide a framework for an outcome-based evaluation of the effectiveness of interventions. A systems model approach was developed to conceptualize the major components of inputs, efforts, outcomes and feedback within a community setting. Profiling of multiple data sources demonstrated a community feedback mechanism that increased awareness of priority issues and elicited support from traditional as well as non-traditional injury prevention partners. Injury countermeasures including education, enforcement, engineering, and economic incentives were presented for their potential synergistic effect impacting on knowledge, attitudes, or behaviors of a targeted population. Levels of outcome data were classified into ultimate, intermediate and immediate indicators to assist with determining the effectiveness of intervention efforts. A collaboration between business and health care was successful in achieving data access and use of an emergency department level of injury data for monitoring of the impact of community interventions. Evaluation of injury events and preventive efforts within the context of a dynamic community systems environment was applied to a study community with examples detailing actual profiling and trending of injuries. The resulting model of community injury prevention was validated using a community focus group, community injury prevention coordinators, and injury prevention national experts. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Today, databases have become an integral part of information systems. In the past two decades, we have seen different database systems being developed independently and used in different applications domains. Today's interconnected networks and advanced applications, such as data warehousing, data mining & knowledge discovery and intelligent data access to information on the Web, have created a need for integrated access to such heterogeneous, autonomous, distributed database systems. Heterogeneous/multidatabase research has focused on this issue resulting in many different approaches. However, a single, generally accepted methodology in academia or industry has not emerged providing ubiquitous intelligent data access from heterogeneous, autonomous, distributed information sources. ^ This thesis describes a heterogeneous database system being developed at High-performance Database Research Center (HPDRC). A major impediment to ubiquitous deployment of multidatabase technology is the difficulty in resolving semantic heterogeneity. That is, identifying related information sources for integration and querying purposes. Our approach considers the semantics of the meta-data constructs in resolving this issue. The major contributions of the thesis work include: (i) providing a scalable, easy-to-implement architecture for developing a heterogeneous multidatabase system, utilizing Semantic Binary Object-oriented Data Model (Sem-ODM) and Semantic SQL query language to capture the semantics of the data sources being integrated and to provide an easy-to-use query facility; (ii) a methodology for semantic heterogeneity resolution by investigating into the extents of the meta-data constructs of component schemas. This methodology is shown to be correct, complete and unambiguous; (iii) a semi-automated technique for identifying semantic relations, which is the basis of semantic knowledge for integration and querying, using shared ontologies for context-mediation; (iv) resolutions for schematic conflicts and a language for defining global views from a set of component Sem-ODM schemas; (v) design of a knowledge base for storing and manipulating meta-data and knowledge acquired during the integration process. This knowledge base acts as the interface between integration and query processing modules; (vi) techniques for Semantic SQL query processing and optimization based on semantic knowledge in a heterogeneous database environment; and (vii) a framework for intelligent computing and communication on the Internet applying the concepts of our work. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Disk drives are the bottleneck in the processing of large amounts of data used in almost all common applications. File systems attempt to reduce this by storing data sequentially on the disk drives, thereby reducing the access latencies. Although this strategy is useful when data is retrieved sequentially, the access patterns in real world workloads is not necessarily sequential and this mismatch results in storage I/O performance degradation. This thesis demonstrates that one way to improve the storage performance is to reorganize data on disk drives in the same way in which it is mostly accessed. We identify two classes of accesses: static, where access patterns do not change over the lifetime of the data and dynamic, where access patterns frequently change over short durations of time, and propose, implement and evaluate layout strategies for each of these. Our strategies are implemented in a way that they can be seamlessly integrated or removed from the system as desired. We evaluate our layout strategies for static policies using tree-structured XML data where accesses to the storage device are mostly of two kinds—parent-to-child or child-to-sibling. Our results show that for a specific class of deep-focused queries, the existing file system layout policy performs better by 5–54X. For the non-deep-focused queries, our native layout mechanism shows an improvement of 3–127X. To improve performance of the dynamic access patterns, we implement a self-optimizing storage system that performs rearranges popular block accesses on a dedicated partition based on the observed workload characteristics. Our evaluation shows an improvement of over 80% in the disk busy times over a range of workloads. These results show that applying the knowledge of data access patterns for allocation decisions can substantially improve the I/O performance.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Disk drives are the bottleneck in the processing of large amounts of data used in almost all common applications. File systems attempt to reduce this by storing data sequentially on the disk drives, thereby reducing the access latencies. Although this strategy is useful when data is retrieved sequentially, the access patterns in real world workloads is not necessarily sequential and this mismatch results in storage I/O performance degradation. This thesis demonstrates that one way to improve the storage performance is to reorganize data on disk drives in the same way in which it is mostly accessed. We identify two classes of accesses: static, where access patterns do not change over the lifetime of the data and dynamic, where access patterns frequently change over short durations of time, and propose, implement and evaluate layout strategies for each of these. Our strategies are implemented in a way that they can be seamlessly integrated or removed from the system as desired. We evaluate our layout strategies for static policies using tree-structured XML data where accesses to the storage device are mostly of two kinds - parent-tochild or child-to-sibling. Our results show that for a specific class of deep-focused queries, the existing file system layout policy performs better by 5-54X. For the non-deep-focused queries, our native layout mechanism shows an improvement of 3-127X. To improve performance of the dynamic access patterns, we implement a self-optimizing storage system that performs rearranges popular block accesses on a dedicated partition based on the observed workload characteristics. Our evaluation shows an improvement of over 80% in the disk busy times over a range of workloads. These results show that applying the knowledge of data access patterns for allocation decisions can substantially improve the I/O performance.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Today, databases have become an integral part of information systems. In the past two decades, we have seen different database systems being developed independently and used in different applications domains. Today's interconnected networks and advanced applications, such as data warehousing, data mining & knowledge discovery and intelligent data access to information on the Web, have created a need for integrated access to such heterogeneous, autonomous, distributed database systems. Heterogeneous/multidatabase research has focused on this issue resulting in many different approaches. However, a single, generally accepted methodology in academia or industry has not emerged providing ubiquitous intelligent data access from heterogeneous, autonomous, distributed information sources. This thesis describes a heterogeneous database system being developed at Highperformance Database Research Center (HPDRC). A major impediment to ubiquitous deployment of multidatabase technology is the difficulty in resolving semantic heterogeneity. That is, identifying related information sources for integration and querying purposes. Our approach considers the semantics of the meta-data constructs in resolving this issue. The major contributions of the thesis work include: (i.) providing a scalable, easy-to-implement architecture for developing a heterogeneous multidatabase system, utilizing Semantic Binary Object-oriented Data Model (Sem-ODM) and Semantic SQL query language to capture the semantics of the data sources being integrated and to provide an easy-to-use query facility; (ii.) a methodology for semantic heterogeneity resolution by investigating into the extents of the meta-data constructs of component schemas. This methodology is shown to be correct, complete and unambiguous; (iii.) a semi-automated technique for identifying semantic relations, which is the basis of semantic knowledge for integration and querying, using shared ontologies for context-mediation; (iv.) resolutions for schematic conflicts and a language for defining global views from a set of component Sem-ODM schemas; (v.) design of a knowledge base for storing and manipulating meta-data and knowledge acquired during the integration process. This knowledge base acts as the interface between integration and query processing modules; (vi.) techniques for Semantic SQL query processing and optimization based on semantic knowledge in a heterogeneous database environment; and (vii.) a framework for intelligent computing and communication on the Internet applying the concepts of our work.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Given the growing demand for the development of mobile applications, driven by use increasingly common in smartphones and tablets grew in society the need for remote data access in full in the use of mobile application without connectivity environments where there is no provision network access at all times. Given this reality, this work proposes a framework that present main functions are the provision of a persistence mechanism, replication and data synchronization, contemplating the creation, deletion, update and display persisted or requested data, even though the mobile device without connectivity with the network. From the point of view of the architecture and programming practices, it reflected in defining strategies for the main functions of the framework are met. Through a controlled study was to validate the solution proposal, being found as the gains in reducing the number of lines code and the amount of time required to perform the development of an application without there being significant increase for the operations.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Given the growing demand for the development of mobile applications, driven by use increasingly common in smartphones and tablets grew in society the need for remote data access in full in the use of mobile application without connectivity environments where there is no provision network access at all times. Given this reality, this work proposes a framework that present main functions are the provision of a persistence mechanism, replication and data synchronization, contemplating the creation, deletion, update and display persisted or requested data, even though the mobile device without connectivity with the network. From the point of view of the architecture and programming practices, it reflected in defining strategies for the main functions of the framework are met. Through a controlled study was to validate the solution proposal, being found as the gains in reducing the number of lines code and the amount of time required to perform the development of an application without there being significant increase for the operations.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A well-documented, publicly available, global data set of surface ocean carbon dioxide (CO2) parameters has been called for by international groups for nearly two decades. The Surface Ocean CO2 Atlas (SOCAT) project was initiated by the international marine carbon science community in 2007 with the aim of providing a comprehensive, publicly available, regularly updated, global data set of marine surface CO2, which had been subject to quality control (QC). Many additional CO2 data, not yet made public via the Carbon Dioxide Information Analysis Center (CDIAC), were retrieved from data originators, public websites and other data centres. All data were put in a uniform format following a strict protocol. Quality control was carried out according to clearly defined criteria. Regional specialists performed the quality control, using state-of-the-art web-based tools, specially developed for accomplishing this global team effort. SOCAT version 1.5 was made public in September 2011 and holds 6.3 million quality controlled surface CO2 data points from the global oceans and coastal seas, spanning four decades (1968-2007). Three types of data products are available: individual cruise files, a merged complete data set and gridded products. With the rapid expansion of marine CO2 data collection and the importance of quantifying net global oceanic CO2 uptake and its changes, sustained data synthesis and data access are priorities.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Data access and analyses were funded by Boehringer Ingelheim, who played no role in the conduct or reporting of the study.