874 results for Distributed data access
Abstract:
In this paper we study query answering and rewriting in ontology-based data access. Specifically, we present an algorithm for computing a perfect rewriting of unions of conjunctive queries posed over ontologies expressed in the description logic ELHIO, which covers the OWL 2 QL and OWL 2 EL profiles. The novelty of our algorithm is the use of a set of ABox dependencies, which are compiled into a so-called EBox, to limit the expansion of the rewriting. So far, EBoxes have only been used in query rewriting in the case of DL-Lite, which is less expressive than ELHIO. We have extensively evaluated our new query rewriting technique, and in this paper we discuss the tradeoff between the reduction of the size of the rewriting and the computational cost of our approach.
Abstract:
Current fusion devices consist of multiple diagnostics and hundreds or even thousands of signals. This often makes distributed data acquisition systems the best approach. In such distributed systems, one of the most important issues is synchronization between signals, so that the acquired samples of all channels can be correlated in time as accurately as possible. In recent decades, many fusion devices have used different types of video cameras to provide inside views of the vessel during operations and to monitor plasma behavior. Synchronizing each video frame with the signals acquired from the other diagnostics is essential for understanding the plasma evolution correctly, because all the information can then be analyzed jointly with accurate knowledge of its temporal correlation. The system described in this paper timestamps image frames in a real-time acquisition and processing system using IEEE 1588 clock distribution. The system has been implemented using FPGA-based devices together with a 1588-synchronized timing card (see Fig. 1). The solution is based on a previous system [1] that allows image acquisition and real-time image processing based on PXIe technology. This architecture is fully compatible with the ITER Fast Controllers [2] and offers integration with EPICS to control and monitor the entire system. However, this set-up cannot timestamp the acquired frames, since the frame grabber module has no timing input (IRIG-B, GPS, PTP). To overcome this limitation, an IEEE 1588 PXI timing device is used to provide an accurate way to synchronize distributed data acquisition systems using the Precision Time Protocol (PTP) IEEE 1588-2008 standard. This local timing device can be connected to a master clock device for global synchronization.
The timing device has a timestamp buffer for each PXI trigger line and requires that a software application assign each frame the corresponding timestamp. This step is critical and cannot be achieved if the frame rate is high. To solve this problem, a solution has been designed that distributes the clock from the IEEE 1588 timing card to all FlexRIO devices [3]. This solution uses two PXI trigger lines, which make it possible to assign a timestamp to every acquired frame and to register events in hardware in a deterministic way. The system thus provides a way of timestamping frames so that they can be synchronized with the rest of the signals.
Abstract:
Query rewriting is one of the fundamental steps in ontology-based data access (OBDA) approaches. It takes as inputs an ontology and a query written according to that ontology, and produces as output a set of queries that should be evaluated to account for the inferences that should be considered for that query and ontology. Different query rewriting systems support different ontology languages of varying expressiveness, and the rewritten queries they produce also vary in expressiveness. This heterogeneity has traditionally made it difficult to compare different approaches, and the area generally lacks commonly agreed benchmarks that could be used not only for such comparisons but also for improving OBDA support. In this paper we compile data, dimensions and measurements that have been used to evaluate some of the most recent systems, we analyse and characterise these assets, and we provide a unified set of them that could be used as a starting point towards a more systematic benchmarking process for such systems. Finally, we apply this initial benchmark to some of the most relevant state-of-the-art OBDA approaches.
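To make the input/output contract described above concrete, here is a minimal sketch of query rewriting over a toy ontology that contains only atomic subclass axioms. It is not the procedure of any system evaluated in the paper: the axiom list, the concept names and the functions `subsumees` and `rewrite` are hypothetical, and real OBDA rewriters handle far more expressive axioms.

```python
from itertools import product

# Hypothetical toy ontology: each pair (sub, sup) states that every instance
# of `sub` is also an instance of `sup`. The names are illustrative only.
SUBCLASS_AXIOMS = [("Student", "Person"), ("PhDStudent", "Student")]


def subsumees(concept, axioms):
    """Return `concept` plus every concept that the axioms place below it."""
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in axioms:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found


def rewrite(query, axioms):
    """Expand a conjunctive query into a union of conjunctive queries.

    A query is a tuple of atoms, and each atom is a (concept, variable) pair.
    Every atom may be replaced by any concept that the ontology subsumes.
    """
    choices = [
        sorted((sub, var) for sub in subsumees(concept, axioms))
        for concept, var in query
    ]
    return [tuple(combo) for combo in product(*choices)]


query = (("Person", "x"),)
for conjunctive_query in rewrite(query, SUBCLASS_AXIOMS):
    print(conjunctive_query)
```

Evaluating the three resulting queries directly over the data returns every individual stored as a Person, a Student or a PhDStudent, which is the answer the ontology entails for the original query.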
Abstract:
Ontology-based data access (OBDA) systems use ontologies to provide views over relational databases. Most of these systems work with ontologies implemented in description logic families of reduced expressiveness, which allows efficient query rewriting techniques to be applied for query answering. In this paper we describe a set of optimisations that are applicable to one of the most expressive families used in this context (ELHIO¬). Our resulting system exhibits behaviour comparable to that of systems that handle less expressive logics.
Abstract:
This work has focused on investigating solutions for automating the enrichment of sensor-network data sources with linguistic descriptions, in order to facilitate the subsequent generation of natural language texts. Natural language descriptions make the data accessible to a wider range of users and, as a consequence, allow better use of investments in sensor networks. The work considers the use of open databases to address the need for a large volume and diversity of geographical knowledge. Data enrichment has also been analysed within methodological approaches to data curation and methods of natural language generation. As a result, a general method is proposed, based on a generate-and-test strategy, that includes a way of representing and using heuristic knowledge with several reasoning stages to construct linguistic descriptions for data enrichment. The general proposal was evaluated in three scenarios: two for generating geographical references over complex, real-size sensor networks in the environmental domain, and one for generating temporal references. The evaluation results have shown the practical validity of the general proposal, with performance improvements over other approaches. In addition, the analysis of the results has made it possible to identify and quantify the expected impact of various lines of improvement in open databases.
Abstract:
Introduction: Obesity is one of the major public health problems and has reached epidemic levels in much of the world. Most overweight individuals are women, and in Brazil this population is also sizeable. Women of childbearing age are at the greatest risk of developing obesity, which is associated with excessive weight gain during pregnancy and weight retention after birth. Maternal excess weight is related to negative outcomes for maternal and child health. Objective: To analyse gestational weight and perinatal outcomes in women from the southeast region of Brazil. Method: Cross-sectional study using data from a national hospital-based cohort called Nascer no Brasil: Inquérito Nacional sobre Parto e Nascimento, a survey conducted in 2011 and 2012. Starting from the total initial Southeast sample of 10,154 interviewed women and applying the inclusion and exclusion criteria for this research, a sample of 3,405 mother/newborn dyads was obtained. The variables studied were weight gain, maternal age, pre-pregnancy weight, initial and final body mass index (BMI), gestational age, type of delivery and birth weight. The analysis used measures of central tendency. The Mann-Whitney test was used for normally distributed data and the Pearson coefficient for continuous variables. Results were considered significant at a p-value threshold of 0.05. Results: Most participants were between 21 and 30 years old, births occurred between the 38th and 39th gestational weeks, and the newborns had a median weight of 3,219 g. A large share of the women studied (61.04%) began pregnancy with a nutritional status considered adequate, and 31.51% were overweight before pregnancy.
Excessive weight gain occurred in all pre-pregnancy BMI categories, representing 49.6% of the total population studied. Pre-pregnancy weight showed a high correlation with total weight gain at the end of pregnancy. Gestational weight gain was also observed to influence the mode of delivery, gestational age and the baby's birth weight. Conclusion: Most of the population began pregnancy with adequate nutritional status; however, there was considerable excessive weight gain in all BMI categories, which influenced birth weight and the mode of delivery, with most births occurring by caesarean section. Initial nutritional status strongly influences nutritional status at the end of pregnancy. It is therefore important that intervention programmes act at every stage of this period, including raising awareness of the importance of an adequate weight before conception, as well as promoting actions that support the management of weight gain during pregnancy.
Abstract:
- Mobile telecommunications markets are an important part of the European Commission’s strategy for the completion of the European Union Digital Single Market. The use of mobile telecommunications, particularly mobile data access, is growing and becoming an increasingly important input for the economy.
- The EU currently does not have a unified mobile telecommunications market. The EU compares favourably to the United States in terms of prices and connection speed, but lags behind in terms of coverage of high-speed 4G wireless connections.
- Europe’s long-term goal should be to make data access easier by increasing high-speed wireless coverage while keeping prices down for users. An increase in cross-border competition could help to achieve that goal.
- The Commission has two important levers to help stimulate cross-border supply: (a) ensuring competition in intra-country mobile markets in order to provide an incentive for operators to expand into other jurisdictions, and (b) reducing mobile operators’ costs of expansion into multiple EU countries. The further development of policies on international roaming and radio spectrum management will be central to this effort.
Abstract:
The Continuous Plankton Recorder (CPR) survey, operated by the Sir Alister Hardy Foundation for Ocean Science (SAHFOS), is the largest plankton monitoring programme in the world and has spanned > 70 yr. The dataset contains information from ~200 000 samples, with over 2.3 million records of individual taxa. Here we outline the evolution of the CPR database through changes in technology, and how this has increased data access. Recent high-impact publications and the expanded role of CPR data in marine management demonstrate the usefulness of the dataset. We argue that solely supplying data to the research community is not sufficient in the current research climate; to promote wider use, additional tools need to be developed to provide visual representation and summary statistics. We outline 2 software visualisation tools, SAHFOS WinCPR and the digital CPR Atlas, which provide access to CPR data for both researchers and non-plankton specialists. We also describe future directions of the database, data policy and the development of visualisation tools. We believe that the approach at SAHFOS to increase data accessibility and provide new visualisation tools has enhanced awareness of the data and led to the financial security of the organisation; it also provides a good model of how long-term monitoring programmes can evolve to help secure their future.
Abstract:
Systems biology is based on computational modelling and simulation of large networks of interacting components. Models may be intended to capture processes, mechanisms, components and interactions at different levels of fidelity. Input data are often large and geographically dispersed, and may require the computation to be moved to the data, not vice versa. In addition, complex system-level problems require collaboration across institutions and disciplines. Grid computing can offer robust, scalable solutions for distributed data, compute and expertise. We illustrate some of the range of computational and data requirements in systems biology with three case studies: one requiring large computation but small data (orthologue mapping in comparative genomics), a second involving complex terabyte data (the Visible Cell project) and a third that is both computationally and data-intensive (simulations at multiple temporal and spatial scales). Authentication, authorisation and audit systems currently do not scale well and may present bottlenecks for distributed collaboration, particularly where outcomes may be commercialised. Challenges remain in providing lightweight standards to facilitate the penetration of robust, scalable grid-type computing into diverse user communities to meet the evolving demands of systems biology.
Abstract:
We consider a variation of the prototype combinatorial optimization problem known as graph colouring. Our optimization goal is to colour the vertices of a graph with a fixed number of colours, in a way to maximize the number of different colours present in the set of nearest neighbours of each given vertex. This problem, which we pictorially call palette-colouring, has been recently addressed as a basic example of a problem arising in the context of distributed data storage. Even though it has not been proved to be NP-complete, random search algorithms find the problem hard to solve. Heuristics based on a naive belief propagation algorithm are observed to work quite well in certain conditions. In this paper, we build upon the mentioned result, working out the correct belief propagation algorithm, which needs to take into account the many-body nature of the constraints present in this problem. This method improves the naive belief propagation approach at the cost of increased computational effort. We also investigate the emergence of a satisfiable-to-unsatisfiable 'phase transition' as a function of the vertex mean degree, for different ensembles of sparse random graphs in the large size ('thermodynamic') limit.
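The optimization goal stated above can be written down as a small scoring function. This is only a sketch of the objective, not of the belief propagation algorithm: the name `palette_score`, the adjacency-list representation and the example graph are assumptions, and the sketch counts colours among neighbours only, whereas a formulation that also counts the vertex's own colour would need a one-line change.

```python
def palette_score(adjacency, colouring):
    """Sum, over all vertices, the number of distinct colours among neighbours."""
    return sum(
        len({colouring[neighbour] for neighbour in neighbours})
        for neighbours in adjacency.values()
    )


# Hypothetical example: a cycle of four vertices and two available colours.
adjacency = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

# Adjacent pairs share a colour, so every vertex sees both colours.
paired = {0: "red", 1: "red", 2: "blue", 3: "blue"}

# A proper colouring in the classical sense: every vertex sees one colour.
alternating = {0: "red", 1: "blue", 2: "red", 3: "blue"}

print(palette_score(adjacency, paired))       # 8
print(palette_score(adjacency, alternating))  # 4
```

The comparison shows why this problem differs from classical graph colouring: the alternating assignment is a valid proper colouring but scores lowest here, because each vertex's neighbourhood contains a single colour.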
Abstract:
The purpose of this study was to develop, explicate, and validate a comprehensive model in order to more effectively assess community injury prevention needs, plan and target efforts, identify potential interventions, and provide a framework for an outcome-based evaluation of the effectiveness of interventions. A systems model approach was developed to conceptualize the major components of inputs, efforts, outcomes and feedback within a community setting. Profiling of multiple data sources demonstrated a community feedback mechanism that increased awareness of priority issues and elicited support from traditional as well as non-traditional injury prevention partners. Injury countermeasures including education, enforcement, engineering, and economic incentives were presented for their potential synergistic effect impacting on knowledge, attitudes, or behaviors of a targeted population. Levels of outcome data were classified into ultimate, intermediate and immediate indicators to assist with determining the effectiveness of intervention efforts. A collaboration between business and health care was successful in achieving data access and use of an emergency department level of injury data for monitoring of the impact of community interventions. Evaluation of injury events and preventive efforts within the context of a dynamic community systems environment was applied to a study community with examples detailing actual profiling and trending of injuries. The resulting model of community injury prevention was validated using a community focus group, community injury prevention coordinators, and injury prevention national experts.
Abstract:
Disk drives are the bottleneck in the processing of large amounts of data used in almost all common applications. File systems attempt to reduce this by storing data sequentially on the disk drives, thereby reducing the access latencies. Although this strategy is useful when data is retrieved sequentially, the access patterns in real-world workloads are not necessarily sequential, and this mismatch results in storage I/O performance degradation. This thesis demonstrates that one way to improve storage performance is to reorganize data on disk drives in the same way in which it is mostly accessed. We identify two classes of accesses: static, where access patterns do not change over the lifetime of the data, and dynamic, where access patterns frequently change over short durations of time. We propose, implement and evaluate layout strategies for each of these. Our strategies are implemented in such a way that they can be seamlessly integrated into or removed from the system as desired. We evaluate our layout strategies for static policies using tree-structured XML data, where accesses to the storage device are mostly of two kinds: parent-to-child or child-to-sibling. Our results show that for a specific class of deep-focused queries, the existing file system layout policy performs better by 5–54X. For the non-deep-focused queries, our native layout mechanism shows an improvement of 3–127X. To improve performance of the dynamic access patterns, we implement a self-optimizing storage system that rearranges popular block accesses on a dedicated partition based on the observed workload characteristics. Our evaluation shows an improvement of over 80% in the disk busy times over a range of workloads. These results show that applying the knowledge of data access patterns for allocation decisions can substantially improve the I/O performance.
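The idea of rearranging popular blocks on a dedicated partition can be illustrated with a short planning step. The abstract does not say how popularity is measured or how blocks are placed, so this sketch assumes a plain access-frequency ranking; the function `plan_hot_layout`, its parameters and the block numbers are all hypothetical.

```python
from collections import Counter


def plan_hot_layout(access_trace, partition_start, capacity):
    """Map the most frequently accessed blocks to contiguous partition slots.

    Assumption: popularity is the raw access count in the observed trace.
    """
    counts = Counter(access_trace)
    # Most popular first; ties are broken by block number for determinism.
    ranked = sorted(counts, key=lambda block: (-counts[block], block))
    return {
        block: partition_start + slot
        for slot, block in enumerate(ranked[:capacity])
    }


# Hypothetical trace of logical block numbers observed during a workload.
trace = [907, 12, 455, 907, 12, 907, 3010, 455, 907]
remap = plan_hot_layout(trace, partition_start=100_000, capacity=3)
print(remap)  # {907: 100000, 12: 100001, 455: 100002}
```

Blocks that were scattered across the disk end up adjacent on the dedicated partition, so repeated accesses to them need shorter seeks; block 3010 is left in place because it falls outside the chosen capacity.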
Abstract:
Given the growing demand for mobile application development, driven by the increasingly common use of smartphones and tablets, there is a growing need for full remote data access in mobile applications used in environments where network access is not available at all times. Given this reality, this work proposes a framework whose main functions are to provide a mechanism for persistence, replication and data synchronization, covering the creation, deletion, update and display of persisted or requested data even when the mobile device has no network connectivity. From the point of view of architecture and programming practices, this is reflected in the definition of strategies that allow the main functions of the framework to be met. A controlled study was carried out to validate the proposed solution; it found gains in reducing the number of lines of code and the amount of time required to develop an application, without any significant increase for the operations.
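A minimal sketch of the persistence-plus-synchronization pattern described above follows. It is not the framework's API: the class `OfflineStore`, its method names and the dict standing in for the remote server are assumptions made for illustration, and conflict resolution is left out entirely.

```python
class OfflineStore:
    """Local persistence with a queue of changes awaiting replication."""

    def __init__(self):
        self.local = {}    # data that stays readable without connectivity
        self.pending = []  # operations not yet replicated to the server

    def save(self, key, value):
        """Create or update a record locally and queue it for replication."""
        self.local[key] = value
        self.pending.append(("put", key, value))

    def delete(self, key):
        """Delete a record locally and queue the deletion for replication."""
        self.local.pop(key, None)
        self.pending.append(("delete", key, None))

    def get(self, key):
        """Read from local storage, which works while offline."""
        return self.local.get(key)

    def synchronize(self, remote):
        """Replay queued operations, in order, against a dict-like remote."""
        for operation, key, value in self.pending:
            if operation == "put":
                remote[key] = value
            else:
                remote.pop(key, None)
        sent = len(self.pending)
        self.pending.clear()
        return sent


server = {}  # hypothetical stand-in for the remote data source
store = OfflineStore()
store.save("note:1", "draft")   # performed while offline
store.save("note:1", "final")
store.delete("note:2")
print(store.get("note:1"))      # final
print(store.synchronize(server), server)
```

Replaying the queue in order keeps the remote copy consistent with the sequence of local edits; a production design would also need to handle failed transfers and changes made on the server in the meantime.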