9 resultados para Distributed data access

em Universidade do Minho


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Distributed data aggregation is an important task, allowing the de- centralized determination of meaningful global properties, that can then be used to direct the execution of other applications. The resulting val- ues result from the distributed computation of functions like count, sum and average. Some application examples can found to determine the network size, total storage capacity, average load, majorities and many others. In the last decade, many di erent approaches have been pro- posed, with di erent trade-o s in terms of accuracy, reliability, message and time complexity. Due to the considerable amount and variety of ag- gregation algorithms, it can be di cult and time consuming to determine which techniques will be more appropriate to use in speci c settings, jus- tifying the existence of a survey to aid in this task. This work reviews the state of the art on distributed data aggregation algorithms, providing three main contributions. First, it formally de nes the concept of aggrega- tion, characterizing the di erent types of aggregation functions. Second, it succinctly describes the main aggregation techniques, organizing them in a taxonomy. Finally, it provides some guidelines toward the selection and use of the most relevant techniques, summarizing their principal characteristics.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Large scale distributed data stores rely on optimistic replication to scale and remain highly available in the face of net work partitions. Managing data without coordination results in eventually consistent data stores that allow for concurrent data updates. These systems often use anti-entropy mechanisms (like Merkle Trees) to detect and repair divergent data versions across nodes. However, in practice hash-based data structures are too expensive for large amounts of data and create too many false conflicts. Another aspect of eventual consistency is detecting write conflicts. Logical clocks are often used to track data causality, necessary to detect causally concurrent writes on the same key. However, there is a nonnegligible metadata overhead per key, which also keeps growing with time, proportional with the node churn rate. Another challenge is deleting keys while respecting causality: while the values can be deleted, perkey metadata cannot be permanently removed without coordination. Weintroduceanewcausalitymanagementframeworkforeventuallyconsistentdatastores,thatleveragesnodelogicalclocks(BitmappedVersion Vectors) and a new key logical clock (Dotted Causal Container) to provides advantages on multiple fronts: 1) a new efficient and lightweight anti-entropy mechanism; 2) greatly reduced per-key causality metadata size; 3) accurate key deletes without permanent metadata.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Dissertação de mestrado integrado em Engenharia de Gestão e Sistemas de Informação

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Data traces, consisting of logs about the use of mobile and wireless networks, have been used to study the statistics of encounters between mobile nodes, in an attempt to predict the performance of opportunistic networks. Understanding the role and potential of mobile devices as relaying nodes in message dissemination and delivery depends on the knowledge about patterns and number of encounters among nodes. Data traces about the use of WiFi networks are widely available and can be used to extract large datasets of encounters between nodes. However, these logs only capture indirect encounters between nodes, and the resulting encounters datasets might not realistically represent the spatial and temporal behaviour of nodes. This paper addresses the impact of overlapping between the coverage areas of different Access Points of WiFi networks in extracting encounters datasets from the usage logs. Simulation and real-world experimental results show that indirect encounter traces extracted directly from these logs strongly underestimate the opportunities for direct node-to- node message exchange in opportunistic networks.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We are living in the era of Big Data. A time which is characterized by the continuous creation of vast amounts of data, originated from different sources, and with different formats. First, with the rise of the social networks and, more recently, with the advent of the Internet of Things (IoT), in which everyone and (eventually) everything is linked to the Internet, data with enormous potential for organizations is being continuously generated. In order to be more competitive, organizations want to access and explore all the richness that is present in those data. Indeed, Big Data is only as valuable as the insights organizations gather from it to make better decisions, which is the main goal of Business Intelligence. In this paper we describe an experiment in which data obtained from a NoSQL data source (database technology explicitly developed to deal with the specificities of Big Data) is used to feed a Business Intelligence solution.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dissertação de Mestrado em Engenharia Informática

Relevância:

30.00% 30.00%

Publicador:

Resumo:

OpenAIRE supports the European Commission Open Access policy by providing an infrastructure for researchers to comply with the European Union Open Access mandate. The current OpenAIRE infrastructure and services, resulting from OpenAIRE and OpenAIREplus FP7 projects, builds on Open Access research results from a wide range of repositories and other data sources: institutional or thematic publication repositories, Open Access journals, data repositories, Current Research Information Systems and aggregators. (...)

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We study the problem of privacy-preserving proofs on authenticated data, where a party receives data from a trusted source and is requested to prove computations over the data to third parties in a correct and private way, i.e., the third party learns no information on the data but is still assured that the claimed proof is valid. Our work particularly focuses on the challenging requirement that the third party should be able to verify the validity with respect to the specific data authenticated by the source — even without having access to that source. This problem is motivated by various scenarios emerging from several application areas such as wearable computing, smart metering, or general business-to-business interactions. Furthermore, these applications also demand any meaningful solution to satisfy additional properties related to usability and scalability. In this paper, we formalize the above three-party model, discuss concrete application scenarios, and then we design, build, and evaluate ADSNARK, a nearly practical system for proving arbitrary computations over authenticated data in a privacy-preserving manner. ADSNARK improves significantly over state-of-the-art solutions for this model. For instance, compared to corresponding solutions based on Pinocchio (Oakland’13), ADSNARK achieves up to 25× improvement in proof-computation time and a 20× reduction in prover storage space.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Patient blood pressure is an important vital signal to the physicians take a decision and to better understand the patient condition. In Intensive Care Units is possible monitoring the blood pressure due the fact of the patient being in continuous monitoring through bedside monitors and the use of sensors. The intensivist only have access to vital signs values when they look to the monitor or consult the values hourly collected. Most important is the sequence of the values collected, i.e., a set of highest or lowest values can signify a critical event and bring future complications to a patient as is Hypotension or Hypertension. This complications can leverage a set of dangerous diseases and side-effects. The main goal of this work is to predict the probability of a patient has a blood pressure critical event in the next hours by combining a set of patient data collected in real-time and using Data Mining classification techniques. As output the models indicate the probability (%) of a patient has a Blood Pressure Critical Event in the next hour. The achieved results showed to be very promising, presenting sensitivity around of 95%.