937 resultados para Distributed data


Relevância:

70.00% 70.00%

Publicador:

Resumo:

In wireless sensor networks where nodes are powered by batteries, it is critical to prolong the network lifetime by minimizing the energy consumption of each node. In this paper, the cooperative multiple-input-multiple-output (MIMO) and data-aggregation techniques are jointly adopted to reduce the energy consumption per bit in wireless sensor networks by reducing the amount of data for transmission and better using network resources through cooperative communication. For this purpose, we derive a new energy model that considers the correlation between data generated by nodes and the distance between them for a cluster-based sensor network by employing the combined techniques. Using this model, the effect of the cluster size on the average energy consumption per node can be analyzed. It is shown that the energy efficiency of the network can significantly be enhanced in cooperative MIMO systems with data aggregation, compared with either cooperative MIMO systems without data aggregation or data-aggregation systems without cooperative MIMO, if sensor nodes are properly clusterized. Both centralized and distributed data-aggregation schemes for the cooperating nodes to exchange and compress their data are also proposed and appraised, which lead to diverse impacts of data correlation on the energy performance of the integrated cooperative MIMO and data-aggregation systems.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Cloud computing enables independent end users and applications to share data and pooled resources, possibly located in geographically distributed Data Centers, in a fully transparent way. This need is particularly felt by scientific applications to exploit distributed resources in efficient and scalable way for the processing of big amount of data. This paper proposes an open so- lution to deploy a Platform as a service (PaaS) over a set of multi- site data centers by applying open source virtualization tools to facilitate operation among virtual machines while optimizing the usage of distributed resources. An experimental testbed is set up in Openstack environment to obtain evaluations with different types of TCP sample connections to demonstrate the functionality of the proposed solution and to obtain throughput measurements in relation to relevant design parameters.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The last decades have been characterized by a continuous adoption of IT solutions in the healthcare sector, which resulted in the proliferation of tremendous amounts of data over heterogeneous systems. Distinct data types are currently generated, manipulated, and stored, in the several institutions where patients are treated. The data sharing and an integrated access to this information will allow extracting relevant knowledge that can lead to better diagnostics and treatments. This thesis proposes new integration models for gathering information and extracting knowledge from multiple and heterogeneous biomedical sources. The scenario complexity led us to split the integration problem according to the data type and to the usage specificity. The first contribution is a cloud-based architecture for exchanging medical imaging services. It offers a simplified registration mechanism for providers and services, promotes remote data access, and facilitates the integration of distributed data sources. Moreover, it is compliant with international standards, ensuring the platform interoperability with current medical imaging devices. The second proposal is a sensor-based architecture for integration of electronic health records. It follows a federated integration model and aims to provide a scalable solution to search and retrieve data from multiple information systems. The last contribution is an open architecture for gathering patient-level data from disperse and heterogeneous databases. All the proposed solutions were deployed and validated in real world use cases.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due the several advantages provided by the Cloud such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus, a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully. I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system, that utilizes workload-aware data placement and replication to minimize the number of distributed transactions that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data, and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has been traditionally used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides the data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, that provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud. The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks whose computation and execution models limit the user program to directly access the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the relevant portions of the graph that are of interest to an analysis task and loading it onto distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over largescale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

With the CERN LHC program underway, there has been an acceleration of data growth in the High Energy Physics (HEP) field and the usage of Machine Learning (ML) in HEP will be critical during the HL-LHC program when the data that will be produced will reach the exascale. ML techniques have been successfully used in many areas of HEP nevertheless, the development of a ML project and its implementation for production use is a highly time-consuming task and requires specific skills. Complicating this scenario is the fact that HEP data is stored in ROOT data format, which is mostly unknown outside of the HEP community. The work presented in this thesis is focused on the development of a ML as a Service (MLaaS) solution for HEP, aiming to provide a cloud service that allows HEP users to run ML pipelines via HTTP calls. These pipelines are executed by using the MLaaS4HEP framework, which allows reading data, processing data, and training ML models directly using ROOT files of arbitrary size from local or distributed data sources. Such a solution provides HEP users non-expert in ML with a tool that allows them to apply ML techniques in their analyses in a streamlined manner. Over the years the MLaaS4HEP framework has been developed, validated, and tested and new features have been added. A first MLaaS solution has been developed by automatizing the deployment of a platform equipped with the MLaaS4HEP framework. Then, a service with APIs has been developed, so that a user after being authenticated and authorized can submit MLaaS4HEP workflows producing trained ML models ready for the inference phase. A working prototype of this service is currently running on a virtual machine of INFN-Cloud and is compliant to be added to the INFN Cloud portfolio of services.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The Internet of Vehicles (IoV) paradigm has emerged in recent times, where with the support of technologies like the Internet of Things and V2X , Vehicular Users (VUs) can access different services through internet connectivity. With the support of 6G technology, the IoV paradigm will evolve further and converge into a fully connected and intelligent vehicular system. However, this brings new challenges over dynamic and resource-constrained vehicular systems, and advanced solutions are demanded. This dissertation analyzes the future 6G enabled IoV systems demands, corresponding challenges, and provides various solutions to address them. The vehicular services and application requests demands proper data processing solutions with the support of distributed computing environments such as Vehicular Edge Computing (VEC). While analyzing the performance of VEC systems it is important to take into account the limited resources, coverage, and vehicular mobility into account. Recently, Non terrestrial Networks (NTN) have gained huge popularity for boosting the coverage and capacity of terrestrial wireless networks. Integrating such NTN facilities into the terrestrial VEC system can address the above mentioned challenges. Additionally, such integrated Terrestrial and Non-terrestrial networks (T-NTN) can also be considered to provide advanced intelligent solutions with the support of the edge intelligence paradigm. In this dissertation, we proposed an edge computing-enabled joint T-NTN-based vehicular system architecture to serve VUs. Next, we analyze the terrestrial VEC systems performance for VUs data processing problems and propose solutions to improve the performance in terms of latency and energy costs. Next, we extend the scenario toward the joint T-NTN system and address the problem of distributed data processing through ML-based solutions. We also proposed advanced distributed learning frameworks with the support of a joint T-NTN framework with edge computing facilities. In the end, proper conclusive remarks and several future directions are provided for the proposed solutions.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The overall goal of the REMPLI project is to design and implement a communication infrastructure for distributed data acquisition and remote control operations using the power grid as the communication medium. The primary target application is remote meter reading with high time resolution, where the meters can be energy, heat, gas, or water meters. The users of the system (e.g. utility companies) will benefit from the REMPLI system by gaining more detailed information about how energy is consumed by the end-users. In this context, the power-line communication (PLC) is deployed to cover the distance between utility company’s Private Network and the end user. This document specifies a protocol for real-time PLC, in the framework of the REMPLI project. It mainly comprises the Network Layer and Data Link Layer. The protocol was designed having into consideration the specific aspects of the network: different network typologies (star, tree, ring, multiple paths), dynamic changes in network topology (due to network maintenance, hazards, etc.), communication lines strongly affected by noise.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In global scientific experiments with collaborative scenarios involving multinational teams there are big challenges related to data access, namely data movements are precluded to other regions or Clouds due to the constraints on latency costs, data privacy and data ownership. Furthermore, each site is processing local data sets using specialized algorithms and producing intermediate results that are helpful as inputs to applications running on remote sites. This paper shows how to model such collaborative scenarios as a scientific workflow implemented with AWARD (Autonomic Workflow Activities Reconfigurable and Dynamic), a decentralized framework offering a feasible solution to run the workflow activities on distributed data centers in different regions without the need of large data movements. The AWARD workflow activities are independently monitored and dynamically reconfigured and steering by different users, namely by hot-swapping the algorithms to enhance the computation results or by changing the workflow structure to support feedback dependencies where an activity receives feedback output from a successor activity. A real implementation of one practical scenario and its execution on multiple data centers of the Amazon Cloud is presented including experimental results with steering by multiple users.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Trabalho apresentado no âmbito do Mestrado em Engenharia Informática, como requisito parcial para obtenção do grau de Mestre em Engenharia Informática

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Trabalho de Projecto apresentado como requisito parcial para obtenção do grau de Mestre em Ciência e Sistemas de Informação Geográfica

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A mesura que la investigació depèn cada vegada més dels computadors, l'emmagatzematge de dades comença a convertir-se en un recurs escàs per als projectes, i suposa una gran part del cost total. Alguns projectes intenten resoldre aquest problema emprant emmagatzament distribuït. És doncs necessari que alguns centres proveeixin de grans quantitats d'emmagatzematge massiu de baix cost basat en cintes magnètiques. L'inconvenient d'aquesta solució és que el rendiment disminueix, particularment a l'hora de tractar-se de grans quantitats d'arxius petits. El nostre objectiu és crear un híbrid entre un sistema d'alt cost i rendiment basat en discs, i un de baix cost i rendiment basat en cintes. Per això, unirem dCache, un sistema d'emmagatzematge distribuït, amb Castor, un sistema d'emmagatzematge jeràrquic, creant sistemes de fitxers virtuals que contindran grans quantitats d'arxius petits per millorar el rendiment global del sistema.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

PURPOSE: To investigate the prevalence of chromosomal abnormalities in couples with two or more recurrent first trimester miscarriages of unknown cause. METHODS: The study was conducted on 151 women and 94 partners who had an obstetrical history of two or more consecutive first trimester abortions (1-12 weeks of gestation). The controls were 100 healthy women without a history of pregnancy loss. Chromosomal analysis was performed on peripheral blood lymphocytes cultured for 72 hours, using Trypsin-Giemsa (GTG) banding. In all cases, at least 30 metaphases were analyzed and 2 karyotypes were prepared, using light microscopy. The statistical analysis was performed using the Student t-test for normally distributed data and the Mann-Whitney test for non-parametric data. The Kruskal-Wallis test or Analysis of Variance was used to compare the mean values between three or more groups. The software used was Statistical Package for the Social Sciences (SPSS), version 17.0. RESULTS: The frequency of chromosomal abnormalities in women with recurrent miscarriages was 7.3%, including 4.7% with X-chromosome mosaicism, 2% with reciprocal translocations and 0.6% with Robertsonian translocations. A total of 2.1% of the partners of women with recurrent miscarriages had chromosomal abnormalities, including 1% with X-chromosome mosaicism and 1% with inversions. Among the controls, 1% had mosaicism. CONCLUSION: An association between chromosomal abnormalities and recurrent miscarriage in the first trimester of pregnancy (OR=7.7; 95%CI 1.2--170.5) was observed in the present study. Etiologic identification of genetic factors represents important clinical information for genetic counseling and orientation of the couple about the risk for future pregnancies and decreases the number of investigations needed to elucidate the possible causes of miscarriages.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The evolution of wireless sensor network technology has enabled us to develop advanced systems for real time monitoring. In the present scenario wireless sensor networks are increasingly being used for precision agriculture. The advantages of using wireless sensor networks in agriculture are distributed data collection and monitoring, monitor and control of climate, irrigation and nutrient supply. Hence decreasing the cost of production and increasing the efficiency of production.This paper describes the application of wireless sensor network for crop monitoring in the paddy fields of kuttand, a region of Kerala, the southern state of India.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The evolution of wireless sensor network technology has enabled us to develop advanced systems for real time monitoring. In the present scenario wireless sensor networks are increasingly being used for precision agriculture. The advantages of using wireless sensor networks in agriculture are distributed data collection and monitoring, monitor and control of climate, irrigation and nutrient supply. Hence decreasing the cost of production and increasing the efficiency of production. This paper describes the development and deployment of wireless sensor network for crop monitoring in the paddy fields of Kuttanad, a region of Kerala, the southern state of India.