937 resultados para Data structures (Computer science)


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Distributed systems are widely used for solving large-scale and data-intensive computing problems, including all-to-all comparison (ATAC) problems. However, when used for ATAC problems, existing computational frameworks such as Hadoop focus on load balancing for allocating comparison tasks, without careful consideration of data distribution and storage usage. While Hadoop-based solutions provide users with simplicity of implementation, their inherent MapReduce computing pattern does not match the ATAC pattern. This leads to load imbalances and poor data locality when Hadoop's data distribution strategy is used for ATAC problems. Here we present a data distribution strategy which considers data locality, load balancing and storage savings for ATAC computing problems in homogeneous distributed systems. A simulated annealing algorithm is developed for data distribution and task scheduling. Experimental results show a significant performance improvement for our approach over Hadoop-based solutions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The increase in data center dependent services has made energy optimization of data centers one of the most exigent challenges in today's Information Age. The necessity of green and energy-efficient measures is very high for reducing carbon footprint and exorbitant energy costs. However, inefficient application management of data centers results in high energy consumption and low resource utilization efficiency. Unfortunately, in most cases, deploying an energy-efficient application management solution inevitably degrades the resource utilization efficiency of the data centers. To address this problem, a Penalty-based Genetic Algorithm (GA) is presented in this paper to solve a defined profile-based application assignment problem whilst maintaining a trade-off between the power consumption performance and resource utilization performance. Case studies show that the penalty-based GA is highly scalable and provides 16% to 32% better solutions than a greedy algorithm.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the past few years, the virtual machine (VM) placement problem has been studied intensively and many algorithms for the VM placement problem have been proposed. However, those proposed VM placement algorithms have not been widely used in today's cloud data centers as they do not consider the migration cost from current VM placement to the new optimal VM placement. As a result, the gain from optimizing VM placement may be less than the loss of the migration cost from current VM placement to the new VM placement. To address this issue, this paper presents a penalty-based genetic algorithm (GA) for the VM placement problem that considers the migration cost in addition to the energy-consumption of the new VM placement and the total inter-VM traffic flow in the new VM placement. The GA has been implemented and evaluated by experiments, and the experimental results show that the GA outperforms two well known algorithms for the VM placement problem.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Although live VM migration has been intensively studied, the problem of live migration of multiple interdependent VMs has hardly been investigated. The most important problem in the live migration of multiple interdependent VMs is how to schedule VM migrations as the schedule will directly affect the total migration time and the total downtime of those VMs. Aiming at minimizing both the total migration time and the total downtime simultaneously, this paper presents a Strength Pareto Evolutionary Algorithm 2 (SPEA2) for the multi-VM migration scheduling problem. The SPEA2 has been evaluated by experiments, and the experimental results show that the SPEA2 can generate a set of VM migration schedules with a shorter total migration time and a shorter total downtime than an existing genetic algorithm, namely Random Key Genetic Algorithm (RKGA). This paper also studies the scalability of the SPEA2.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The concept of big data has already outperformed traditional data management efforts in almost all industries. Other instances it has succeeded in obtaining promising results that provide value from large-scale integration and analysis of heterogeneous data sources for example Genomic and proteomic information. Big data analytics have become increasingly important in describing the data sets and analytical techniques in software applications that are so large and complex due to its significant advantages including better business decisions, cost reduction and delivery of new product and services [1]. In a similar context, the health community has experienced not only more complex and large data content, but also information systems that contain a large number of data sources with interrelated and interconnected data attributes. That have resulted in challenging, and highly dynamic environments leading to creation of big data with its enumerate complexities, for instant sharing of information with the expected security requirements of stakeholders. When comparing big data analysis with other sectors, the health sector is still in its early stages. Key challenges include accommodating the volume, velocity and variety of healthcare data with the current deluge of exponential growth. Given the complexity of big data, it is understood that while data storage and accessibility are technically manageable, the implementation of Information Accountability measures to healthcare big data might be a practical solution in support of information security, privacy and traceability measures. Transparency is one important measure that can demonstrate integrity which is a vital factor in the healthcare service. Clarity about performance expectations is considered to be another Information Accountability measure which is necessary to avoid data ambiguity and controversy about interpretation and finally, liability [2]. According to current studies [3] Electronic Health Records (EHR) are key information resources for big data analysis and is also composed of varied co-created values [3]. Common healthcare information originates from and is used by different actors and groups that facilitate understanding of the relationship for other data sources. Consequently, healthcare services often serve as an integrated service bundle. Although a critical requirement in healthcare services and analytics, it is difficult to find a comprehensive set of guidelines to adopt EHR to fulfil the big data analysis requirements. Therefore as a remedy, this research work focus on a systematic approach containing comprehensive guidelines with the accurate data that must be provided to apply and evaluate big data analysis until the necessary decision making requirements are fulfilled to improve quality of healthcare services. Hence, we believe that this approach would subsequently improve quality of life.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the ever increasing amount of eHealth data available from various eHealth systems and sources, Health Big Data Analytics promises enticing benefits such as enabling the discovery of new treatment options and improved decision making. However, concerns over the privacy of information have hindered the aggregation of this information. To address these concerns, we propose the use of Information Accountability protocols to provide patients with the ability to decide how and when their data can be shared and aggregated for use in big data research. In this paper, we discuss the issues surrounding Health Big Data Analytics and propose a consent-based model to address privacy concerns to aid in achieving the promised benefits of Big Data in eHealth.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Concerns over the security and privacy of patient information are one of the biggest hindrances to sharing health information and the wide adoption of eHealth systems. At present, there are competing requirements between healthcare consumers' (i.e. patients) requirements and healthcare professionals' (HCP) requirements. While consumers want control over their information, healthcare professionals want access to as much information as required in order to make well-informed decisions and provide quality care. In order to balance these requirements, the use of an Information Accountability Framework devised for eHealth systems has been proposed. In this paper, we take a step closer to the adoption of the Information Accountability protocols and demonstrate their functionality through an implementation in FluxMED, a customisable EHR system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The world has experienced a large increase in the amount of available data. Therefore, it requires better and more specialized tools for data storage and retrieval and information privacy. Recently Electronic Health Record (EHR) Systems have emerged to fulfill this need in health systems. They play an important role in medicine by granting access to information that can be used in medical diagnosis. Traditional systems have a focus on the storage and retrieval of this information, usually leaving issues related to privacy in the background. Doctors and patients may have different objectives when using an EHR system: patients try to restrict sensible information in their medical records to avoid misuse information while doctors want to see as much information as possible to ensure a correct diagnosis. One solution to this dilemma is the Accountable e-Health model, an access protocol model based in the Information Accountability Protocol. In this model patients are warned when doctors access their restricted data. They also enable a non-restrictive access for authenticated doctors. In this work we use FluxMED, an EHR system, and augment it with aspects of the Information Accountability Protocol to address these issues. The Implementation of the Information Accountability Framework (IAF) in FluxMED provides ways for both patients and physicians to have their privacy and access needs achieved. Issues related to storage and data security are secured by FluxMED, which contains mechanisms to ensure security and data integrity. The effort required to develop a platform for the management of medical information is mitigated by the FluxMED's workflow-based architecture: the system is flexible enough to allow the type and amount of information being altered without the need to change in your source code.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A key component of robotic path planning is ensuring that one can reliably navigate a vehicle to a desired location. In addition, when the features of interest are dynamic and move with oceanic currents, vehicle speed plays an important role in the planning exercise to ensure that vehicles are in the right place at the right time. Aquatic robot design is moving towards utilizing the environment for propulsion rather than traditional motors and propellers. These new vehicles are able to realize significantly increased endurance, however the mission planning problem, in turn, becomes more difficult as the vehicle velocity is not directly controllable. In this paper, we examine Gaussian process models applied to existing wave model data to predict the behavior, i.e., velocity, of a Wave Glider Autonomous Surface Vehicle. Using training data from an on-board sensor and forecasting with the WAVEWATCH III model, our probabilistic regression models created an effective method for forecasting WG velocity.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, the results of the time dispersion parameters obtained from a set of channel measurements conducted in various environments that are typical of multiuser Infostation application scenarios are presented. The measurement procedure takes into account the practical scenarios typical of the positions and movements of the users in the particular Infostation network. To provide one with the knowledge of how much data can be downloaded by users over a given time and mobile speed, data transfer analysis for multiband orthogonal frequency division multiplexing (MB-OFDM) is presented. As expected, the rough estimate of simultaneous data transfer in a multiuser Infostation scenario indicates dependency of the percentage of download on the data size, number and speed of the users, and the elapse time.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Rolling-element bearing failures are the most frequent problems in rotating machinery, which can be catastrophic and cause major downtime. Hence, providing advance failure warning and precise fault detection in such components are pivotal and cost-effective. The vast majority of past research has focused on signal processing and spectral analysis for fault diagnostics in rotating components. In this study, a data mining approach using a machine learning technique called anomaly detection (AD) is presented. This method employs classification techniques to discriminate between defect examples. Two features, kurtosis and Non-Gaussianity Score (NGS), are extracted to develop anomaly detection algorithms. The performance of the developed algorithms was examined through real data from a test to failure bearing. Finally, the application of anomaly detection is compared with one of the popular methods called Support Vector Machine (SVM) to investigate the sensitivity and accuracy of this approach and its ability to detect the anomalies in early stages.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we show implementation results of various algorithms that sort data encrypted with Fully Homomorphic Encryption scheme based on Integers. We analyze the complexities of sorting algorithms over encrypted data by considering Bubble Sort, Insertion Sort, Bitonic Sort and Odd-Even Merge sort. Our complexity analysis together with implementation results show that Odd-Even Merge Sort has better performance than the other sorting techniques. We observe that complexity of sorting in homomorphic domain will always have worst case complexity independent of the nature of input. In addition, we show that combining different sorting algorithms to sort encrypted data does not give any performance gain when compared to the application of sorting algorithms individually.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Rapid advances in sequencing technologies (Next Generation Sequencing or NGS) have led to a vast increase in the quantity of bioinformatics data available, with this increasing scale presenting enormous challenges to researchers seeking to identify complex interactions. This paper is concerned with the domain of transcriptional regulation, and the use of visualisation to identify relationships between specific regulatory proteins (the transcription factors or TFs) and their associated target genes (TGs). We present preliminary work from an ongoing study which aims to determine the effectiveness of different visual representations and large scale displays in supporting discovery. Following an iterative process of implementation and evaluation, representations were tested by potential users in the bioinformatics domain to determine their efficacy, and to understand better the range of ad hoc practices among bioinformatics literate users. Results from two rounds of small scale user studies are considered with initial findings suggesting that bioinformaticians require richly detailed views of TF data, features to compare TF layouts between organisms quickly, and ways to keep track of interesting data points.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Sensor networks for environmental monitoring present enormous benefits to the community and society as a whole. Currently there is a need for low cost, compact, solar powered sensors suitable for deployment in rural areas. The purpose of this research is to develop both a ground based wireless sensor network and data collection using unmanned aerial vehicles. The ground based sensor system is capable of measuring environmental data such as temperature or air quality using cost effective low power sensors. The sensor will be configured such that its data is stored on an ATMega16 microcontroller which will have the capability of communicating with a UAV flying overhead using UAV communication protocols. The data is then either sent to the ground in real time or stored on the UAV using a microcontroller until it lands or is close enough to enable the transmission of data to the ground station.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This technical report describes a Light Detection and Ranging (LiDAR) augmented optimal path planning at low level flight methodology for remote sensing and sampling Unmanned Aerial Vehicles (UAV). The UAV is used to perform remote air sampling and data acquisition from a network of sensors on the ground. The data that contains information on the terrain is in the form of a 3D point clouds maps is processed by the algorithms to find an optimal path. The results show that the method and algorithm are able to use the LiDAR data to avoid obstacles when planning a path from a start to a target point. The report compares the performance of the method as the resolution of the LIDAR map is increased and when a Digital Elevation Model (DEM) is included. From a practical point of view, the optimal path plan is loaded and works seemingly with the UAV ground station and also shows the UAV ground station software augmented with more accurate LIDAR data.