3 resultados para Distributed data
em DigitalCommons@University of Nebraska - Lincoln
Resumo:
Centralized and Distributed methods are two connection management schemes in wavelength convertible optical networks. In the earlier work, the centralized scheme is said to have lower network blocking probability than the distributed one. Hence, much of the previous work in connection management has focused on the comparison of different algorithms in only distributed scheme or in only centralized scheme. However, we believe that the network blocking probability of these two connection management schemes depends, to a great extent, on the network traffic patterns and reservation times. Our simulation results reveal that the performance improvement (in terms of blocking probability) of centralized method over distributed method is inversely proportional to the ratio of average connection interarrival time to reservation time. After that ratio increases beyond a threshold, those two connection management schemes yield almost the same blocking probability under the same network load. In this paper, we review the working procedure of distributed and centralized schemes, discuss the tradeoff between them, compare these two methods under different network traffic patterns via simulation and give our conclusion based on the simulation data.
Resumo:
In this paper, we propose a Loss Tolerant Reliable (LTR) data transport mechanism for dynamic Event Sensing (LTRES) in WSNs. In LTRES, a reliable event sensing requirement at the transport layer is dynamically determined by the sink. A distributed source rate adaptation mechanism is designed, incorporating a loss rate based lightweight congestion control mechanism, to regulate the data traffic injected into the network so that the reliability requirement can be satisfied. An equation based fair rate control algorithm is used to improve the fairness among the LTRES flows sharing the congestion path. The performance evaluations show that LTRES can provide LTR data transport service for multiple events with short convergence time, low lost rate and high overall bandwidth utilization.
Resumo:
Hundreds of Terabytes of CMS (Compact Muon Solenoid) data are being accumulated for storage day by day at the University of Nebraska-Lincoln, which is one of the eight US CMS Tier-2 sites. Managing this data includes retaining useful CMS data sets and clearing storage space for newly arriving data by deleting less useful data sets. This is an important task that is currently being done manually and it requires a large amount of time. The overall objective of this study was to develop a methodology to help identify the data sets to be deleted when there is a requirement for storage space. CMS data is stored using HDFS (Hadoop Distributed File System). HDFS logs give information regarding file access operations. Hadoop MapReduce was used to feed information in these logs to Support Vector Machines (SVMs), a machine learning algorithm applicable to classification and regression which is used in this Thesis to develop a classifier. Time elapsed in data set classification by this method is dependent on the size of the input HDFS log file since the algorithmic complexities of Hadoop MapReduce algorithms here are O(n). The SVM methodology produces a list of data sets for deletion along with their respective sizes. This methodology was also compared with a heuristic called Retention Cost which was calculated using size of the data set and the time since its last access to help decide how useful a data set is. Accuracies of both were compared by calculating the percentage of data sets predicted for deletion which were accessed at a later instance of time. Our methodology using SVMs proved to be more accurate than using the Retention Cost heuristic. This methodology could be used to solve similar problems involving other large data sets.