980 resultados para communication cost


Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the explosion of big data, processing large numbers of continuous data streams, i.e., big data stream processing (BDSP), has become a crucial requirement for many scientific and industrial applications in recent years. By offering a pool of computation, communication and storage resources, public clouds, like Amazon's EC2, are undoubtedly the most efficient platforms to meet the ever-growing needs of BDSP. Public cloud service providers usually operate a number of geo-distributed datacenters across the globe. Different datacenter pairs are with different inter-datacenter network costs charged by Internet Service Providers (ISPs). While, inter-datacenter traffic in BDSP constitutes a large portion of a cloud provider's traffic demand over the Internet and incurs substantial communication cost, which may even become the dominant operational expenditure factor. As the datacenter resources are provided in a virtualized way, the virtual machines (VMs) for stream processing tasks can be freely deployed onto any datacenters, provided that the Service Level Agreement (SLA, e.g., quality-of-information) is obeyed. This raises the opportunity, but also a challenge, to explore the inter-datacenter network cost diversities to optimize both VM placement and load balancing towards network cost minimization with guaranteed SLA. In this paper, we first propose a general modeling framework that describes all representative inter-task relationship semantics in BDSP. Based on our novel framework, we then formulate the communication cost minimization problem for BDSP into a mixed-integer linear programming (MILP) problem and prove it to be NP-hard. We then propose a computation-efficient solution based on MILP. The high efficiency of our proposal is validated by extensive simulation based studies.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In this work we introduce a new mathematical tool for optimization of routes, topology design, and energy efficiency in wireless sensor networks. We introduce a vector field formulation that models communication in the network, and routing is performed in the direction of this vector field at every location of the network. The magnitude of the vector field at every location represents the density of amount of data that is being transited through that location. We define the total communication cost in the network as the integral of a quadratic form of the vector field over the network area. With the above formulation, we introduce a mathematical machinery based on partial differential equations very similar to the Maxwell's equations in electrostatic theory. We show that in order to minimize the cost, the routes should be found based on the solution of these partial differential equations. In our formulation, the sensors are sources of information, and they are similar to the positive charges in electrostatics, the destinations are sinks of information and they are similar to negative charges, and the network is similar to a non-homogeneous dielectric media with variable dielectric constant (or permittivity coefficient). In one of the applications of our mathematical model based on the vector fields, we offer a scheme for energy efficient routing. Our routing scheme is based on changing the permittivity coefficient to a higher value in the places of the network where nodes have high residual energy, and setting it to a low value in the places of the network where the nodes do not have much energy left. Our simulations show that our method gives a significant increase in the network life compared to the shortest path and weighted shortest path schemes. Our initial focus is on the case where there is only one destination in the network, and later we extend our approach to the case where there are multiple destinations in the network. In the case of having multiple destinations, we need to partition the network into several areas known as regions of attraction of the destinations. Each destination is responsible for collecting all messages being generated in its region of attraction. The complexity of the optimization problem in this case is how to define regions of attraction for the destinations and how much communication load to assign to each destination to optimize the performance of the network. We use our vector field model to solve the optimization problem for this case. We define a vector field, which is conservative, and hence it can be written as the gradient of a scalar field (also known as a potential field). Then we show that in the optimal assignment of the communication load of the network to the destinations, the value of that potential field should be equal at the locations of all the destinations. Another application of our vector field model is to find the optimal locations of the destinations in the network. We show that the vector field gives the gradient of the cost function with respect to the locations of the destinations. Based on this fact, we suggest an algorithm to be applied during the design phase of a network to relocate the destinations for reducing the communication cost function. The performance of our proposed schemes is confirmed by several examples and simulation experiments. In another part of this work we focus on the notions of responsiveness and conformance of TCP traffic in communication networks. We introduce the notion of responsiveness for TCP aggregates and define it as the degree to which a TCP aggregate reduces its sending rate to the network as a response to packet drops. We define metrics that describe the responsiveness of TCP aggregates, and suggest two methods for determining the values of these quantities. The first method is based on a test in which we drop a few packets from the aggregate intentionally and measure the resulting rate decrease of that aggregate. This kind of test is not robust to multiple simultaneous tests performed at different routers. We make the test robust to multiple simultaneous tests by using ideas from the CDMA approach to multiple access channels in communication theory. Based on this approach, we introduce tests of responsiveness for aggregates, and call it CDMA based Aggregate Perturbation Method (CAPM). We use CAPM to perform congestion control. A distinguishing feature of our congestion control scheme is that it maintains a degree of fairness among different aggregates. In the next step we modify CAPM to offer methods for estimating the proportion of an aggregate of TCP traffic that does not conform to protocol specifications, and hence may belong to a DDoS attack. Our methods work by intentionally perturbing the aggregate by dropping a very small number of packets from it and observing the response of the aggregate. We offer two methods for conformance testing. In the first method, we apply the perturbation tests to SYN packets being sent at the start of the TCP 3-way handshake, and we use the fact that the rate of ACK packets being exchanged in the handshake should follow the rate of perturbations. In the second method, we apply the perturbation tests to the TCP data packets and use the fact that the rate of retransmitted data packets should follow the rate of perturbations. In both methods, we use signature based perturbations, which means packet drops are performed with a rate given by a function of time. We use analogy of our problem with multiple access communication to find signatures. Specifically, we assign orthogonal CDMA based signatures to different routers in a distributed implementation of our methods. As a result of orthogonality, the performance does not degrade because of cross interference made by simultaneously testing routers. We have shown efficacy of our methods through mathematical analysis and extensive simulation experiments.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Exascale systems are the next frontier in high-performance computing and are expected to deliver a performance of the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can be either found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploit the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in iterative parallel data mining algorithms. In particular, the analysis focuses on one of the most influential and popular data mining methods, the k-means algorithm for cluster analysis. The straightforward parallel formulation of the k-means algorithm requires a global reduction operation at each iteration step, which hinders its scalability. This work studies a different parallel formulation of the algorithm where the requirement of global communication can be relaxed while still providing the exact solution of the centralised k-means algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Casson and Wadeson (International Journal of the Economics of Business, 1998, 5, pp. 5-27) have modelled the dialogue, or conversation, which customers have with their suppliers in order to convey their requirements, while taking production implications into account. They showed that this has important implications for the positioning of the boundaries of the firm. Unfortunately, their model has the restriction that communication is only costly in the direction of customer to supplier. This paper extends their model by introducing two-way communication costs. It shows that the level of communication cost in the direction of supplier to customer is a key additional factor in determining the nature of the dialogue that takes place. It also shows that this has important additional implications for the positioning of the boundaries of the firm. Custom computer software development is used as an example of an application of the theory.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The evolution of commodity computing lead to the possibility of efficient usage of interconnected machines to solve computationally-intensive tasks, which were previously solvable only by using expensive supercomputers. This, however, required new methods for process scheduling and distribution, considering the network latency, communication cost, heterogeneous environments and distributed computing constraints. An efficient distribution of processes over such environments requires an adequate scheduling strategy, as the cost of inefficient process allocation is unacceptably high. Therefore, a knowledge and prediction of application behavior is essential to perform effective scheduling. In this paper, we overview the evolution of scheduling approaches, focusing on distributed environments. We also evaluate the current approaches for process behavior extraction and prediction, aiming at selecting an adequate technique for online prediction of application execution. Based on this evaluation, we propose a novel model for application behavior prediction, considering chaotic properties of such behavior and the automatic detection of critical execution points. The proposed model is applied and evaluated for process scheduling in cluster and grid computing environments. The obtained results demonstrate that prediction of the process behavior is essential for efficient scheduling in large-scale and heterogeneous distributed environments, outperforming conventional scheduling policies by a factor of 10, and even more in some cases. Furthermore, the proposed approach proves to be efficient for online predictions due to its low computational cost and good precision. (C) 2009 Elsevier B.V. All rights reserved.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Traditionally, the awarding of cash bonuses has been one of the primary tools utilized by organizational leaders to increase employee motivation. Recent research has indicated that cash awards may successfully motivate employees. The same research presents alternative, effective techniques that have been demonstrated to improve employee motivation and performance. Results of the 2010 Society for Human Resource Management survey highlight respondents' opinions regarding alternate employee motivators in the United States. The results strongly suggest that alternate cost-effective employee motivators may be as effective as cash rewards. The results of this Capstone will demonstrate that innovative, cost-effective methods can be used by organizations to retain employees. This paper will address specific areas of research including talent management, leadership, communication, and recognition.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Current technology permits connecting local networks via high-bandwidth telephone lines. Central coordinator nodes may use Intelligent Networks to manage data flow over dialed data lines, e.g. ISDN, and to establish connections between LANs. This dissertation focuses on cost minimization and on establishing operational policies for query distribution over heterogeneous, geographically distributed databases. Based on our study of query distribution strategies, public network tariff policies, and database interface standards we propose methods for communication cost estimation, strategies for the reduction of bandwidth allocation, and guidelines for central to node communication protocols. Our conclusion is that dialed data lines offer a cost effective alternative for the implementation of distributed database query systems, and that existing commercial software may be adapted to support query processing in heterogeneous distributed database systems. ^

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The problem of decentralized sequential detection is studied in this thesis, where local sensors are memoryless, receive independent observations, and no feedback from the fusion center. In addition to traditional criteria of detection delay and error probability, we introduce a new constraint: the number of communications between local sensors and the fusion center. This metric is able to reflect both the cost of establishing communication links as well as overall energy consumption over time. A new formulation for communication-efficient decentralized sequential detection is proposed where the overall detection delay is minimized with constraints on both error probabilities and the communication cost. Two types of problems are investigated based on the communication-efficient formulation: decentralized hypothesis testing and decentralized change detection. In the former case, an asymptotically person-by-person optimum detection framework is developed, where the fusion center performs a sequential probability ratio test based on dependent observations. The proposed algorithm utilizes not only reported statistics from local sensors, but also the reporting times. The asymptotically relative efficiency of proposed algorithm with respect to the centralized strategy is expressed in closed form. When the probabilities of false alarm and missed detection are close to one another, a reduced-complexity algorithm is proposed based on a Poisson arrival approximation. In addition, decentralized change detection with a communication cost constraint is also investigated. A person-by-person optimum change detection algorithm is proposed, where transmissions of sensing reports are modeled as a Poisson process. The optimum threshold value is obtained through dynamic programming. An alternative method with a simpler fusion rule is also proposed, where the threshold values in the algorithm are determined by a combination of sequential detection analysis and constrained optimization. In both decentralized hypothesis testing and change detection problems, tradeoffs in parameter choices are investigated through Monte Carlo simulations.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due the several advantages provided by the Cloud such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus, a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully. I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system, that utilizes workload-aware data placement and replication to minimize the number of distributed transactions that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data, and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has been traditionally used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides the data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, that provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud. The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks whose computation and execution models limit the user program to directly access the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the relevant portions of the graph that are of interest to an analysis task and loading it onto distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over largescale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Distributed Genetic Algorithms (DGAs) designed for the Internet have to take its high communication cost into consideration. For island model GAs, the migration topology has a major impact on DGA performance. This paper describes and evaluates an adaptive migration topology optimizer that keeps the communication load low while maintaining high solution quality. Experiments on benchmark problems show that the optimized topology outperforms static or random topologies of the same degree of connectivity. The applicability of the method on real-world problems is demonstrated on a hard optimization problem in VLSI design.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The ability of cloud computing to provide almost unlimited storage, backup and recovery, and quick deployment contributes to its widespread attention and implementation. Cloud computing has also become an attractive choice for mobile users as well. Due to limited features of mobile devices such as power scarcity and inability to cater computationintensive tasks, selected computation needs to be outsourced to the resourceful cloud servers. However, there are many challenges which need to be addressed in computation offloading for mobile cloud computing such as communication cost, connectivity maintenance and incurred latency. This paper presents taxonomy of the computation offloading approaches which aim to address the challenges. The taxonomy provides guidelines to identify research scopes in computation offloading for mobile cloud computing. We also outline directions and anticipated trends for future research.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A secure protocol for electronic, sealed-bid, single item auctions is presented. The protocol caters to both first and second price (Vickrey) auctions and provides full price flexibility. Both computational and communication cost are linear with the number of bidders and utilize only standard cryptographic primitives. The protocol strictly divides knowledge of the bidder's identity and their actual bids between, respectively, a registration authority and an auctioneer, who are assumed not to collude but may be separately corrupt. This assures strong bidder-anonymity, though only weak bid privacy. The protocol is structured in two phases, each involving only off-line communication. Registration, requiring the use of the public key infrastructure, is simultaneous with hash-sealed bid-commitment and generates a receipt to the bidder containing a pseudonym. This phase is followed by encrypted bid-submission. Both phases involve the registration authority acting as a communication conduit but the actual message size is quite small. It is argued that this structure guarantees non-repudiation by both the winner and the auctioneer. Second price correctness is enforced either by observing the absence of registration of the claimed second-price bid or, where registered but lower than the actual second price, is subject to cooperation by the second price bidder - presumably motivated through self-interest. The use of the registration authority in other contexts is also considered with a view to developing an architecture for efficient secure multiparty transactions

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The problem of intrusion detection and location identification in the presence of clutter is considered for a hexagonal sensor-node geometry. It is noted that in any practical application,for a given fixed intruder or clutter location, only a small number of neighboring sensor nodes will register a significant reading. Thus sensing may be regarded as a local phenomenon and performance is strongly dependent on the local geometry of the sensor nodes. We focus on the case when the sensor nodes form a hexagonal lattice. The optimality of the hexagonal lattice with respect to density of packing and covering and largeness of the kissing number suggest that this is the best possible arrangement from a sensor network viewpoint. The results presented here are clearly relevant when the particular sensing application permits a deterministic placement of sensors. The results also serve as a performance benchmark for the case of a random deployment of sensors. A novel feature of our analysis of the hexagonal sensor grid is a signal-space viewpoint which sheds light on achievable performance.Under this viewpoint, the problem of intruder detection is reduced to one of determining in a distributed manner, the optimal decision boundary that separates the signal spaces SI and SC associated to intruder and clutter respectively. Given the difficulty of implementing the optimal detector, we present a low-complexity distributive algorithm under which the surfaces SI and SC are separated by a wellchosen hyperplane. The algorithm is designed to be efficient in terms of communication cost by minimizing the expected number of bits transmitted by a sensor.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper considers cooperative spectrum sensing algorithms for Cognitive Radios which focus on reducing the number of samples to make a reliable detection. We propose algorithms based on decentralized sequential hypothesis testing in which the Cognitive Radios sequentially collect the observations, make local decisions and send them to the fusion center for further processing to make a final decision on spectrum usage. The reporting channel between the Cognitive Radios and the fusion center is assumed more realistically as a Multiple Access Channel (MAC) with receiver noise. Furthermore the communication for reporting is limited, thereby reducing the communication cost. We start with an algorithm where the fusion center uses an SPRT-like (Sequential Probability Ratio Test) procedure and theoretically analyze its performance. Asymptotically, its performance is close to the optimal centralized test without fusion center noise. We further modify this algorithm to improve its performance at practical operating points. Later we generalize these algorithms to handle uncertainties in SNR and fading. (C) 2014 Elsevier B.V. All rights reserved.