20 resultados para Multi-way cluster
em Indian Institute of Science - Bangalore - Índia
Resumo:
A system of many coupled oscillators on a network can show multicluster synchronization. We obtain existence conditions and stability bounds for such a multicluster synchronization. When the oscillators are identical, we obtain the interesting result that network structure alone can cause multicluster synchronization to emerge even when all the other parameters are the same. We also study occurrence of multicluster synchronization when two different types of oscillators are coupled.
Resumo:
As computational Grids are increasingly used for executing long running multi-phase parallel applications, it is important to develop efficient rescheduling frameworks that adapt application execution in response to resource and application dynamics. In this paper, three strategies or algorithms have been developed for deciding when and where to reschedule parallel applications that execute on multi-cluster Grids. The algorithms derive rescheduling plans that consist of potential points in application execution for rescheduling and schedules of resources for application execution between two consecutive rescheduling points. Using large number of simulations, it is shown that the rescheduling plans developed by the algorithms can lead to large decrease in application execution times when compared to executions without rescheduling on dynamic Grid resources. The rescheduling plans generated by the algorithms are also shown to be competitive when compared to the near-optimal plans generated by brute-force methods. Of the algorithms, genetic algorithm yielded the most efficient rescheduling plans with 9-12% smaller average execution times than the other algorithms.
Resumo:
We consider single-source, single-sink (ss-ss) multi-hop relay networks, with slow-fading Rayleigh links. This two part paper aims at giving explicit protocols and codes to achieve the optimal diversity-multiplexing tradeoff (DMT) of two classes of multi-hop networks: K-parallel-path (KPP) networks and Layered networks. While single-antenna KPP networks were the focus of the first part, we consider layered and multi-antenna networks in this second part. We prove that a linear DMT between the maximum diversity d(max). and the maximum multiplexing gain of 1 is achievable for single-antenna fully-connected layered networks under the half-duplex constraint. This is shown to be equal to the optimal DMT if the number of relaying layers is less than 4. For the multiple-antenna case, we provide an achievable DMT, which is significantly better than known lower bounds for half duplex networks. Along the way, we compute the DMT of parallel MIMO channels in terms of the DMT of the component channel. For arbitrary ss-ss single-antenna directed acyclic networks with full-duplex relays, we prove that a linear tradeoff between maximum diversity and maximum multiplexing gain is achievable using an amplify-and-forward (AF) protocol. Explicit short-block-length codes are provided for all the proposed protocols. Two key implications of the results in the two-part paper are that the half-duplex constraint does not necessarily entail rate loss by a factor of two as previously believed and that simple AN protocols are often sufficient to attain the best possible DMT.
Resumo:
We consider single-source, single-sink multi-hop relay networks, with slow-fading Rayleigh fading links and single-antenna relay nodes operating under the half-duplex constraint. While two hop relay networks have been studied in great detail in terms of the diversity-multiplexing tradeoff (DMT), few results are available for more general networks. In this two-part paper, we identify two families of networks that are multi-hop generalizations of the two hop network: K-Parallel-Path (KPP) networks and Layered networks. In the first part, we initially consider KPP networks, which can be viewed as the union of K node-disjoint parallel paths, each of length > 1. The results are then generalized to KPP(I) networks, which permit interference between paths and to KPP(D) networks, which possess a direct link from source to sink. We characterize the optimal DMT of KPP(D) networks with K >= 4, and KPP(I) networks with K >= 3. Along the way, we derive lower bounds for the DMT of triangular channel matrices, which are useful in DMT computation of various protocols. As a special case, the DMT of two-hop relay network without direct link is obtained. Two key implications of the results in the two-part paper are that the half-duplex constraint does not necessarily entail rate loss by a factor of two, as previously believed and that, simple AF protocols are often sufficient to attain the best possible DMT.
Resumo:
In this paper we analyze a deploy and search strategy for multi-agent systems. Mobile agents equipped with sensors carry out search operation in the search space. The lack of information about the search space is modeled as an uncertainty density distribution over the space, and is assumed to be known to the agents a priori. In each step, the agents deploy themselves in an optimal way so as to maximize per step reduction in the uncertainty density. We analyze the proposed strategy for convergence and spatial distributedness. The control law moving the agents has been analyzed for stability and convergence using LaSalle's invariance principle, and for spatial distributedness under a few realistic constraints on the control input such as constant speed, limit on maximum speed, and also sensor range limits. The simulation experiments show that the strategy successfully reduces the average uncertainty density below the required level.
Resumo:
Segmental dynamic time warping (DTW) has been demonstrated to be a useful technique for finding acoustic similarity scores between segments of two speech utterances. Due to its high computational requirements, it had to be computed in an offline manner, limiting the applications of the technique. In this paper, we present results of parallelization of this task by distributing the workload in either a static or dynamic way on an 8-processor cluster and discuss the trade-offs among different distribution schemes. We show that online unsupervised pattern discovery using segmental DTW is plausible with as low as 8 processors. This brings the task within reach of today's general purpose multi-core servers. We also show results on a 32-processor system, and discuss factors affecting scalability of our methods.
Resumo:
We consider single-source single-sink (ss-ss) multi-hop relay networks, with slow-fading links and single-antenna half-duplex relay nodes. While two-hop cooperative relay networks have been studied in great detail in terms of the diversity-multiplexing tradeoff (DMT), few results are available for more general networks. In this paper, we identify two families of networks that are multi-hop generalizations of the two-hop network: K-Parallel-Path (KPP)networks and layered networks.KPP networks, can be viewed as the union of K node-disjoint parallel relaying paths, each of length greater than one. KPP networks are then generalized to KPP(I) networks, which permit interference between paths and to KPP(D) networks, which possess a direct link from source to sink. We characterize the DMT of these families of networks completely for K > 3. Layered networks are networks comprising of layers of relays with edges existing only between adjacent layers, with more than one relay in each layer. We prove that a linear DMT between the maximum diversity dmax and the maximum multiplexing gain of 1 is achievable for single-antenna fully-connected layered networks. This is shown to be equal to the optimal DMT if the number of relaying layers is less than 4.For multiple-antenna KPP and layered networks, we provide an achievable DMT, which is significantly better than known lower bounds for half duplex networks.For arbitrary multi-terminal wireless networks with multiple source-sink pairs, the maximum achievable diversity is shown to be equal to the min-cut between the corresponding source and the sink, irrespective of whether the network has half-duplex or full-duplex relays. For arbitrary ss-ss single-antenna directed acyclic networks with full-duplex relays, we prove that a linear tradeoff between maximum diversity and maximum multiplexing gain is achievable.Along the way, we derive the optimal DMT of a generalized parallel channel and derive lower bounds for the DMT of triangular channel matrices, which are useful in DMT computation of various protocols. We also give alternative and often simpler proofs of several existing results and show that codes achieving full diversity on a MIMO Rayleigh fading channel achieve full diversity on arbitrary fading channels. All protocols in this paper are explicit and use only amplify-and-forward (AF) relaying. We also construct codes with short block-lengths based on cyclic division algebras that achieve the optimal DMT for all the proposed schemes.Two key implications of the results in the paper are that the half-duplex constraint does not entail any rate loss for a large class of cooperative networks and that simple AF protocols are often sufficient to attain the optimal DMT
Resumo:
The lifetime calculation of large dense sensor networks with fixed energy resources and the remaining residual energy have shown that for a constant energy resource in a sensor network the fault rate at the cluster head is network size invariant when using the network layer with no MAC losses.Even after increasing the battery capacities in the nodes the total lifetime does not increase after a max limit of 8 times. As this is a serious limitation lots of research has been done at the MAC layer which allows to adapt to the specific connectivity, traffic and channel polling needs for sensor networks. There have been lots of MAC protocols which allow to control the channel polling of new radios which are available to sensor nodes to communicate. This further reduces the communication overhead by idling and sleep scheduling thus extending the lifetime of the monitoring application. We address the two issues which effects the distributed characteristics and performance of connected MAC nodes. (1) To determine the theoretical minimum rate based on joint coding for a correlated data source at the singlehop, (2a) to estimate cluster head errors using Bayesian rule for routing using persistence clustering when node densities are the same and stored using prior probability at the network layer, (2b) to estimate the upper bound of routing errors when using passive clustering were the node densities at the multi-hop MACS are unknown and not stored at the multi-hop nodes a priori. In this paper we evaluate many MAC based sensor network protocols and study the effects on sensor network lifetime. A renewable energy MAC routing protocol is designed when the probabilities of active nodes are not known a priori. From theoretical derivations we show that for a Bayesian rule with known class densities of omega1, omega2 with expected error P* is bounded by max error rate of P=2P* for single-hop. We study the effects of energy losses using cross-layer simulation of - large sensor network MACS setup, the error rate which effect finding sufficient node densities to have reliable multi-hop communications due to unknown node densities. The simulation results show that even though the lifetime is comparable the expected Bayesian posterior probability error bound is close or higher than Pges2P*.
Resumo:
Today's feature-rich multimedia products require embedded system solution with complex System-on-Chip (SoC) to meet market expectations of high performance at a low cost and lower energy consumption. The memory architecture of the embedded system strongly influences these parameters. Hence the embedded system designer performs a complete memory architecture exploration. This problem is a multi-objective optimization problem and can be tackled as a two-level optimization problem. The outer level explores various memory architecture while the inner level explores placement of data sections (data layout problem) to minimize memory stalls. Further, the designer would be interested in multiple optimal design points to address various market segments. However, tight time-to-market constraints enforces short design cycle time. In this paper we address the multi-level multi-objective memory architecture exploration problem through a combination of Multi-objective Genetic Algorithm (Memory Architecture exploration) and an efficient heuristic data placement algorithm. At the outer level the memory architecture exploration is done by picking memory modules directly from a ASIC memory Library. This helps in performing the memory architecture exploration in a integrated framework, where the memory allocation, memory exploration and data layout works in a tightly coupled way to yield optimal design points with respect to area, power and performance. We experimented our approach for 3 embedded applications and our approach explores several thousand memory architecture for each application, yielding a few hundred optimal design points in a few hours of computation time on a standard desktop.
Resumo:
This paper addresses the problem of multiagent search in an unknown environment. The agents are autonomous in nature and are equipped with necessary sensors to carry out the search operation. The uncertainty, or lack of information about the search area is known a priori as a probability density function. The agents are deployed in an optimal way so as to maximize the one step uncertainty reduction. The agents continue to deploy themselves and reduce uncertainty till the uncertainty density is reduced over the search space below a minimum acceptable level. It has been shown, using LaSalle’s invariance principle, that a distributed control law which moves each of the agents towards the centroid of its Voronoi partition, modified by the sensor range leads to single step optimal deployment. This principle is now used to devise search trajectories for the agents. The simulations were carried out in 2D space with saturation on speeds of the agents. The results show that the control strategy per step indeed moves the agents to the respective centroid and the algorithm reduces the uncertainty distribution to the required level within a few steps.
Resumo:
Computational grids with multiple batch systems (batch grids) can be powerful infrastructures for executing long-running multi-component parallel applications. In this paper, we evaluate the potential improvements in throughput of long-running multi-component applications when the different components of the applications are executed on multiple batch systems of batch grids. We compare the multiple batch executions with executions of the components on a single batch system without increasing the number of processors used for executions. We perform our analysis with a foremost long-running multi-component application for climate modeling, the Community Climate System Model (CCSM). We have built a robust simulator that models the characteristics of both the multi-component application and the batch systems. By conducting large number of simulations with different workload characteristics and queuing policies of the systems, processor allocations to components of the application, distributions of the components to the batch systems and inter-cluster bandwidths, we show that multiple batch executions lead to 55% average increase in throughput over single batch executions for long-running CCSM. We also conducted real experiments with a practical middleware infrastructure and showed that multi-site executions lead to effective utilization of batch systems for executions of CCSM and give higher simulation throughput than single-site executions. Copyright (c) 2011 John Wiley & Sons, Ltd.
Resumo:
This paper presents a decentralized/peer-to-peer architecture-based parallel version of the vector evaluated particle swarm optimization (VEPSO) algorithm for multi-objective design optimization of laminated composite plates using message passing interface (MPI). The design optimization of laminated composite plates being a combinatorially explosive constrained non-linear optimization problem (CNOP), with many design variables and a vast solution space, warrants the use of non-parametric and heuristic optimization algorithms like PSO. Optimization requires minimizing both the weight and cost of these composite plates, simultaneously, which renders the problem multi-objective. Hence VEPSO, a multi-objective variant of the PSO algorithm, is used. Despite the use of such a heuristic, the application problem, being computationally intensive, suffers from long execution times due to sequential computation. Hence, a parallel version of the PSO algorithm for the problem has been developed to run on several nodes of an IBM P720 cluster. The proposed parallel algorithm, using MPI's collective communication directives, establishes a peer-to-peer relationship between the constituent parallel processes, deviating from the more common master-slave approach, in achieving reduction of computation time by factor of up to 10. Finally we show the effectiveness of the proposed parallel algorithm by comparing it with a serial implementation of VEPSO and a parallel implementation of the vector evaluated genetic algorithm (VEGA) for the same design problem. (c) 2012 Elsevier Ltd. All rights reserved.
Resumo:
In this paper, we focus on increasing the throughput and diversity of network coded MIMO transmissions in bidirectional multi-pair wireless relay networks. All nodes have multi-antenna capability. Pairs of nodes want to exchange messages via a relay having multi-antenna and encoding/decoding capability. Nodes transmit their messages to the relay in the first (MAC) phase. The relay decodes all the messages and XORs them and broadcasts the XORed message in the second (BC) phase. We develop a generalized framework for bidirectional multi-pair multi-antenna wireless network coding, which models different MIMO transmission schemes including spatial multiplexing (V-BLAST), orthogonal STBC (OSTBC), and non-orthogonal STBC (NO-STBC) in a unified way. Enhanced throughputs are achieved by allowing all nodes to simultaneously transmit at their full rate. High diversity orders are achieved through the use of NO-STBCs, characterized by full rate and full transmit diversity. We evaluate and compare the performance of VBLAST, OSTBC, and NO-STBC schemes in one-dimensional 1-pair linear network (one pair of nodes and a relay) and two-dimensional 2-pair `cross' network (two pairs of nodes and a relay).
Resumo:
Rapid advancements in multi-core processor architectures coupled with low-cost, low-latency, high-bandwidth interconnects have made clusters of multi-core machines a common computing resource. Unfortunately, writing good parallel programs that efficiently utilize all the resources in such a cluster is still a major challenge. Various programming languages have been proposed as a solution to this problem, but are yet to be adopted widely to run performance-critical code mainly due to the relatively immature software framework and the effort involved in re-writing existing code in the new language. In this paper, we motivate and describe our initial study in exploring CUDA as a programming language for a cluster of multi-cores. We develop CUDA-For-Clusters (CFC), a framework that transparently orchestrates execution of CUDA kernels on a cluster of multi-core machines. The well-structured nature of a CUDA kernel, the growing popularity, support and stability of the CUDA software stack collectively make CUDA a good candidate to be considered as a programming language for a cluster. CFC uses a mixture of source-to-source compiler transformations, a work distribution runtime and a light-weight software distributed shared memory to manage parallel executions. Initial results on running several standard CUDA benchmark programs achieve impressive speedups of up to 7.5X on a cluster with 8 nodes, thereby opening up an interesting direction of research for further investigation.
Resumo:
This study investigates the application of support vector clustering (SVC) for the direct identification of coherent synchronous generators in large interconnected multi-machine power systems. The clustering is based on coherency measure, which indicates the degree of coherency between any pair of generators. The proposed SVC algorithm processes the coherency measure matrix that is formulated using the generator rotor measurements to cluster the coherent generators. The proposed approach is demonstrated on IEEE 10 generator 39-bus system and an equivalent 35 generators, 246-bus system of practical Indian southern grid. The effect of number of data samples and fault locations are also examined for determining the accuracy of the proposed approach. An extended comparison with other clustering techniques is also included, to show the effectiveness of the proposed approach in grouping the data into coherent groups of generators. This effectiveness of the coherent clusters obtained with the proposed approach is compared in terms of a set of clustering validity indicators and in terms of statistical assessment that is based on the coherency degree of a generator pair.