62 resultados para Distributed computer systems
Resumo:
Concurrency control (CC) algorithms are important in distributed database systems to ensure consistency of the database. A number of such algorithms are available in the literature. The issue of performance evaluation of these algorithms has been recognized to be important. However, only a few studies have been carried out towards this. This paper deals with the performance evaluation of a CC algorithm proposed by Rosenkrantz et al. through a detailed simulation study. In doing so, the algorithm has been modified so that it can, within itself, take care of the redundancy in the database. The influences of various system parameters and the transaction profile on the response time and on the degree of conflict are considered. The entire study has been carried out using the programming language SIMULA on a DEC-1090 system.
Resumo:
Programming for parallel architectures that do not have a shared address space is extremely difficult due to the need for explicit communication between memories of different compute devices. A heterogeneous system with CPUs and multiple GPUs, or a distributed-memory cluster are examples of such systems. Past works that try to automate data movement for distributed-memory architectures can lead to excessive redundant communication. In this paper, we propose an automatic data movement scheme that minimizes the volume of communication between compute devices in heterogeneous and distributed-memory systems. We show that by partitioning data dependences in a particular non-trivial way, one can generate data movement code that results in the minimum volume for a vast majority of cases. The techniques are applicable to any sequence of affine loop nests and works on top of any choice of loop transformations, parallelization, and computation placement. The data movement code generated minimizes the volume of communication for a particular configuration of these. We use a combination of powerful static analyses relying on the polyhedral compiler framework and lightweight runtime routines they generate, to build a source-to-source transformation tool that automatically generates communication code. We demonstrate that the tool is scalable and leads to substantial gains in efficiency. On a heterogeneous system, the communication volume is reduced by a factor of 11X to 83X over state-of-the-art, translating into a mean execution time speedup of 1.53X. On a distributed-memory cluster, our scheme reduces the communication volume by a factor of 1.4X to 63.5X over state-of-the-art, resulting in a mean speedup of 1.55X. In addition, our scheme yields a mean speedup of 2.19X over hand-optimized UPC codes.
Resumo:
Coarse Grained Reconfigurable Architectures (CGRA) are emerging as embedded application processing units in computing platforms for Exascale computing. Such CGRAs are distributed memory multi- core compute elements on a chip that communicate over a Network-on-chip (NoC). Numerical Linear Algebra (NLA) kernels are key to several high performance computing applications. In this paper we propose a systematic methodology to obtain the specification of Compute Elements (CE) for such CGRAs. We analyze block Matrix Multiplication and block LU Decomposition algorithms in the context of a CGRA, and obtain theoretical bounds on communication requirements, and memory sizes for a CE. Support for high performance custom computations common to NLA kernels are met through custom function units (CFUs) in the CEs. We present results to justify the merits of such CFUs.
Resumo:
Multi-access techniques are widely used in computer networking and distributed multiprocessor systems. On-the-fly arbitration schemes permit one of the many contenders to access the medium without collisions. Serial arbitration is cost effective but is slow and hence unsuitable for high-speed multiprocessor environments supporting very high data transfer rates. A fully parallel arbitration scheme takes less time but is not practically realisable for large numbers of contenders. In this paper, a generalised parallel-serial scheme is proposed which significantly reduces the arbitration time and is practically realisable.
Resumo:
In this work, we evaluate the benefits of using Grids with multiple batch systems to improve the performance of multi-component and parameter sweep parallel applications by reduction in queue waiting times. Using different job traces of different loads, job distributions and queue waiting times corresponding to three different queuing policies(FCFS, conservative and EASY backfilling), we conducted a large number of experiments using simulators of two important classes of applications. The first simulator models Community Climate System Model (CCSM), a prominent multi-component application and the second simulator models parameter sweep applications. We compare the performance of the applications when executed on multiple batch systems and on a single batch system for different system and application configurations. We show that there are a large number of configurations for which application execution using multiple batch systems can give improved performance over execution on a single system.
Resumo:
The problem of scheduling divisible loads in distributed computing systems, in presence of processor release time is considered. The objective is to find the optimal sequence of load distribution and the optimal load fractions assigned to each processor in the system such that the processing time of the entire processing load is a minimum. This is a difficult combinatorial optimization problem and hence genetic algorithms approach is presented for its solution.
Resumo:
Erasure coding techniques are used to increase the reliability of distributed storage systems while minimizing storage overhead. Also of interest is minimization of the bandwidth required to repair the system following a node failure. In a recent paper, Wu et al. characterize the tradeoff between the repair bandwidth and the amount of data stored per node. They also prove the existence of regenerating codes that achieve this tradeoff. In this paper, we introduce Exact Regenerating Codes, which are regenerating codes possessing the additional property of being able to duplicate the data stored at a failed node. Such codes require low processing and communication overheads, making the system practical and easy to maintain. Explicit construction of exact regenerating codes is provided for the minimum bandwidth point on the storage-repair bandwidth tradeoff, relevant to distributed-mail-server applications. A sub-space based approach is provided and shown to yield necessary and sufficient conditions on a linear code to possess the exact regeneration property as well as prove the uniqueness of our construction. Also included in the paper, is an explicit construction of regenerating codes for the minimum storage point for parameters relevant to storage in peer-to-peer systems. This construction supports a variable number of nodes and can handle multiple, simultaneous node failures. All constructions given in the paper are of low complexity, requiring low field size in particular.
Resumo:
In this paper we report on the outcomes of a research and demonstration project on human intrusion detection in a large secure space using an ad hoc wireless sensor network. This project has been a unique experience in collaborative research, involving ten investigators (with expertise in areas such as sensors, circuits, computer systems,communication and networking, signal processing and security) to execute a large funded project that spanned three to four years. In this paper we report on the specific engineering solution that was developed: the various architectural choices and the associated specific designs. In addition to developing a demonstrable system, the various problems that arose have given rise to a large amount of basic research in areas such as geographical packet routing, distributed statistical detection, sensors and associated circuits, a low power adaptive micro-radio, and power optimising embedded systems software. We provide an overview of the research results obtained.
Resumo:
Clock synchronization is an extremely important requirement of wireless sensor networks(WSNs). There are many application scenarios such as weather monitoring and forecasting etc. where external clock synchronization may be required because WSN itself may consists of components which are not connected to each other. A usual approach for external clock synchronization in WSNs is to synchronize the clock of a reference node with an external source such as UTC, and the remaining nodes synchronize with the reference node using an internal clock synchronization protocol. In order to provide highly accurate time, both the offset and the drift rate of each clock with respect to reference node are estimated from time to time, and these are used for getting correct time from local clock reading. A problem with this approach is that it is difficult to estimate the offset of a clock with respect to the reference node when drift rate of clocks varies over a period of time. In this paper, we first propose a novel internal clock synchronization protocol based on weighted averaging technique, which synchronizes all the clocks of a WSN to a reference node periodically. We call this protocol weighted average based internal clock synchronization(WICS) protocol. Based on this protocol, we then propose our weighted average based external clock synchronization(WECS) protocol. We have analyzed the proposed protocols for maximum synchronization error and shown that it is always upper bounded. Extensive simulation studies of the proposed protocols have been carried out using Castalia simulator. Simulation results validate our theoretical claim that the maximum synchronization error is always upper bounded and also show that the proposed protocols perform better in comparison to other protocols in terms of synchronization accuracy. A prototype implementation of the proposed internal clock synchronization protocol using a few TelosB motes also validates our claim.
Resumo:
Regenerating codes are a class of codes proposed for providing reliability of data and efficient repair of failed nodes in distributed storage systems. In this paper, we address the fundamental problem of handling errors and erasures at the nodes or links, during the data-reconstruction and node-repair operations. We provide explicit regenerating codes that are resilient to errors and erasures, and show that these codes are optimal with respect to storage and bandwidth requirements. As a special case, we also establish the capacity of a class of distributed storage systems in the presence of malicious adversaries. While our code constructions are based on previously constructed Product-Matrix codes, we also provide necessary and sufficient conditions for introducing resilience in any regenerating code.
Resumo:
Euler–Bernoulli beams are distributed parameter systems that are governed by a non-linear partial differential equation (PDE) of motion. This paper presents a vibration control approach for such beams that directly utilizes the non-linear PDE of motion, and hence, it is free from approximation errors (such as model reduction, linearization etc.). Two state feedback controllers are presented based on a newly developed optimal dynamic inversion technique which leads to closed-form solutions for the control variable. In one formulation a continuous controller structure is assumed in the spatial domain, whereas in the other approach it is assumed that the control force is applied through a finite number of discrete actuators located at predefined discrete locations in the spatial domain. An implicit finite difference technique with unconditional stability has been used to solve the PDE with control actions. Numerical simulation studies show that the beam vibration can effectively be decreased using either of the two formulations.
Resumo:
Flexible constraint length channel decoders are required for software defined radios. This paper presents a novel scalable scheme for realizing flexible constraint length Viterbi decoders on a de Bruijn interconnection network. Architectures for flexible decoders using the flattened butterfly and shuffle-exchange networks are also described. It is shown that these networks provide favourable substrates for realizing flexible convolutional decoders. Synthesis results for the three networks are provided and a comparison is performed. An architecture based on a 2D-mesh, which is a topology having a nominally lesser silicon area requirement, is also considered as a fourth point for comparison. It is found that of all the networks considered, the de Bruijn network offers the best tradeoff in terms of area versus throughput.
Resumo:
An approximate dynamic programming (ADP) based neurocontroller is developed for a heat transfer application. Heat transfer problem for a fin in a car's electronic module is modeled as a nonlinear distributed parameter (infinite-dimensional) system by taking into account heat loss and generation due to conduction, convection and radiation. A low-order, finite-dimensional lumped parameter model for this problem is obtained by using Galerkin projection and basis functions designed through the 'Proper Orthogonal Decomposition' technique (POD) and the 'snap-shot' solutions. A suboptimal neurocontroller is obtained with a single-network-adaptive-critic (SNAC). Further contribution of this paper is to develop an online robust controller to account for unmodeled dynamics and parametric uncertainties. A weight update rule is presented that guarantees boundedness of the weights and eliminates the need for persistence of excitation (PE) condition to be satisfied. Since, the ADP and neural network based controllers are of fairly general structure, they appear to have the potential to be controller synthesis tools for nonlinear distributed parameter systems especially where it is difficult to obtain an accurate model.
Resumo:
Most of the structural elements like beams, cables etc. are flexible and should be modeled as distributed parameter systems (DPS) to represent the reality better. For large structures, the usual approach of 'modal representation' is not an accurate representation. Moreover, for excessive vibrations (possibly due to strong wind, earthquake etc.), external power source (controller) is needed to suppress it, as the natural damping of these structures is usually small. In this paper, we propose to use a recently developed optinial dynamic inversion technique to design a set of discrete controllers for this purpose. We assume that the control force to the structure is applied through finite number of actuators, which are located at predefined locations in the spatial domain. The method used in this paper determines control forces directly from the partial differential equation (PDE) model of the system. The formulation has better practical significance, both because it leads to a closed form solution of the controller (hence avoids computational issues) as well as because a set of discrete actuators along the spatial domain can be implemented with relative ease (as compared to a continuous actuator).
Resumo:
Massively parallel SIMD computing is applied to obtain an order of magnitude improvement in the executional speed of an important algorithm in VLSI design automation. The physical design of a VLSI circuit involves logic module placement as a subtask. The paper is concerned with accelerating the well known Min-cut placement technique for logic cell placement. The inherent parallelism of the Min-cut algorithm is identified, and it is shown that a parallel machine based on the efficient execution of the placement procedure.