990 results for Interconnection Network
Abstract:
This article describes the simulation and analysis of a collisionless optical interconnection network whose objective is to achieve a high performance level based on a single control protocol. The optical coupler has one shared control channel and N communication channels. Each network node has two communication modules: one for packet transmission/reception and another for control channel access. We show by simulation that the system achieves high performance and ensures high scalability.
Abstract:
Multi-Processor SoC (MPSoC) design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With the number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how best to provide on-chip communication resources is clearly felt. Scaling down of process technologies has increased process and dynamic variations as well as transistor wearout. Because of this, delay variations increase and impact the performance of MPSoCs. The interconnect architecture in MPSoCs becomes a single point of failure, as it connects all other components of the system together. A faulty processing element may be shut down entirely, but the interconnect architecture must be able to tolerate partial failure and variations and keep operating, at the cost of some performance, power or latency overhead. This dissertation focuses on techniques at different levels of abstraction to cope with the reliability and variability issues in on-chip interconnection networks. By showing the test results of a GALS NoC test chip, this dissertation motivates the need for techniques to detect and work around manufacturing faults and process variations in the interconnection infrastructure of MPSoCs. As a physical design technique, we propose the bundle routing framework as an effective way to route the global links of Networks-on-Chip. For architecture-level design, two cases are addressed: (i) intra-cluster communication, where we propose a low-latency interconnect with variability robustness; and (ii) inter-cluster communication, where online functional testing with a reliable NoC configuration is proposed. We also propose dualVdd as an orthogonal way of compensating for variability. It is an alternative strategy to the design techniques, since it enforces the compensation at the post-silicon stage.
Abstract:
Current nanometer technologies are subject to several adverse effects that seriously impact the yield and performance of integrated circuits, such as within-die parameter uncertainties, varying workload conditions, aging and temperature. Monitoring, calibration and dynamic adaptation have appeared as promising solutions to these issues, and many kinds of monitors have been presented recently. In this scenario, where systems with hundreds of monitors of different types have been proposed, the need for light-weight monitoring networks has become essential. In this work we present a light-weight network architecture based on sharing digitization resources among nodes that require a time-to-digital conversion. Our proposal employs a single-wire interface, shared among all the nodes in the network, and quantizes the time domain to perform the access multiplexing and transmit the information. It yields a 16% improvement in area and power consumption compared to traditional approaches.
Abstract:
Current parallel applications running on clusters require an interconnection network to perform communications among all available computing nodes. Imbalanced communication can produce network congestion, which reduces throughput, increases latency and degrades overall system performance. On the other hand, parallel applications running on these networks possess representative stages that allow their characterization, as well as repetitive behavior that can be identified on the basis of this characterization. This work presents Predictive and Distributed Routing Balancing (PR-DRB), a new method developed to gradually control network congestion, based on path expansion, traffic distribution and effective traffic load, in order to maintain low latency values. PR-DRB monitors message latencies on intermediate routers, makes decisions about alternative paths and records the communication pattern information encountered during congestion situations. Based on the concept of application repetitiveness, the best recorded solutions are reapplied when a saved communication pattern reappears. Traffic congestion experiments were conducted in order to evaluate the performance of the method, and improvements were observed.
Abstract:
The KCube interconnection network was first introduced in 2010 in order to exploit the good characteristics of two well-known interconnection networks, the hypercube and the Kautz graph. KCube links up multiple processors in a communication network with high density for a fixed degree. Since the KCube network is newly proposed, much study is required to demonstrate its potential properties and the algorithms that can be designed on it to solve parallel computation problems. In this thesis we introduce a new methodology to construct the KCube graph. Using this new approach, we prove its Hamiltonicity in the general KC(m; k). Moreover, we find its connectivity, followed by an optimal broadcasting scheme in which a source node containing a message is to communicate it to all other processors. In addition to KCube networks, we have studied a version of the routing problem in the traditional hypercube, investigating whether there exists a shortest path in Q_n between the two nodes 0^n and 1^n when the network is experiencing failed components. We first discuss this problem under a constraint on the number of faulty nodes, and subsequently introduce an algorithm to tackle the problem without restrictions on the number of faulty nodes.
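The hypercube routing question in this abstract can be illustrated with a brute-force check. The sketch below is not the thesis's algorithm. It only relies on the fact that a shortest path from 0^n to 1^n in the n-cube sets each of the n bits exactly once, so it explores such monotone paths level by level while skipping faulty nodes. Nodes are assumed to be encoded as integers whose bits are the cube coordinates, and the function name is hypothetical.

```python
def has_fault_free_shortest_path(n, faulty):
    """Return True if a path of length n from 0^n to 1^n avoids all faulty nodes.

    Assumption: nodes of the n-cube are the integers 0 .. 2**n - 1, and two
    nodes are adjacent when they differ in exactly one bit.
    """
    faulty = set(faulty)
    source, target = 0, (1 << n) - 1
    if source in faulty or target in faulty:
        return False
    # frontier holds every fault-free node reachable by setting i bits in i steps
    frontier = {source}
    for _ in range(n):
        frontier = {
            node | (1 << bit)
            for node in frontier
            for bit in range(n)
            if not node & (1 << bit) and (node | (1 << bit)) not in faulty
        }
        if not frontier:
            return False
    # after n steps the only node with n bits set is the target itself
    return target in frontier
```

For example, in the 3-cube with faulty nodes {1, 2} the path 0 -> 4 -> 5 -> 7 survives, whereas removing all three neighbours {1, 2, 4} of the source leaves no path. The frontier can grow exponentially in n, so this is only meant for small illustrative cases.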
Abstract:
This work presents the design, simulation and analysis of two optical interconnection networks for a Dataflow parallel computer architecture. To verify the performance of the optical interconnection networks on the Dataflow architecture, we analyzed the load balancing among the processors during the execution of parallel programs. Load balancing is a very important parameter because it is directly associated with the degree of dataflow parallelism. This article demonstrates that optical interconnection networks designed with simple optical devices can efficiently meet the dataflow requirements of a high-performance communication system.
Abstract:
We present a family of networks whose local interconnection topologies are generated by the root vectors of a semi-simple complex Lie algebra. The Cartan classification theorem for these algebras ensures that these families of interconnection topologies are exhaustive. The global arrangement of the network is defined in terms of integer or half-integer weight lattices. The mesh or torus topologies that network millions of processing cores, such as those in the IBM BlueGene series, are the simplest members of that category. The symmetries of the root systems of an algebra, manifested by their Weyl group, lend great convenience to the design and analysis of hardware architectures, algorithms and programs.
Abstract:
Single-processor architectures are unable to provide the performance required by high-performance embedded systems. Parallel processing based on general-purpose processors can achieve this performance, but with a considerable increase in required resources. However, in many cases, simplified, optimized parallel cores can be used instead of general-purpose processors, achieving better performance at lower resource utilization. In this paper, we propose a configurable many-core architecture to serve as a co-processor for high-performance embedded computing on Field-Programmable Gate Arrays. The architecture consists of an array of configurable simple cores with support for floating-point operations, interconnected by a configurable interconnection network. For each core it is possible to configure the size of the internal memory, the supported operations and the number of interfacing ports. The architecture was tested on a ZYNQ-7020 FPGA in the execution of several parallel algorithms. The results show that the proposed many-core architecture achieves better performance than a parallel general-purpose processor and that up to 32 floating-point cores can be implemented in a ZYNQ-7020 SoC FPGA.
Abstract:
High-performance computing is a rapidly evolving area of computer science in which new computers reaching the petaflops range are currently appearing. The work begins by studying the different types of interconnection networks and the network models used to measure their latency. The objective of this work is the design, implementation and simulation of a link-based interconnection network model that takes into account the topology and routing information of the interconnection network. Given that models are an abstraction of the system, this work verifies and validates the model to ensure that it approximates what was laid out in the design and that it resembles the system to be modeled.
Abstract:
The (n, k)-star interconnection network was proposed in 1995 as an attractive alternative to the n-star topology in parallel computation. The (n, k)-star has significant advantages over the n-star, which itself was proposed as an attractive alternative to the popular hypercube. The major advantage of the (n, k)-star network is its scalability, which makes it more flexible than the n-star as an interconnection network. In this thesis, we focus on finding graph-theoretical properties of the (n, k)-star as well as developing parallel algorithms that run on this network. The basic topological properties of the (n, k)-star are studied first. These are useful since they can be used to develop efficient algorithms on this network. We then study the (n, k)-star network from an algorithmic point of view. Specifically, we investigate both fundamental and application algorithms for basic communication, prefix computation, sorting, etc. A literature review of the state of the art in relation to the (n, k)-star network, as well as some open problems in this area, are also provided.
Abstract:
The hyper-star interconnection network was proposed in 2002 to overcome the drawbacks of the hypercube and its variations concerning the network cost, which is defined as the product of the degree and the diameter. Some properties of the graph, such as connectivity, symmetry properties and embedding properties, have been studied by other researchers; routing and broadcasting algorithms have also been designed. This thesis studies the hyper-star graph from both the topological and the algorithmic points of view. For the topological properties, we try to establish relationships between hyper-star graphs and other known graphs. We also give a formal equation for the surface area of the graph. Another topological property we are interested in is the Hamiltonicity of this graph. For the algorithms, we design an all-port broadcasting algorithm and a single-port neighbourhood broadcasting algorithm for the regular form of the hyper-star graphs. These algorithms are both optimal time-wise. Furthermore, we prove that the folded hyper-star, a variation of the hyper-star, is maximally fault-tolerant.
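The network cost mentioned in this abstract, degree times diameter, is straightforward to compute for any small graph. The following is a minimal sketch that assumes a connected, undirected graph given as an adjacency dictionary. It uses the hypercube as the example, since the abstract does not define the hyper-star construction. The helper names `network_cost` and `hypercube` are illustrative only.

```python
from collections import deque


def network_cost(adjacency):
    """Return (degree, diameter, cost) for a connected, undirected graph.

    `adjacency` maps each node to a list of its neighbours (an assumption of
    this sketch). Degree is the maximum node degree; diameter is the largest
    BFS distance between any pair of nodes.
    """
    degree = max(len(neighbours) for neighbours in adjacency.values())
    diameter = 0
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        diameter = max(diameter, max(dist.values()))
    return degree, diameter, degree * diameter


def hypercube(n):
    """Adjacency of the n-cube: nodes are integers, edges flip one bit."""
    return {v: [v ^ (1 << b) for b in range(n)] for v in range(2 ** n)}
```

Calling `network_cost(hypercube(4))` gives degree 4, diameter 4 and cost 16. The n-cube always has cost n * n, which is the figure that lower-cost topologies are measured against.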
Abstract:
In evaluating an interconnection network, it is indispensable to estimate the size of the maximal connected components of the underlying graph when the network begins to lose processors. The hypercube is one of the most popular interconnection networks. This article addresses the maximal connected components of an n-dimensional cube with faulty processors. We first prove that an n-cube with a set F of at most 2n - 3 failing processors has a component of size at least 2^n - |F| - 1. We then prove that an n-cube with a set F of at most 3n - 6 missing processors has a component of size at least 2^n - |F| - 2.
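The first bound in this abstract (with at most 2n - 3 faulty processors, some component has at least 2^n - |F| - 1 nodes) can be checked numerically on small cubes. The sketch below is a plain breadth-first search written for illustration, not code from the article. It assumes nodes are encoded as integers with one bit per dimension, and the function name is hypothetical.

```python
from collections import deque


def largest_component(n, faulty):
    """Size of the largest connected component of the n-cube without `faulty`.

    Assumption: nodes are the integers 0 .. 2**n - 1 and neighbours differ
    in exactly one bit.
    """
    faulty = set(faulty)
    seen = set()
    best = 0
    for start in range(2 ** n):
        if start in faulty or start in seen:
            continue
        # BFS over the fault-free nodes reachable from `start`
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for bit in range(n):
                nxt = node ^ (1 << bit)
                if nxt not in faulty and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        best = max(best, size)
    return best
```

Taking n = 4 and F = {1, 2, 4, 8}, the four neighbours of node 0, isolates node 0 and leaves the other 11 fault-free nodes connected. Here |F| = 4 is within the limit 2n - 3 = 5, and the largest component has exactly 2^4 - 4 - 1 = 11 nodes, so the bound is met with equality.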
Abstract:
The generalized honeycomb torus is a candidate for interconnection network architectures; it includes the honeycomb torus, the honeycomb rectangular torus and the honeycomb parallelogramic torus as special cases. The existence of a Hamiltonian cycle is a basic requirement for interconnection networks, since it helps map a "token ring" parallel algorithm onto the associated network in an efficient way. Cho and Hsu [Inform. Process. Lett. 86 (4) (2003) 185-190] conjectured that every generalized honeycomb torus is Hamiltonian. In this paper, we prove this conjecture. (C) 2004 Elsevier B.V. All rights reserved.
Abstract:
In evaluating the fault tolerance of an interconnection network, it is essential to estimate the size of a maximal connected component of the network in the presence of faulty processors. The hypercube is one of the most popular interconnection networks. In this paper, we prove that for n >= 6, an n-dimensional cube with a set F of at most 4n - 10 failing processors has a component of size at least 2^n - |F| - 3. This result demonstrates the superiority of the hypercube in terms of fault tolerance.
Abstract:
The locally twisted cube is a newly introduced interconnection network for parallel computing. Ring embedding is an important issue in evaluating the performance of an interconnection network. In this paper, we investigate the problem of embedding rings into a locally twisted cube. Our main contribution is to show that, for each integer l in {4, 5, ..., 2^n}, a ring of length l can be embedded into an n-dimensional locally twisted cube so that both the dilation and the load factor are one. As a result, a locally twisted cube is Hamiltonian. We conclude that a locally twisted cube is superior to a hypercube in terms of ring embedding capability. (C) 2004 Elsevier Ltd. All rights reserved.