111 results for Multicast
Abstract:
Peer-reviewed
Abstract:
As technology geometries have shrunk to the deep submicron regime, the communication delay and power consumption of global interconnections in high performance Multi-Processor Systems-on-Chip (MPSoCs) are becoming a major bottleneck. The Network-on-Chip (NoC) architecture paradigm, based on a modular packet-switched mechanism, can address many of the on-chip communication issues such as performance limitations of long interconnects and integration of a large number of Processing Elements (PEs) on a chip. The choice of routing protocol and NoC structure can have a significant impact on performance and power consumption in on-chip networks. In addition, building a high performance, area and energy efficient on-chip network for multicore architectures requires a novel on-chip router allowing a larger network to be integrated on a single die with reduced power consumption. On top of that, network interfaces are employed to decouple computation resources from communication resources, to provide the synchronization between them, and to achieve backward compatibility with existing IP cores. Three adaptive routing algorithms are presented as part of this thesis. The first routing protocol is a congestion-aware adaptive routing algorithm for 2D mesh NoCs which does not support multicast (one-to-many) traffic, while the other two protocols are adaptive routing models supporting both unicast (one-to-one) and multicast traffic. A streamlined on-chip router architecture is also presented for avoiding congested areas in 2D mesh NoCs by employing efficient input and output selection. The output selection utilizes an adaptive routing algorithm based on the congestion condition of neighboring routers, while the input selection allows packets to be serviced from each input port according to its congestion level. 
Moreover, in order to increase memory parallelism and bring compatibility with existing IP cores in network-based multiprocessor architectures, adaptive network interface architectures are presented to use multiple SDRAMs which can be accessed simultaneously. In addition, a smart memory controller is integrated in the adaptive network interface to improve the memory utilization and reduce both memory and network latencies. Three Dimensional Integrated Circuits (3D ICs) have been emerging as a viable candidate to achieve better performance and package density as compared to traditional 2D ICs. In addition, combining the benefits of 3D IC and NoC schemes provides a significant performance gain for 3D architectures. In recent years, inter-layer communication across multiple stacked layers (vertical channel) has attracted a lot of interest. In this thesis, a novel adaptive pipeline bus structure is proposed for inter-layer communication to improve the performance by reducing the delay and complexity of traditional bus arbitration. In addition, two mesh-based topologies for 3D architectures are also introduced to mitigate the inter-layer footprint and power dissipation on each layer with a small performance penalty.
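The output and input selection described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's router: the port names, the integer congestion values, and the rule "serve the most congested input first" are assumptions made for illustration, and deadlock avoidance is ignored.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]  # (x, y) position of a router in the 2D mesh


def minimal_ports(cur: Coord, dst: Coord) -> List[str]:
    """Return the output ports that bring a packet closer to dst."""
    (x, y), (dx, dy) = cur, dst
    ports = []
    if dx > x:
        ports.append("E")
    if dx < x:
        ports.append("W")
    if dy > y:
        ports.append("N")
    if dy < y:
        ports.append("S")
    return ports


def select_output(cur: Coord, dst: Coord, congestion: Dict[str, int]) -> str:
    """Output selection: among minimal ports, pick the least congested neighbor.

    `congestion` maps a port name to a hypothetical congestion value
    reported by the neighboring router (for example, occupied buffer slots).
    """
    ports = minimal_ports(cur, dst)
    if not ports:
        return "LOCAL"  # the packet has arrived
    return min(ports, key=lambda p: congestion.get(p, 0))


def select_input(input_congestion: Dict[str, int]) -> str:
    """Input selection (assumed policy): serve the most congested input port."""
    return max(input_congestion, key=input_congestion.get)


if __name__ == "__main__":
    print(select_output((0, 0), (2, 2), {"E": 5, "N": 1}))
    print(select_input({"N": 3, "E": 7, "LOCAL": 0}))
```

Restricting the choice to minimal ports keeps every path shortest; a real design would also need virtual channels or turn restrictions to stay deadlock-free.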
Abstract:
Through advances in technology, System-on-Chip design is moving towards integrating tens to hundreds of intellectual property blocks into a single chip. In such a many-core system, on-chip communication becomes a performance bottleneck for high performance designs. Network-on-Chip (NoC) has emerged as a viable solution for the communication challenges in highly complex chips. The NoC architecture paradigm, based on a modular packet-switched mechanism, can address many of the on-chip communication challenges such as wiring complexity, communication latency, and bandwidth. Furthermore, the combined benefits of 3D IC and NoC schemes provide the possibility of designing a high performance system in a limited chip area. The major advantages of 3D NoCs are the considerable reductions in average latency and power consumption. There are several factors degrading the performance of NoCs. In this thesis, we investigate three main performance-limiting factors: network congestion, faults, and the lack of efficient multicast support. We address these issues by means of routing algorithms. Congestion of data packets may lead to increased network latency and power consumption. Thus, we propose three different approaches for alleviating such congestion in the network. The first approach is based on measuring the congestion information in different regions of the network, distributing the information over the network, and utilizing this information when making a routing decision. The second approach employs a learning method to dynamically find the less congested routes according to the underlying traffic. The third approach is based on a fuzzy-logic technique to perform better routing decisions when traffic information of different routes is available. Faults affect performance significantly, because packets must then take longer paths to be routed around the faults, which in turn increases congestion around the faulty regions. 
We propose four methods to tolerate faults at the link and switch level by using only the shortest paths as long as such a path exists. The unique characteristic of these methods is that they tolerate faults while also maintaining the performance of NoCs. To the best of our knowledge, these algorithms are the first approaches to bypass faults prior to reaching them while avoiding unnecessary misrouting of packets. Current implementations of multicast communication result in a significant performance loss for unicast traffic. This is due to the fact that the routing rules of multicast packets limit the adaptivity of unicast packets. We present an approach in which both unicast and multicast packets can be efficiently routed within the network. While providing more efficient multicast support, the proposed approach does not affect the performance of unicast routing at all. In addition, in order to reduce the overall path length of multicast packets, we present several partitioning methods along with their analytical models for latency measurement. This approach is discussed in the context of 3D mesh networks.
Abstract:
Multiprocessor system-on-chip (MPSoC) designs utilize the available technology and communication architectures to meet the requirements of upcoming applications. In an MPSoC, the communication platform is both the key enabler and the key differentiator for realizing efficient designs. It provides product differentiation to meet a diverse, multi-dimensional set of design constraints, including performance, power, energy, reconfigurability, scalability, cost, reliability and time-to-market. The communication resources of a single interconnection platform cannot be fully utilized by all kinds of applications; for example, providing higher communication bandwidth for applications that are computation intensive but not data intensive is often infeasible in a practical implementation. This thesis aims to perform architecture-level design space exploration towards efficient and scalable resource utilization for MPSoC communication architectures. In order to meet the performance requirements within the design constraints, careful selection of the MPSoC communication platform and resource-aware partitioning and mapping of the application play an important role. To enhance the utilization of communication resources, a variety of techniques can be used, such as resource sharing, multicast to avoid re-transmission of identical data, and adaptive routing. For implementation, these techniques should be customized according to the platform architecture. To address the resource utilization of MPSoC communication platforms, a variety of architectures with different design parameters and performance levels are selected, namely Segmented bus (SegBus), Network-on-Chip (NoC) and Three-Dimensional NoC (3D-NoC). Average packet latency and power consumption are the evaluation parameters for the proposed techniques. In conventional computing architectures, a fault in one component makes the connected fault-free components inoperative. 
A resource-sharing approach can utilize the fault-free components to retain system performance by reducing the impact of faults. Design space exploration also helps to narrow down the selection of an MPSoC architecture that can meet the performance requirements within the design constraints.
Abstract:
Extending IPv6 to IEEE 802.15.4-based Low power Wireless Personal Area Networks requires efficient header compression mechanisms to adapt to their limited bandwidth, memory and energy constraints. This paper presents an experimental evaluation of an improved header compression scheme which provides better compression of IPv6 multicast addresses and UDP port numbers compared to existing mechanisms. This scheme outperforms the existing compression mechanism in terms of data throughput of the network and energy consumption of nodes. It enhances throughput by up to 8% and reduces transmission energy of nodes by about 5%.
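For context, the sketch below shows how stateless multicast address compression works in the 6LoWPAN header compression standard RFC 6282. Treating RFC 6282 as the baseline is an assumption, since the abstract does not name the existing mechanisms, and this is not the improved scheme the paper evaluates. The function name `compress_multicast` is hypothetical, and only the inline address bytes are produced, not the surrounding IPHC header bits.

```python
import ipaddress


def compress_multicast(addr: str) -> bytes:
    """Return the inline bytes for an IPv6 multicast destination address,
    following the stateless multicast forms of RFC 6282 (M=1, DAC=0)."""
    raw = ipaddress.IPv6Address(addr).packed  # 16 bytes
    if raw[0] != 0xFF:
        raise ValueError("not an IPv6 multicast address")
    # ff02::00XX -> 8 bits carried inline (only the last byte)
    if raw[1] == 0x02 and raw[2:15] == bytes(13):
        return raw[15:]
    # ffXX::00XX:XXXX -> 32 bits (flags/scope byte + last 3 bytes)
    if raw[2:13] == bytes(11):
        return raw[1:2] + raw[13:]
    # ffXX::00XX:XXXX:XXXX -> 48 bits (flags/scope byte + last 5 bytes)
    if raw[2:11] == bytes(9):
        return raw[1:2] + raw[11:]
    # No compressed form applies: carry the full 128-bit address
    return raw


if __name__ == "__main__":
    for a in ("ff02::1", "ff05::1:3", "ff02::1:ff00:1", "ff02:1::1"):
        print(a, len(compress_multicast(a)), "bytes inline")
```

The shorter the inline form, the more of the small IEEE 802.15.4 frame is left for payload, which is the lever such compression schemes use to improve throughput and energy consumption.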
Abstract:
In this report, we discuss the application of global optimization and Evolutionary Computation to distributed systems. We therefore selected and classified many publications, giving an insight into the wide variety of optimization problems which arise in distributed systems. Some interesting approaches from different areas will be discussed in greater detail with the use of illustrative examples.
Abstract:
This dissertation introduces and studies systems of parallel communicating restarting automata (PCRA systems). They combine two well-known concepts from formal languages and automata theory: the model of restarting automata and so-called PC systems (systems of parallel communicating components). A PCRA system consists of finitely many restarting automata which, on the one hand, perform local computations in parallel and independently of one another and, on the other hand, may communicate with each other. Communication follows a fixed communication protocol that is realized by means of special communication states. An essential feature of the communication structure in systems of cooperating components is whether communication is centralized or non-centralized. Whereas in a non-centralized communication structure every component may communicate with every other component, all communication in a centralized structure takes place exclusively with a designated master component. One of the most important results of this work shows that centralized and non-centralized systems have the same computational power (which is in general not the case for PC systems). Moreover, using multicast or broadcast communication in addition to point-to-point communication does not increase the computational power either. Furthermore, the expressive power of PCRA systems is studied and compared with that of PC systems of finite automata and with that of multi-head automata. 
PC systems of finite automata are known to have the same expressive power as one-way multi-head automata and form a lower bound for the expressive power of PCRA systems with one-way components. In fact, PCRA systems are more powerful than PC systems of finite automata even when the individual components have the same expressive power, that is, when they characterize the regular languages. For PCRA systems with two-way components, the language classes of two-way multi-head automata in the deterministic and nondeterministic cases are shown to be lower bounds; these in turn correspond to the well-known complexity classes L (deterministic logarithmic space) and NL (nondeterministic logarithmic space). The class of context-sensitive languages is shown to be an upper bound. In addition, extensions of restarting automata are considered (the nonforgetting property and the shrinking property) which increase the computational power of individual components but do not increase the power of systems. The language classes characterized by PCRA systems are closed under various language operations, and some of them are even abstract families of languages (AFLs). Finally, the decidability of specific problems for PCRA systems is investigated. It is shown that emptiness, universality, inclusion, equality, and finiteness are not semidecidable even for systems with two restarting automata of the weakest type. For the word problem, it is shown that it is decidable in quadratic time in the deterministic case and in exponential time in the nondeterministic case.
Abstract:
In previous work we proposed a multi-objective traffic engineering scheme (MHDB-S model) using different distribution trees to multicast several flows. In this paper, we propose a heuristic algorithm to create multiple point-to-multipoint (p2mp) LSPs based on the optimum sub-flow values obtained with our MHDB-S model. Moreover, a general problem in supporting multicasting in MPLS networks is the lack of labels. To reduce the number of labels used, a label space reduction algorithm is also considered.
Abstract:
This paper presents a parallel genetic algorithm for the Steiner Problem in Networks (SPN). Several previous papers have proposed adopting GAs and other metaheuristics to solve the SPN, demonstrating the validity of their approaches. This work differs from them for two main reasons: the dimension and characteristics of the networks adopted in the experiments, and the aim from which it originated. The motivation for this work was to build a term of comparison for validating deterministic and computationally inexpensive algorithms that can be used in practical engineering applications, such as multicast transmission in the Internet. In turn, the large dimensions of our sample networks require a parallel implementation of the Steiner GA, which is able to deal with such large problem instances.
Abstract:
The Bloom filter is a space-efficient randomized data structure for representing a set and supporting membership queries. Bloom filters intrinsically allow false positives. However, the space savings they offer outweigh the disadvantage if the false positive rates are kept sufficiently low. Inspired by the recent application of the Bloom filter in a novel multicast forwarding fabric, this paper proposes a variant of the Bloom filter, the optihash. The optihash introduces an optimization for the false positive rate at the stage of Bloom filter formation, using the same amount of space at the cost of slightly more processing than the classic Bloom filter. Bloom filters are often used in situations where a fixed amount of space is a primary constraint. We present the optihash as a good alternative to Bloom filters since the amount of space is the same and the improvements in false positives can justify the additional processing. Specifically, we show via simulations and numerical analysis that with the optihash the occurrence of false positives can be reduced and controlled at the cost of a small amount of additional processing. The simulations are carried out for in-packet forwarding. In this framework, the Bloom filter is used as a compact link/route identifier and is placed in the packet header to encode the route. At each node, the Bloom filter is queried for membership in order to make forwarding decisions. A false positive in the forwarding decision translates into packets being forwarded along an unintended outgoing link. By using the optihash, false positives can be reduced. The optimization processing is carried out in an entity termed the Topology Manager, which is part of the control plane of the multicast forwarding fabric. This processing is only carried out on a per-session basis, not for every packet. 
The aim of this paper is to present the optihash and evaluate its false positive performance via simulations in order to measure the influence of different parameters on the false positive rate. The false positive rate for the optihash is then compared with the false positive probability of the classic Bloom filter.
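The in-packet forwarding idea can be sketched in a few lines of Python. The abstract does not describe how the optihash performs its optimization, so the selection step below (build one filter per candidate hash seed and keep the one with the fewest false positives) is only an assumed stand-in for "optimize at filter formation, per session". The filter width `M_BITS`, the hash count `K`, the link identifiers, and all function names are hypothetical.

```python
import hashlib
from typing import Iterable, List, Tuple

M_BITS = 64  # assumed width of the in-packet filter
K = 3        # assumed number of hash functions per link


def bit_positions(link_id: str, seed: int) -> List[int]:
    """Derive K bit positions for a link from a seeded SHA-256 hash."""
    positions = []
    for i in range(K):
        digest = hashlib.sha256(f"{seed}:{i}:{link_id}".encode()).digest()
        positions.append(int.from_bytes(digest[:8], "big") % M_BITS)
    return positions


def build_filter(links: Iterable[str], seed: int) -> int:
    """Encode a set of links (the route) as a Bloom filter bitmask."""
    bf = 0
    for link in links:
        for pos in bit_positions(link, seed):
            bf |= 1 << pos
    return bf


def matches(bf: int, link_id: str, seed: int) -> bool:
    """Forwarding test at a node: are all of the link's bits set?"""
    return all((bf >> pos) & 1 for pos in bit_positions(link_id, seed))


def false_positives(bf: int, seed: int, off_tree: Iterable[str]) -> List[str]:
    """Links not on the route that the filter would wrongly match."""
    return [link for link in off_tree if matches(bf, link, seed)]


def best_filter(tree: List[str], off_tree: List[str],
                seeds: Iterable[int]) -> Tuple[int, int]:
    """Assumed per-session optimization: keep the seed whose filter
    yields the fewest false positives on the known off-tree links."""
    best = min(seeds, key=lambda s: len(
        false_positives(build_filter(tree, s), s, off_tree)))
    return best, build_filter(tree, best)


if __name__ == "__main__":
    tree = ["A-B", "B-C", "B-D"]
    off_tree = [f"X{i}-Y{i}" for i in range(40)]
    seed, bf = best_filter(tree, off_tree, range(8))
    print(seed, len(false_positives(bf, seed, off_tree)))
```

In this sketch the chosen seed would have to travel with the filter or be known to every node; the space used by the filter itself is the same for every candidate, which mirrors the fixed-space constraint stated in the abstract.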
Abstract:
IPTV is now offered by several operators in Europe, the US and Asia using broadcast video over private IP networks that are isolated from the Internet. IPTV services rely on transmission of live (real-time) video and/or stored video. Video on Demand (VoD) and Time-shifted TV are implemented by IP unicast, and Broadcast TV (BTV) and Near Video on Demand are implemented by IP multicast. IPTV services require QoS guarantees and can tolerate no more than 10^-6 packet loss probability, 200 ms delay, and 50 ms jitter. Low delay is essential for satisfactory trick mode performance (pause, resume, fast forward) for VoD, and for fast channel change time for BTV. Internet Traffic Engineering (TE) is defined in RFC 3272 and involves both capacity management and traffic management. Capacity management includes capacity planning, routing control, and resource management. Traffic management includes (1) nodal traffic control functions such as traffic conditioning, queue management, and scheduling, and (2) other functions that regulate traffic flow through the network or that arbitrate access to network resources. An IPTV network architecture includes multiple networks (core network, metro network, access network and home network) that connect devices (super head-end, video hub office, video serving office, home gateway, set-top box). Each IP router in the core and metro networks implements some queueing and packet scheduling mechanism at the output link controller. Popular schedulers in IP networks include Priority Queueing (PQ), Class-Based Weighted Fair Queueing (CBWFQ), and Low Latency Queueing (LLQ), which combines PQ and CBWFQ. The thesis analyzes several packet scheduling algorithms that can optimize the tradeoff between system capacity and end user performance for the traffic classes. The FIFO, PQ, and GPS queueing methods were previously implemented in the simulator. This thesis aims to implement the LLQ scheduler in the simulator and to evaluate the performance of these packet schedulers. 
The simulator was provided by Ernst Nordström; it was built in the Visual C++ 2008 environment, and the results were tested and analyzed in MATLAB 7.0 under Windows Vista.
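The statement that LLQ combines PQ and CBWFQ can be made concrete with a small Python sketch. It is not the thesis's simulator code: the class name `LLQScheduler` is hypothetical, the class-based part is approximated by packet-count weighted round-robin rather than true weighted fair queueing, packet sizes are ignored, and the policing that real LLQ implementations apply to the priority queue is not modeled.

```python
from collections import deque
from typing import Deque, Dict, Optional


class LLQScheduler:
    """One strict-priority queue plus weighted class queues (simplified LLQ)."""

    def __init__(self, weights: Dict[str, int]) -> None:
        # weights: class name -> packets allowed per round (must be > 0)
        self.weights = dict(weights)
        self.priority: Deque[str] = deque()
        self.queues: Dict[str, Deque[str]] = {c: deque() for c in weights}
        self.credits: Dict[str, int] = dict(weights)

    def enqueue(self, packet: str, traffic_class: Optional[str] = None) -> None:
        """Packets without a class go to the low-latency priority queue."""
        if traffic_class is None:
            self.priority.append(packet)
        else:
            self.queues[traffic_class].append(packet)

    def dequeue(self) -> Optional[str]:
        # 1. Strict priority: delay-sensitive traffic is always served first.
        if self.priority:
            return self.priority.popleft()
        # 2. Nothing waiting in any class queue.
        if not any(self.queues.values()):
            return None
        # 3. Weighted round-robin over the class queues.
        while True:
            for name, queue in self.queues.items():
                if queue and self.credits[name] > 0:
                    self.credits[name] -= 1
                    return queue.popleft()
            # Every backlogged class used up its share: start a new round.
            self.credits = dict(self.weights)


if __name__ == "__main__":
    sched = LLQScheduler({"video": 2, "data": 1})
    for p in ("v1", "v2", "v3"):
        sched.enqueue(p, "video")
    for p in ("d1", "d2"):
        sched.enqueue(p, "data")
    sched.enqueue("voice1")  # priority traffic
    out = []
    while (pkt := sched.dequeue()) is not None:
        out.append(pkt)
    print(out)
```

The strict-priority queue gives delay-sensitive traffic the low latency the abstract calls for, while the weights divide the remaining link capacity among the other classes.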
SAM: an adaptive system for transmitting and receiving multimedia signals over computer networks
Abstract:
This thesis presents SAM (Sistema Adaptativo para Multimídia, or Adaptive System for Multimedia), a tool for transmitting and receiving multimedia signals over computer networks. The tool can be used for recorded multimedia transmissions (video on demand) or live ones, such as synchronous distance-learning classes, shows, and TV channels. Its main benefit is to increase the performance and accessibility of the transmission by using a layered-coded signal in which each layer is transmitted as a separate multicast group. Using the tool, the receiver adapts according to its current network and machine capacity. Thus, for example, a receiver with modem access and another on a local network can each watch the transmission at the best quality possible for them. The main focus of the thesis is SAM's congestion control algorithm, named ALM (Adaptive Layered Multicast). The ALM algorithm aims to infer the level of congestion in the network and determine the number of layers the receiver can support, so that the received bandwidth generates network traffic that is fair to other competing ALM flows or competing TCP flows. Because these are multimedia transmissions, stable signal reception is desirable, that is, without many variations in quality; however, competing TCP traffic is very aggressive, which makes it difficult to create a stable algorithm. Two algorithms were therefore developed, which form the core of this thesis: ALMP (aimed at private networks) and ALMTF (designed to compete with TCP traffic). The internal elements of the network, such as routers, require no modifications, and the protocol works over the current Internet. To validate the method, the NS2 (Network Simulator) software was used, with modifications to the code where required. 
In addition, an initial implementation was carried out to compare the simulation results with those of the real implementation. All supporting programs developed for this thesis, as well as most of the bibliography used, the simulation results, the code of the algorithms in the simulator, and the code of the algorithm in the real implementation, can be found at http://www.inf.unisinos.br/~roesler/tese and on the CD attached to this thesis, which is described in Annex B.
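The receiver-side idea of subscribing to as many layers as the current capacity allows can be sketched as follows. This is a generic receiver-driven layered multicast illustration, not the ALM, ALMP, or ALMTF algorithms from the thesis: the function names, the layer rates, and the loss threshold are all assumptions.

```python
from typing import Sequence


def layers_supported(layer_rates_kbps: Sequence[float],
                     available_kbps: float) -> int:
    """Number of cumulative layers whose total rate fits the estimated
    bandwidth. Layers are ordered from the base layer upwards, and each
    one corresponds to a separate multicast group the receiver can join."""
    total = 0.0
    count = 0
    for rate in layer_rates_kbps:
        if total + rate > available_kbps:
            break
        total += rate
        count += 1
    return count


def adapt(current_layers: int, max_layers: int, loss_rate: float,
          loss_threshold: float = 0.05) -> int:
    """One adaptation step (assumed policy): leave the top layer's group
    when loss is high, join the next layer's group when no loss is seen."""
    if loss_rate > loss_threshold and current_layers > 1:
        return current_layers - 1
    if loss_rate == 0 and current_layers < max_layers:
        return current_layers + 1
    return current_layers


if __name__ == "__main__":
    rates = [32, 64, 128, 256]  # hypothetical per-layer rates in kbit/s
    print(layers_supported(rates, 56))     # modem-class receiver
    print(layers_supported(rates, 10000))  # LAN-class receiver
    print(adapt(3, 4, loss_rate=0.10))
```

Because each receiver decides on its own which groups to join, routers need no changes, which matches the deployment property stated in the abstract. Reacting to every loss sample, as this sketch does, would produce the quality oscillations the thesis sets out to avoid.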
Abstract:
This work reports the study, design, and implementation of a distributed application that explores basic mechanisms used in group communication. The study focuses on the development and use of virtual synchrony concepts and on results applicable to fault tolerance. The goal of this work is to demonstrate the practical repercussions of the main characteristics of the virtual synchrony model in supporting fault tolerance. It builds on the concepts and primitives of message-passing distributed systems, as well as programming alternatives based on the concept of groups. The final result is a client/server system, developed in Java RMI, that simulates a distributed system with group views updated whenever significant events occur in the composition of the groups (virtual synchrony). The system handles crash failures of processes, including the server (the group coordinator), and allows querying data stored on different servers. It was designed and implemented in a Windows NT environment, with the TCP/IP protocol. The final result is a set of classes that can be used to control group composition (membership). The application developed in this work provides six services: adding new members to the group, with the views of all members updated to include the new member's identification; sending multicast messages to the members of the group; sending unicast messages to a specific member of the group; voluntary departure of members from the group, with the view updated for all members; failure monitoring; and viewing the members of the group. 
Particular attention is given to handling a suspected failure of the group coordinator: if it crashes, the oldest active member is designated as the new coordinator, and all group members are updated on the current state of group coordination.
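The coordinator hand-over rule (the oldest active member takes over) can be illustrated with a short sketch. The original system was written in Java RMI; the Python below is only a single-process model of the membership view, and the class name `GroupView` and its methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GroupView:
    """Group membership view; members are kept in join order (oldest first)."""
    members: List[str] = field(default_factory=list)
    view_id: int = 0  # incremented on every membership change

    @property
    def coordinator(self) -> Optional[str]:
        # The oldest active member acts as coordinator.
        return self.members[0] if self.members else None

    def join(self, member: str) -> None:
        """Add a member and install a new view."""
        if member not in self.members:
            self.members.append(member)
            self.view_id += 1

    def remove(self, member: str) -> None:
        """Handle a voluntary leave or a detected crash the same way:
        drop the member and install a new view. If the coordinator was
        removed, the next-oldest member becomes coordinator implicitly."""
        if member in self.members:
            self.members.remove(member)
            self.view_id += 1


if __name__ == "__main__":
    view = GroupView()
    for name in ("server", "client1", "client2"):
        view.join(name)
    print(view.coordinator, view.view_id)
    view.remove("server")  # coordinator crash detected by the monitor
    print(view.coordinator, view.view_id)
```

In a real distributed setting, every surviving member must install the same sequence of views, which is what the virtual synchrony model guarantees; this sketch leaves out the message exchange needed for that agreement.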
Abstract:
VALENTIM, R. A. M. ; MORAIS, A. H. F. ; SOUZA, V. S. V ; ARAUJO JUNIOR, H. B. ; BRANDAO, G. B. ; GUERREIRO, A. M. G. . Rede de Controle em Ambiente Hospitalar: um protocolo multiciclos para automação hospitalar sobre IEEE 802.3 com IGMP Snooping. Revista Ciência e Tecnologia, v. 11, p. 19, 2009
Abstract:
This article presents the implementation of a distributed virtual reality system that integrates services offered by the CORBA (Common Object Request Broker Architecture) platform with WorldToolkit, Sense8's development environment for real-time 3D graphics applications. The application developed to validate this integration is a virtual city, with an emphasis on its roadways, vehicles (movable objects), and buildings (immovable objects). In this virtual world, several users can interact, each controlling his or her own car. Since the modelling of the application took into account the criteria and principles of transport engineering, the aim is to use it in the planning, design, and construction of roadways for vehicles. The system was structured according to the client/server approach, using multicast communication among the participating nodes. The CORBA implementation chosen was Iona's ORBIX software. The performance results obtained are presented and discussed at the end.