163 resultados para Packet-forwarding scheme
em Indian Institute of Science - Bangalore - Índia
Resumo:
Packet forwarding is a memory-intensive application requiring multiple accesses through a trie structure. With the requirement to process packets at line rates, high-performance routers need to forward millions of packets every second with each packet needing up to seven memory accesses. Earlier work shows that a single cache for the nodes of a trie can reduce the number of external memory accesses. It is observed that the locality characteristics of the level-one nodes of a trie are significantly different from those of lower level nodes. Hence, we propose a heterogeneously segmented cache architecture (HSCA) which uses separate caches for level-one and lower level nodes, each with carefully chosen sizes. Besides reducing misses, segmenting the cache allows us to focus on optimizing the more frequently accessed level-one node segment. We find that due to the nonuniform distribution of nodes among cache sets, the level-one nodes cache is susceptible t high conflict misses. We reduce conflict misses by introducing a novel two-level mapping-based cache placement framework. We also propose an elegant way to fit the modified placement function into the cache organization with minimal increase in access time. Further, we propose an attribute preserving trace generation methodology which emulates real traces and can generate traces with varying locality. Performanc results reveal that our HSCA scheme results in a 32 percent speedup in average memory access time over a unified nodes cache. Also, HSC outperforms IHARC, a cache for lookup results, with as high as a 10-fold speedup in average memory access time. Two-level mappin further enhances the performance of the base HSCA by up to 13 percent leading to an overall improvement of up to 40 percent over the unified scheme.
Resumo:
Packet forwarding is a memory-intensive application requiring multiple accesses through a trie structure. The efficiency of a cache for this application critically depends on the placement function to reduce conflict misses. Traditional placement functions use a one-level mapping that naively partitions trie-nodes into cache sets. However, as a significant percentage of trie nodes are not useful, these schemes suffer from a non-uniform distribution of useful nodes to sets. This in turn results in increased conflict misses. Newer organizations such as variable associativity caches achieve flexibility in placement at the expense of increased hit-latency. This makes them unsuitable for L1 caches.We propose a novel two-level mapping framework that retains the hit-latency of one-level mapping yet incurs fewer conflict misses. This is achieved by introducing a secondlevel mapping which reorganizes the nodes in the naive initial partitions into refined partitions with near-uniform distribution of nodes. Further as this remapping is accomplished by simply adapting the index bits to a given routing table the hit-latency is not affected. We propose three new schemes which result in up to 16% reduction in the number of misses and 13% speedup in memory access time. In comparison, an XOR-based placement scheme known to perform extremely well for general purpose architectures, can obtain up to 2% speedup in memory access time.
Resumo:
In this paper, we design a new dynamic packet scheduling scheme suitable for differentiated service (DiffServ) network. Designed dynamic benefit weighted scheduling (DBWS) uses a dynamic weighted computation scheme loosely based on weighted round robin (WRR) policy. It predicts the weight required by expedited forwarding (EF) service for the current time slot (t) based on two criteria; (i) previous weight allocated to it at time (t-1), and (ii) the average increase in the queue length of EF buffer. This prediction provides smooth bandwidth allocation to all the services by avoiding overbooking of resources for EF service and still providing guaranteed services for it. The performance is analyzed for various scenarios at high, medium and low traffic conditions. The results show that packet loss is minimized, end to end delay is minimized and jitter is reduced and therefore meet quality of service (QoS) requirement of a network.
Resumo:
Previous studies have shown that buffering packets in DRAM is a performance bottleneck. In order to understand the impediments in accessing the DRAM, we developed a detailed Petri net model of IP forwarding application on IXP2400 that models the different levels of the memory hierarchy. The cell based interface used to receive and transmit packets in a network processor leads to some small size DRAM accesses. Such narrow accesses to the DRAM expose the bank access latency, reducing the bandwidth that can be realized. With real traces up to 30% of the accesses are smaller than the cell size, resulting in 7.7% reduction in DRAM bandwidth. To overcome this problem, we propose buffering these small chunks of data in the on chip scratchpad memory. This scheme also exploits greater degree of parallelism between different levels of the memory hierarchy. Using real traces from the internet, we show that the transmit rate can be improved by an average of 21% over the base scheme without the use of additional hardware. Further, the impact of different traffic patterns on the network processor resources is studied. Under real traffic conditions, we show that the data bus which connects the off-chip packet buffer to the micro-engines, is the obstacle in achieving higher throughput.
Resumo:
Network processors today consist of multiple parallel processors (micro engines) with support for multiple threads to exploit packet level parallelism inherent in network workloads. With such concurrency, packet ordering at the output of the network processor cannot be guaranteed. This paper studies the effect of concurrency in network processors on packet ordering. We use a validated Petri net model of a commercial network processor, Intel IXP 2400, to determine the extent of packet reordering for IPv4 forwarding application. Our study indicates that in addition to the parallel processing in the network processor, the allocation scheme for the transmit buffer also adversely impacts packet ordering. In particular, our results reveal that these packet reordering results in a packet retransmission rate of up to 61%. We explore different transmit buffer allocation schemes namely, contiguous, strided, local, and global which reduces the packet retransmission to 24%. We propose an alternative scheme, packet sort, which guarantees complete packet ordering while achieving a throughput of 2.5 Gbps. Further, packet sort outperforms the in-built packet ordering schemes in the IXP processor by up to 35%.
Resumo:
We consider a wireless sensor network whose main function is to detect certain infrequent alarm events, and to forward alarm packets to a base station, using geographical forwarding. The nodes know their locations, and they sleep-wake cycle, waking up periodically but not synchronously. In this situation, when a node has a packet to forward to the sink, there is a trade-off between how long this node waits for a suitable neighbor to wake up and the progress the packet makes towards the sink once it is forwarded to this neighbor. Hence, in choosing a relay node, we consider the problem of minimizing average delay subject to a constraint on the average progress. By constraint relaxation, we formulate this next hop relay selection problem as a Markov decision process (MDP). The exact optimal solution (BF (Best Forward)) can be found, but is computationally intensive. Next, we consider a mathematically simplified model for which the optimal policy (SF (Simplified Forward)) turns out to be a simple one-step-look-ahead rule. Simulations show that SF is very close in performance to BF, even for reasonably small node density. We then study the end-to-end performance of SF in comparison with two extremal policies: Max Forward (MF) and First Forward (FF), and an end-to-end delay minimising policy proposed by Kim et al. 1]. We find that, with appropriate choice of one hop average progress constraint, SF can be tuned to provide a favorable trade-off between end-to-end packet delay and the number of hops in the forwarding path.
Resumo:
A link failure in the path of a virtual circuit in a packet data network will lead to premature disconnection of the circuit by the end-points. A soft failure will result in degraded throughput over the virtual circuit. If these failures can be detected quickly and reliably, then appropriate rerouteing strategies can automatically reroute the virtual circuits that use the failed facility. In this paper, we develop a methodology for analysing and designing failure detection schemes for digital facilities. Based on errored second data, we develop a Markov model for the error and failure behaviour of a T1 trunk. The performance of a detection scheme is characterized by its false alarm probability and the detection delay. Using the Markov model, we analyse the performance of detection schemes that use physical layer or link layer information. The schemes basically rely upon detecting the occurrence of severely errored seconds (SESs). A failure is declared when a counter, that is driven by the occurrence of SESs, reaches a certain threshold.For hard failures, the design problem reduces to a proper choice;of the threshold at which failure is declared, and on the connection reattempt parameters of the virtual circuit end-point session recovery procedures. For soft failures, the performance of a detection scheme depends, in addition, on how long and how frequent the error bursts are in a given failure mode. We also propose and analyse a novel Level 2 detection scheme that relies only upon anomalies observable at Level 2, i.e. CRC failures and idle-fill flag errors. Our results suggest that Level 2 schemes that perform as well as Level 1 schemes are possible.
Resumo:
Timer-based mechanisms are often used in several wireless systems to help a given (sink) node select the best helper node among many available nodes. Specifically, a node transmits a packet when its timer expires, and the timer value is a function of its local suitability metric. In practice, the best node gets selected successfully only if no other node's timer expires within a `vulnerability' window after its timer expiry. In this paper, we provide a complete closed-form characterization of the optimal metric-to-timer mapping that maximizes the probability of success for any probability distribution function of the metric. The optimal scheme is scalable, distributed, and much better than the popular inverse metric timer mapping. We also develop an asymptotic characterization of the optimal scheme that is elegant and insightful, and accurate even for a small number of nodes.
Resumo:
Our work is motivated by geographical forwarding of sporadic alarm packets to a base station in a wireless sensor network (WSN), where the nodes are sleep-wake cycling periodically and asynchronously. We seek to develop local forwarding algorithms that can be tuned so as to tradeoff the end-to-end delay against a total cost, such as the hop count or total energy. Our approach is to solve, at each forwarding node enroute to the sink, the local forwarding problem of minimizing one-hop waiting delay subject to a lower bound constraint on a suitable reward offered by the next-hop relay; the constraint serves to tune the tradeoff. The reward metric used for the local problem is based on the end-to-end total cost objective (for instance, when the total cost is hop count, we choose to use the progress toward sink made by a relay as the reward). The forwarding node, to begin with, is uncertain about the number of relays, their wake-up times, and the reward values, but knows the probability distributions of these quantities. At each relay wake-up instant, when a relay reveals its reward value, the forwarding node's problem is to forward the packet or to wait for further relays to wake-up. In terms of the operations research literature, our work can be considered as a variant of the asset selling problem. We formulate our local forwarding problem as a partially observable Markov decision process (POMDP) and obtain inner and outer bounds for the optimal policy. Motivated by the computational complexity involved in the policies derived out of these bounds, we formulate an alternate simplified model, the optimal policy for which is a simple threshold rule. We provide simulation results to compare the performance of the inner and outer bound policies against the simple policy, and also against the optimal policy when the source knows the exact number of relays. Observing the good performance and the ease of implementation of the simple policy, we apply it to our motivating problem, i.e., local geographical routing of sporadic alarm packets in a large WSN. We compare the end-to-end performance (i.e., average total delay and average total cost) obtained by the simple policy, when used for local geographical forwarding, against that obtained by the globally optimal forwarding algorithm proposed by Kim et al. 1].
Resumo:
We study the trade-off between delivery delay and energy consumption in a delay tolerant network in which a message (or a file) has to be delivered to each of several destinations by epidemic relaying. In addition to the destinations, there are several other nodes in the network that can assist in relaying the message. We first assume that, at every instant, all the nodes know the number of relays carrying the packet and the number of destinations that have received the packet. We formulate the problem as a controlled continuous time Markov chain and derive the optimal closed loop control (i.e., forwarding policy). However, in practice, the intermittent connectivity in the network implies that the nodes may not have the required perfect knowledge of the system state. To address this issue, we obtain an ODE (i.e., fluid) approximation for the optimally controlled Markov chain. This fluid approximation also yields an asymptotically optimal open loop policy. Finally, we evaluate the performance of the deterministic policy over finite networks. Numerical results show that this policy performs close to the optimal closed loop policy.
Resumo:
We propose a Cooperative Opportunistic Automatic Repeat ReQuest (CoARQ) scheme to solve the HOL-blocking problem in infrastructure IEEE 802.11 WLANs. HOL blocking occurs when the head-of-the-line packet at the Access Point (AP) queue blocks the transmission of packets to other destinations resulting in severe throughput degradation. When the AP transmits a packet to a mobile station (STA), some of the nodes in the vicinity can overhear this packet transmission successfully. If the original transmission by the AP is unsuccessful, our CoARQ scheme chooses the station. STA or AP) with the best channel to the intended receiver as a relay and the chosen relay forwards the AP's packet to the receiver. This way, our scheme removes the bottleneck at the AP, thereby providing significant improvements in the throughput of the AP. We analyse the performance of our scheme in an infrastructure WLAN under a TCP controlled file download scenario and our analytical results are further validated by extensive simulations.
Resumo:
An explicit near-optimal guidance scheme is developed for a terminal rendezvous of a spacecraft with a passive target in circular orbit around the earth. The thrust angle versus time profile for the continuous-thrust, constant-acceleration maneuver is derived, based on the assumption that the components of inertial acceleration due to relative position and velocity are negligible on account of the close proximity between the two spacecraft. The control law is obtained as a ''bilinear tangent law'' and an analytic solution to the state differential equations is obtained by expanding a portion of the integrand as an infinite series in time. A differential corrector method is proposed, to obtain real-time updates to the guidance parameters at regular time intervals. Simulation of the guidance scheme is carried out using the Clohessy-Wiltshire equations of relative motion as well as the inverse-square two-body equations of motion. Results for typical examples are presented.
Resumo:
While frame-invariant solutions for arbitrarily large rotational deformations have been reported through the orthogonal matrix parametrization, derivation of such solutions purely through a rotation vector parametrization, which uses only three parameters and provides a parsimonious storage of rotations, is novel and constitutes the subject of this paper. In particular, we employ interpolations of relative rotations and a new rotation vector update for a strain-objective finite element formulation in the material framework. We show that the update provides either the desired rotation vector or its complement. This rules out an additive interpolation of total rotation vectors at the nodes. Hence, interpolations of relative rotation vectors are used. Through numerical examples, we show that combining the proposed update with interpolations of relative rotations yields frame-invariant and path-independent numerical solutions. Advantages of the present approach vis-a-vis the updated Lagrangian formulation are also analyzed.
Resumo:
Distributed space time coding for wireless relay networks when the source, the destination and the relays have multiple antennas have been studied by Jing and Hassibi. In this set-up, the transmit and the receive signals at different antennas of the same relay are processed and designed independently, even though the antennas are colocated. In this paper, a wireless relay network with single antenna at the source and the destination and two antennas at each of the R relays is considered. A new class of distributed space time block codes called Co-ordinate Interleaved Distributed Space-Time Codes (CIDSTC) are introduced where, in the first phase, the source transmits a T-length complex vector to all the relays;and in the second phase, at each relay, the in-phase and quadrature component vectors of the received complex vectors at the two antennas are interleaved and processed before forwarding them to the destination. Compared to the scheme proposed by Jing-Hassibi, for T >= 4R, while providing the same asymptotic diversity order of 2R, CIDSTC scheme is shown to provide asymptotic coding gain with the cost of negligible increase in the processing complexity at the relays. However, for moderate and large values of P, CIDSTC scheme is shown to provide more diversity than that of the scheme proposed by Jing-Hassibi. CIDSTCs are shown to be fully diverse provided the information symbols take value from an appropriate multidimensional signal set.
Resumo:
The so-called “Scheme of Squares”, displaying an interconnectivity of heterogeneous electron transfer and homogeneous (e.g., proton transfer) reactions, is analysed. Explicit expressions for the various partial currents under potentiostatic conditions are given. The formalism is applicable to several electrode geometries and models (e.g., semi-infinite linear diffusion, rotating disk electrodes, spherical or cylindrical systems) and the analysis is exact. The steady-state (t→∞) expressions for the current are directly given in terms of constant matrices whereas the transients are obtained as Laplace transforms that need to be inverted by approximation of numerical methods. The methodology employs a systems approach which replaces a system of partial differential equations (governing the concentrations of the several electroactive species) by an equivalent set of difference equations obeyed by the various partial currents.