983 resultados para Content Addressable Memory
Resumo:
Fast content addressable data access mechanisms have compelling applications in today's systems. Many of these exploit the powerful wildcard matching capabilities provided by ternary content addressable memories. For example, TCAM based implementations of important algorithms in data mining been developed in recent years; these achieve an an order of magnitude speedup over prevalent techniques. However, large hardware TCAMs are still prohibitively expensive in terms of power consumption and cost per bit. This has been a barrier to extending their exploitation beyond niche and special purpose systems. We propose an approach to overcome this barrier by extending the traditional virtual memory hierarchy to scale up the user visible capacity of TCAMs while mitigating the power consumption overhead. By exploiting the notion of content locality (as opposed to spatial locality), we devise a novel combination of software and hardware techniques to provide an abstraction of a large virtual ternary content addressable space. In the long run, such abstractions enable applications to disassociate considerations of spatial locality and contiguity from the way data is referenced. If successful, ideas for making content addressability a first class abstraction in computing systems can open up a radical shift in the way applications are optimized for memory locality, just as storage class memories are soon expected to shift away from the way in which applications are typically optimized for disk access locality.
Resumo:
An efficient one-step digit-set-restricted modified signed-digit (MSD) adder based on symbolic substitution is presented. In this technique, carry propagation is avoided by introducing reference digits to restrict the intermediate carry and sum digits to {1,0} and {0,1}, respectively. The proposed technique requires significantly fewer minterms and simplifies system complexity compared to the reported one-step MSD addition techniques. An incoherent correlator based on an optoelectronic shared content-addressable memory processor is suggested to perform the addition operation. In this technique, only one set of minterms needs to be stored, independent of the operand length. (C) 2002 society or Photo-Optical Instrumentation Engineers.
Resumo:
A two-step digit-set-restricted modified signed-digit (MSD) adder based on symbolic substitution is presented. In the proposed addition algorithm, carry propagation is avoided by using reference digits to restrict the intermediate MSD carry and sum digits into {(1) over bar ,0} and {0, 1}, respectively. The algorithm requires only 12 minterms to generate the final results, and no complementarity operations for nonzero outputs are involved, which simplifies the system complexity significantly. An optoelectronic shared content-addressable memory based on an incoherent correlator is used for experimental demonstration. (c) 2005 Society of Photo-Optical Instrumentation Engineers.
Resumo:
A two-step digit-set-restricted modified signed-digit (MSD) adder based on symbolic substitution is presented. In the proposed addition algorithm, carry propagation is avoided by using reference digits to restrict the intermediate MSD carry and sum digits into {(1) over bar ,0} and {0, 1}, respectively. The algorithm requires only 12 minterms to generate the final results, and no complementarity operations for nonzero outputs are involved, which simplifies the system complexity significantly. An optoelectronic shared content-addressable memory based on an incoherent correlator is used for experimental demonstration. (c) 2005 Society of Photo-Optical Instrumentation Engineers.
Resumo:
In this paper, a novel configurable content addressable memory (CCAM) cell is proposed, to increase the flexibility of embedded CAMs for SoC employment. It can be easily configured as a Binary CAM (BiCAM) or Ternary CAM (TCAM) without significant penalty of power consumption or searching speed. A 64x128 CCAM array has been built and verified through simulation. ©2007 IEEE.
Resumo:
Content Addressable Memory (CAM) is a special type of Complementary Metal-Oxide-Semiconductor (CMOS) storage element that allows for a parallel search operation on a memory stack in addition to the read and write operations yielded by a conventional SRAM storage array. In practice, it is often desirable to be able to store a “don’t care” state for faster searching operation. However, commercially available CAM chips are forced to accomplish this functionality by having to include two binary memory storage elements per CAM cell,which is a waste of precious area and power resources. This research presents a novel CAM circuit that achieves the “don’t care” functionality with a single ternary memory storage element. Using the recent development of multiple-voltage-threshold (MVT) CMOS transistors, the functionality of the proposed circuit is validated and characteristics for performance, power consumption, noise immunity, and silicon area are presented. This workpresents the following contributions to the field of CAM and ternary-valued logic:• We present a novel Simple Ternary Inverter (STI) transistor geometry scheme for achieving ternary-valued functionality in existing SOI-CMOS 0.18µm processes.• We present a novel Ternary Content Addressable Memory based on Three-Valued Logic (3CAM) as a single-storage-element CAM cell with “don’t care” functionality.• We explore the application of macro partitioning schemes to our proposed 3CAM array to observe the benefits and tradeoffs of architecture design in the context of power, delay, and area.
Resumo:
Multicasting is an efficient mechanism for one to many data dissemination. Unfortunately, IP Multicasting is not widely available to end-users today, but Application Layer Multicast (ALM), such as Content Addressable Network, helps to overcome this limitation. Our OM-QoS framework offers Quality of Service support for ALMs. We evaluated OM-QoS applied to CAN and show that we can guarantee that all multicast paths support certain QoS requirements.
Resumo:
High end network security applications demand high speed operation and large rule set support. Packet classification is the core functionality that demands high throughput in such applications. This paper proposes a packet classification architecture to meet such high throughput. We have implemented a Firewall with this architecture in reconflgurable hardware. We propose an extension to Distributed Crossproducting of Field Labels (DCFL) technique to achieve scalable and high performance architecture. The implemented Firewall takes advantage of inherent structure and redundancy of rule set by using our DCFL Extended (DCFLE) algorithm. The use of DCFLE algorithm results in both speed and area improvement when it is implemented in hardware. Although we restrict ourselves to standard 5-tuple matching, the architecture supports additional fields. High throughput classification invariably uses Ternary Content Addressable Memory (TCAM) for prefix matching, though TCAM fares poorly in terms of area and power efficiency. Use of TCAM for port range matching is expensive, as the range to prefix conversion results in large number of prefixes leading to storage inefficiency. Extended TCAM (ETCAM) is fast and the most storage efficient solution for range matching. We present for the first time a reconfigurable hardware implementation of ETCAM. We have implemented our Firewall as an embedded system on Virtex-II Pro FPGA based platform, running Linux with the packet classification in hardware. The Firewall was tested in real time with 1 Gbps Ethernet link and 128 sample rules. The packet classification hardware uses a quarter of logic resources and slightly over one third of memory resources of XC2VP30 FPGA. It achieves a maximum classification throughput of 50 million packet/s corresponding to 16 Gbps link rate for the worst case packet size. The Firewall rule update involves only memory re-initialization in software without any hardware change.
Resumo:
High end network security applications demand high speed operation and large rule set support. Packet classification is the core functionality that demands high throughput in such applications. This paper proposes a packet classification architecture to meet such high throughput. We have Implemented a Firewall with this architecture in reconfigurable hardware. We propose an extension to Distributed Crossproducting of Field Labels (DCFL) technique to achieve scalable and high performance architecture. The implemented Firewall takes advantage of inherent structure and redundancy of rule set by using, our DCFL Extended (DCFLE) algorithm. The use of DCFLE algorithm results In both speed and area Improvement when It is Implemented in hardware. Although we restrict ourselves to standard 5-tuple matching, the architecture supports additional fields.High throughput classification Invariably uses Ternary Content Addressable Memory (TCAM) for prefix matching, though TCAM fares poorly In terms of area and power efficiency. Use of TCAM for port range matching is expensive, as the range to prefix conversion results in large number of prefixes leading to storage inefficiency. Extended TCAM (ETCAM) is fast and the most storage efficient solution for range matching. We present for the first time a reconfigurable hardware Implementation of ETCAM. We have implemented our Firewall as an embedded system on Virtex-II Pro FPGA based platform, running Linux with the packet classification in hardware. The Firewall was tested in real time with 1 Gbps Ethernet link and 128 sample rules. The packet classification hardware uses a quarter of logic resources and slightly over one third of memory resources of XC2VP30 FPGA. It achieves a maximum classification throughput of 50 million packet/s corresponding to 16 Gbps link rate for file worst case packet size. The Firewall rule update Involves only memory re-initialiization in software without any hardware change.
Resumo:
A compact two-step modified-signed-digit arithmetic-logic array processor is proposed. When the reference digits are programmed, both addition and subtraction can be performed by the same binary logic operations regardless of the sign of the input digits. The optical implementation and experimental demonstration with an electron-trapping device are shown. Each digit is encoded by a single pixel, and no polarization is included. Any combinational logic can be easily performed without optoelectronic and electro-optic conversions of the intermediate results. The system is compact, general purpose, simple to align, and has a high signal-to-noise ratio. (C) 1999 Optical Society of America.
Resumo:
A novel, to our knowledge, two-step digit-set-restricted modified signed-digit (MSD) addition-subtraction algorithm is proposed. With the introduction of the reference digits, the operand words are mapped into an intermediate carry word with all digits restricted to the set {(1) over bar, 0} and an intermediate sum word with all digits restricted to the set {0, 1}, which can be summed to form the final result without carry generation. The operation can be performed in parallel by use of binary logic. An optical system that utilizes an electron-trapping device is suggested for accomplishing the required binary logic operations. By programming of the illumination of data arrays, any complex logic operations of multiple variables can be realized without additional temporal latency of the intermediate results. This technique has a high space-bandwidth product and signal-to-noise ratio. The main structure can be stacked to construct a compact optoelectronic MSD adder-subtracter. (C) 1999 Optical Society of America.
Resumo:
A novel optoelectronic quotient-selected modified signed-digit division technique is proposed. This division method generates one quotient digit per iteration involving only one shift operation, one quotient selection operation and one addition/subtraction operation. The quotient digit can be selected by observing three most significant digits of the partial remainder independent of the divisor. Two algorithms based on truth-table look-up and binary logic operations are derived. For optoelectronic implementation, an efficient shared content-addressable memory based architecture as well as compact logic array processor based architecture with an electron-trapping device is proposed. Performance evaluation of the proposed optoelectronic quotient-selected division shows that it is faster than the previously reported convergence division approach. Finally, proof-of-principle experimental results are presented to verify the effectiveness of the proposed technique. (C) 2001 Society of Photo-Optical Instrumentation Engineers.
Resumo:
Traditionally, the Internet provides only a “best-effort” service, treating all packets going to the same destination equally. However, providing differentiated services for different users based on their quality requirements is increasingly becoming a demanding issue. For this, routers need to have the capability to distinguish and isolate traffic belonging to different flows. This ability to determine the flow each packet belongs to is called packet classification. Technology vendors are reluctant to support algorithmic solutions for classification due to their nondeterministic performance. Although content addressable memories (CAMs) are favoured by technology vendors due to their deterministic high-lookup rates, they suffer from the problems of high-power consumption and high-silicon cost. This paper provides a new algorithmic-architectural solution for packet classification that mixes CAMs with algorithms based on multilevel cutting of the classification space into smaller spaces. The provided solution utilizes the geometrical distribution of rules in the classification space. It provides the deterministic performance of CAMs, support for dynamic updates, and added flexibility for system designers.