176 results for Graphics hardware


Relevance:

20.00%

Publisher:

Abstract:

A Field Programmable Gate Array (FPGA) based hardware accelerator for multi-conductor parasitic capacitance extraction using the Method of Moments (MoM) is presented in this paper. Due to the prohibitive cost of solving the dense algebraic system formed by MoM, linear-complexity fast solver algorithms have been developed in the past to expedite the matrix-vector product computation in a Krylov subspace based iterative solver framework. However, as the number of conductors in a system increases, leading to a corresponding increase in the number of right-hand-side (RHS) vectors, the computational cost of multiple matrix-vector products presents a time bottleneck, especially for ill-conditioned system matrices. In this work, an FPGA-based hardware implementation is proposed to parallelize the iterative matrix solution for multiple RHS vectors in a low-rank compression based fast solver scheme. The method is applied to accelerate electrostatic parasitic capacitance extraction of multiple conductors in a Ball Grid Array (BGA) package. Speed-ups of up to 13x for dense matrix-vector products and 12x for QR-compressed matrix-vector products, relative to an equivalent software implementation on an Intel Core i5 processor, are achieved using a Virtex-6 XC6VLX240T FPGA on Xilinx's ML605 board.

Relevance:

10.00%

Publisher:

Abstract:

High-end network security applications demand high-speed operation and support for large rule sets. Packet classification is the core functionality that demands high throughput in such applications. This paper proposes a packet classification architecture to meet such high throughput. We have implemented a Firewall with this architecture in reconfigurable hardware. We propose an extension to the Distributed Crossproducting of Field Labels (DCFL) technique to achieve a scalable and high-performance architecture. The implemented Firewall takes advantage of the inherent structure and redundancy of the rule set by using our DCFL Extended (DCFLE) algorithm. The use of the DCFLE algorithm results in both speed and area improvement when it is implemented in hardware. Although we restrict ourselves to standard 5-tuple matching, the architecture supports additional fields. High-throughput classification invariably uses Ternary Content Addressable Memory (TCAM) for prefix matching, though TCAM fares poorly in terms of area and power efficiency. Use of TCAM for port range matching is expensive, as the range-to-prefix conversion results in a large number of prefixes, leading to storage inefficiency. Extended TCAM (ETCAM) is fast and the most storage-efficient solution for range matching. We present for the first time a reconfigurable hardware implementation of ETCAM. We have implemented our Firewall as an embedded system on a Virtex-II Pro FPGA based platform, running Linux with the packet classification in hardware. The Firewall was tested in real time with a 1 Gbps Ethernet link and 128 sample rules. The packet classification hardware uses a quarter of the logic resources and slightly over one third of the memory resources of the XC2VP30 FPGA. It achieves a maximum classification throughput of 50 million packets/s, corresponding to a 16 Gbps link rate for the worst-case packet size. The Firewall rule update involves only memory re-initialization in software without any hardware change.
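The storage cost of range-to-prefix conversion mentioned in the abstract above can be illustrated with a minimal sketch. It is not the paper's DCFLE or ETCAM design; the function name `range_to_prefixes` and the 16-bit port width are assumptions chosen for illustration. It splits an inclusive port range into the smallest set of binary prefixes, which is what a plain TCAM would have to store as separate entries.

```python
def range_to_prefixes(lo, hi, width=16):
    """Split the inclusive range [lo, hi] into a minimal list of binary
    prefixes, each returned as a (value, prefix_length) pair.

    Hypothetical helper for illustration; width=16 assumes TCP/UDP ports.
    """
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo.
        size = lo & -lo if lo else 1 << width
        # Shrink the block until it fits in what remains of the range.
        while size > hi - lo + 1:
            size >>= 1
        # A block of 2**k addresses corresponds to a prefix of width - k bits.
        prefixes.append((lo, width - size.bit_length() + 1))
        lo += size
    return prefixes


if __name__ == "__main__":
    print(len(range_to_prefixes(1024, 65535)))  # 6 prefixes
    print(len(range_to_prefixes(1, 65534)))     # 30 prefixes (worst case for 16 bits)
```

Under these assumptions, a single rule on the range 1 to 65534 expands to 30 TCAM entries, which shows why a scheme that stores ranges directly can be far more storage efficient.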

Relevance:

10.00%

Publisher:

Abstract:

High-end network security applications demand high-speed operation and support for large rule sets. Packet classification is the core functionality that demands high throughput in such applications. This paper proposes a packet classification architecture to meet such high throughput. We have implemented a Firewall with this architecture in reconfigurable hardware. We propose an extension to the Distributed Crossproducting of Field Labels (DCFL) technique to achieve a scalable and high-performance architecture. The implemented Firewall takes advantage of the inherent structure and redundancy of the rule set by using our DCFL Extended (DCFLE) algorithm. The use of the DCFLE algorithm results in both speed and area improvement when it is implemented in hardware. Although we restrict ourselves to standard 5-tuple matching, the architecture supports additional fields. High-throughput classification invariably uses Ternary Content Addressable Memory (TCAM) for prefix matching, though TCAM fares poorly in terms of area and power efficiency. Use of TCAM for port range matching is expensive, as the range-to-prefix conversion results in a large number of prefixes, leading to storage inefficiency. Extended TCAM (ETCAM) is fast and the most storage-efficient solution for range matching. We present for the first time a reconfigurable hardware implementation of ETCAM. We have implemented our Firewall as an embedded system on a Virtex-II Pro FPGA based platform, running Linux with the packet classification in hardware. The Firewall was tested in real time with a 1 Gbps Ethernet link and 128 sample rules. The packet classification hardware uses a quarter of the logic resources and slightly over one third of the memory resources of the XC2VP30 FPGA. It achieves a maximum classification throughput of 50 million packets/s, corresponding to a 16 Gbps link rate for the worst-case packet size. The Firewall rule update involves only memory re-initialization in software without any hardware change.

Relevance:

10.00%

Publisher:

Abstract:

This paper presents an overview of the issues in precisely defining, specifying and evaluating the dependability of software, particularly in the context of computer-controlled process systems. Dependability is intended to be a generic term embodying various quality factors and is useful for both software and hardware. While the developments in quality assurance and reliability theories have proceeded mostly in independent directions for hardware and software systems, we present here the case for developing a unified framework of dependability (a facet of the operational effectiveness of modern technological systems) and develop a hierarchical systems model that helps clarify this view. In the second half of the paper, we survey the models and methods available for measuring and improving software reliability. The nature of software "bugs", the failure history of the software system in the various phases of its lifecycle, the reliability growth in the development phase, estimation of the number of errors remaining in the operational phase, and the complexity of the debugging process have all been considered to varying degrees of detail. We also discuss the notion of software fault-tolerance, methods of achieving it, and the status of other measures of software dependability such as maintainability, availability and safety.

Relevance:

10.00%

Publisher:

Abstract:

Multiprocessor systems which afford a high degree of parallelism are used in a variety of applications. The extremely stringent reliability requirement has made the provision of fault-tolerance an important aspect in the design of such systems. This paper presents a review of the various approaches towards tolerating hardware faults in multiprocessor systems. It emphasizes the basic concepts of fault-tolerant design and the various problems to be taken care of by the designer. An in-depth survey of the various models, techniques and methods for fault diagnosis is given. Further, we consider the strategies for fault-tolerance in specialized multiprocessor architectures which have the ability of dynamic reconfiguration and are suited to VLSI implementation. An analysis of the state of the art is given which points out the major aspects of fault-tolerance in such architectures.

Relevance:

10.00%

Publisher:

Abstract:

A wheelchair is required for the mobility of disabled people. Wheelchairs fall into two categories: manual and powered. This paper deals with a series hybrid combination of a manual and a battery-powered wheelchair. The control scheme used is simpler than that of other hybrid wheelchairs and includes sensorless control of the speed. The battery assisted wheelchair (BAW) has fewer components in its hardware, and the effort made by the rider is reduced considerably. The control scheme also includes a dead man's switch feature. A speed loop is provided for smooth variation of the speed. The current limit is governed by peak current mode control.

Relevance:

10.00%

Publisher:

Abstract:

We report the design and characterization of a circuit technique to measure the on-chip delay of an individual logic gate (both inverting and noninverting) in its unmodified form. The test circuit comprises a digitally reconfigurable ring oscillator (RO). The gate under test is embedded in each stage of the ring oscillator. A system of linear equations is then formed with different configuration settings of the RO, relating the individual gate delays to the measured period of the RO; its solution gives the delay of the individual gates. Experimental results from a test chip in a 65-nm process node show the feasibility of measuring the delay of an individual inverter to within 1 ps accuracy. Delay measurements of different nominally identical inverters in close physical proximity show variations of up to 28%, indicating the large impact of local variations. As a demonstration of this technique, we have studied delay variation with poly-pitch, length of diffusion (LOD) and different orientations of layout in silicon. The proposed technique is well suited to early process characterization, monitoring of a mature process in manufacturing, and model-to-hardware correlation.
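The idea of recovering individual gate delays from several ring-oscillator periods can be sketched as a small linear solve. This is only an illustration under stated assumptions: the abstract does not give the actual configuration settings, so the configuration matrix below is hypothetical, and the period is assumed to equal twice the sum of the delays of the gates in the active path (equal rise and fall delays). The solver `solve_linear` is a plain Gauss-Jordan elimination written for this example.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("configurations are not linearly independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


# Hypothetical example: three gates, three RO configurations.
# Row k lists how strongly each gate delay contributes to period k
# (2 = gate is in the loop, 0 = gate is bypassed), an assumed model.
configurations = [
    [2, 2, 0],
    [0, 2, 2],
    [2, 0, 2],
]
measured_periods_ps = [44.0, 46.0, 42.0]  # made-up measurements

gate_delays_ps = solve_linear(configurations, measured_periods_ps)
print(gate_delays_ps)  # [10.0, 12.0, 11.0]
```

With more configurations than unknown delays, a least-squares fit would be the natural replacement for the exact solve, which would also average out measurement noise.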

Relevance:

10.00%

Publisher:

Abstract:

In this paper, the design and implementation of a single shared bus, shared memory multiprocessing system using Intel's single board computers is presented. The hardware configuration and the operating system developed to execute the parallel algorithms are discussed. The performance evaluation studies carried out on Image are outlined.

Relevance:

10.00%

Publisher:

Abstract:

Lateral or transaxial truncation of cone-beam data can occur either due to the field-of-view limitation of the scanning apparatus or in region-of-interest tomography. In this paper, we suggest two new methods to handle lateral truncation in helical scan CT. It is seen that reconstruction with laterally truncated projection data, assuming it to be complete, gives severe artifacts which even penetrate into the field of view. A row-by-row data completion approach using linear prediction is introduced for helical scan truncated data. An extension of this technique, known as the windowed linear prediction approach, is also introduced. The efficacy of the two techniques is shown using simulation with standard phantoms. A quantitative image quality measure of the resulting reconstructed images is used to evaluate the performance of the proposed methods against an extension of a standard existing technique.

Relevance:

10.00%

Publisher:

Abstract:

Unary operators are functions of a single variable. Realization of quaternary unary operators (QUOs) using quaternary multiplexers (QMUXs) is presented in this paper. QUOs are divided into eight groups on the basis of the number of changeovers in the output for an input sequence of 0, 1, 2, 3. This grouping reduces the hardware required to realize them. QMUXs with two, three, and four input lines are proposed for the realization of QUOs belonging to the eight groups. A systematic procedure for the selection of the QMUX and the implementation of the QUOs is given. The QMUXs are designed using CMOS ICs. The hardware required for their implementation is also discussed.

Relevance:

10.00%

Publisher:

Abstract:

Color displays used in image processing systems consist of a refresh memory buffer storing digital image data which are converted into analog signals to display an image by driving the primary color channels (red, green, and blue) of a color television monitor. The color cathode ray tube (CRT) of the monitor is unable to reproduce colors exactly due to phosphor limitations, exponential luminance response of the tube to the applied signal, and limitations imposed by the digital-to-analog conversion. In this paper we describe some computer simulation studies (using the U*V*W* color space) carried out to measure these reproduction errors. Further, a procedure to correct for color reproduction error due to the exponential luminance response (gamma) of the picture tube is proposed, using a video-lookup-table and a higher resolution digital-to-analog converter. It is found, on the basis of computer simulation studies, that the proposed gamma correction scheme is effective and robust with respect to variations in the assumed value of the gamma.
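A video lookup table of the kind described above can be sketched in a few lines. This is a minimal illustration, not the scheme evaluated in the paper: it assumes the tube's luminance follows a power law with exponent gamma, and the default values (gamma of 2.2, 8-bit image codes, a 10-bit digital-to-analog converter) and the name `build_gamma_lut` are assumptions made for the example.

```python
def build_gamma_lut(gamma=2.2, in_bits=8, out_bits=10):
    """Return a lookup table mapping each input code to a pre-distorted
    DAC code, so that displayed luminance is roughly linear in the input.

    Assumes luminance = (drive signal) ** gamma on a normalized 0..1 scale.
    """
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    return [
        round(out_max * (code / in_max) ** (1.0 / gamma))
        for code in range(in_max + 1)
    ]


lut = build_gamma_lut()
# The extra output bits keep neighbouring dark input codes from
# collapsing onto the same DAC code after the steep inverse-gamma curve.
print(lut[0], lut[255])  # 0 1023
```

Rebuilding the table with a slightly different `gamma` argument is one simple way to probe how sensitive the correction is to the assumed gamma value.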

Relevance:

10.00%

Publisher:

Abstract:

A microcomputer-minicomputer link, useful in the implementation of network configurations involving microcomputers and minicomputers, is described. The link between a PDP-11 minicomputer and an 8080 microcomputer runs over a serial line between the DZ11 module of the minicomputer and the UART interface of the microcomputer. The details of the essential hardware and software aspects of the link are presented.

Relevance:

10.00%

Publisher:

Abstract:

The hardware and the software details of a user-friendly, simple, flexible and inexpensive pulse programmer using programmable counters interfaced to a microprocessor are described. The control of the various parameters that are required for NMR applications is implemented using the microprocessor. The basic hardware is extendable to other applications which require programmable pulse trains.

Relevance:

10.00%

Publisher:

Abstract:

In this paper, three parallel polygon scan conversion algorithms have been proposed, and their performance when executed on a shared bus architecture has been compared. It has been shown that the parallel algorithm that does not use edge coherence performs better than those that use edge coherence. Further, a multiprocessing architecture has been proposed to execute the parallel polygon scan conversion algorithms more efficiently than a single shared bus architecture.

Relevance:

10.00%

Publisher:

Abstract:

The paper describes the application of the pipelining principle to the realization of an analogue-to-ternary converter. The circuit shows a considerable saving in hardware compared with an earlier proposed circuit. The main hardware components used are analogue comparators, subtractors and delay elements; hence this method of A/T conversion can operate at a higher sampling frequency.