937 resultados para chip


Relevância:

10.00% 10.00%

Publicador:

Resumo:

The use of bit-level systolic array circuits as building blocks in the construction of larger word-level systolic systems is investigated. It is shown that the overall structure and detailed timing of such systems may be derived quite simply using the dependence graph and cut-set procedure developed by S. Y. Kung (1988). This provides an attractive and intuitive approach to the bit-level design of many VLSI signal processing components. The technique can be applied to ripple-through and partly pipelined circuits as well as fully systolic designs. It therefore provides a means of examining the relative tradeoff between levels of pipelining, chip area, power consumption, and throughput rate within a given VLSI design.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Methods are presented for developing synthesizable FFT cores. These are based on a modular approach in which parameterized commutator and processor blocks are cascaded to implement the computations required in many important FFT signal flow graphs. In addition, it is shown how the use of a digital serial data organization can be used to produce systems that offer 100% processor utilization along with reductions in storage requirements. The approach has been used to create generators for the automated synthesis of FFT cores that are portable across a broad range of silicon technologies. Resulting chip designs are competitive with ones created using manual methods but with significant reductions in design times.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Test procedures for a pipelined bit-parallel IIR filter chip which maximally exploit its regularity are described. It is shown that small modifications to the basic architecture result in significant reductions in the number of test patterns required to test such chips. The methods used allow 100% fault coverage to be achieved using less than 1000 test vectors for a chip which has 12 bit data and coefficients.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The fabrication and performance of the first bit-level systolic correlator array is described. The application of systolic array concepts at the bit level provides a simple and extremely powerful method for implementing high-performance digital processing functions. The resulting structure is highly regular, facilitating yield enhancement through fault-tolerant redundancy techniques and therefore ideally suited to implementation as a VLSI chip. The CMOS/SOS chip operates at 35 MHz, is fully cascadable and exhibits 64-stage correlation for 1-bit reference and 4-bit data. 7 refs.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A scheduling method for implementing a generic linear QR array processor architecture is presented. This improves on previous work. It also considerably simplifies the derivation of schedules for a folded linear system, where detailed account has to be taken of processor cell latency. The architecture and scheduling derived provide the basis of a generator for the rapid design of System-on-a-Chip (SoC) cores for QR decomposition.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents single-chip FPGA Rijndael algorithm implementations of the Advanced Encryption Standard (AES) algorithm, Rijndael. In particular, the designs utilise look-up tables to implement the entire Rijndael Round function. A comparison is provided between these designs and similar existing implementations. Hardware implementations of encryption algorithms prove much faster than equivalent software implementations and since there is a need to perform encryption on data in real time, speed is very important. In particular, Field Programmable Gate Arrays (FPGAs) are well suited to encryption implementations due to their flexibility and an architecture, which can be exploited to accommodate typical encryption transformations. In this paper, a Look-Up Table (LUT) methodology is introduced where complex and slow operations are replaced by simple LUTs. A LUT-based fully pipelined Rijndael implementation is described which has a pre-placement performance of 12 Gbits/sec, which is a factor 1.2 times faster than an alternative design in which look-up tables are utilised to implement only one of the Round function transformations, and 6 times faster than other previous single-chip implementations. Iterative Rijndael implementations based on the Look-Up-Table design approach are also discussed and prove faster than typical iterative implementations.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper, a new reconfigurable multi-standard architecture is introduced for integer-pixel motion estimation and a standard-cell based chip design study is presented. This has been designed to cover most of the common block-based video compression standards, including MPEG-2, MPEG-4, H.263, H.264, AVS and WMV-9. The architecture exhibits simpler control, high throughput and relative low hardware cost and highly competitive when compared with excising designs for specific video standards. It can also, through the use of control signals, be dynamically reconfigured at run-time to accommodate different system constraint such as the trade-off in power dissipation and video-quality. The computational rates achieved make the circuit suitable for high end video processing applications. Silicon design studies indicate that circuits based on this approach incur only a relatively small penalty in terms of power dissipation and silicon area when compared with implementations for specific standards.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Methods are presented for developing synthesizable FFT cores. These are based on a modular approach in which parameterizable blocks are cascaded to implement the computations required across a range of typical FFT signal flow graphs. The underlying architectural approach combines the use of a digital serial data organization with generic commutator blocks to produce systems that offer 100% processor utilization with storage requirements less than previous designs. The approach has been used to create generators for the automated synthesis of FFT cores that are portable across a broad range of silicon technologies. Resulting chip designs are competitive with manual methods but with significant reductions in design times.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper describes how worst-case error analysis can be applied to solve some of the practical issues in the development and implementation of a low power, high performance radix-4 FFT chip for digital video applications. The chip has been fabricated using a 0.6 µm CMOS technology and can perform a 64 point complex forward or inverse FFT on real-time video at up to 18 Megasamples per second. It comprises 0.5 million transistors in a die area of 7.8×8 mm and dissipates 1 W, leading to a cost-effective silicon solution for high quality video processing applications. The analysis focuses on the effect that different radix-4 architectural configurations and finite wordlengths has on the FFT output dynamic range. These issues are addressed using both mathematical error models and through extensive simulation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Architects use cycle-by-cycle simulation to evaluate design choices and understand tradeoffs and interactions among design parameters. Efficiently exploring exponential-size design spaces with many interacting parameters remains an open problem: the sheer number of experiments renders detailed simulation intractable. We attack this problem via an automated approach that builds accurate, confident predictive design-space models. We simulate sampled points, using the results to teach our models the function describing relationships among design parameters. The models produce highly accurate performance estimates for other points in the space, can be queried to predict performance impacts of architectural changes, and are very fast compared to simulation, enabling efficient discovery of tradeoffs among parameters in different regions. We validate our approach via sensitivity studies on memory hierarchy and CPU design spaces: our models generally predict IPC with only 1-2% error and reduce required simulation by two orders of magnitude. We also show the efficacy of our technique for exploring chip multiprocessor (CMP) design spaces: when trained on a 1% sample drawn from a CMP design space with 250K points and up to 55x performance swings among different system configurations, our models predict performance with only 4-5% error on average. Our approach combines with techniques to reduce time per simulation, achieving net time savings of three-four orders of magnitude. Copyright © 2006 ACM.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

NuGO, the European Nutrigenomics Organization, utilizes 31 powerful computers for, e.g., data storage and analysis. These so-called black boxes (NBXses) are located at the sites of different partners. NuGO decided to use GenePattern as the preferred genomic analysis tool on each NBX. To handle the custom made Affymetrix NuGO arrays, new NuGO modules are added to GenePattern. These NuGO modules execute the latest Bioconductor version ensuring up-to-date annotations and access to the latest scientific developments. The following GenePattern modules are provided by NuGO: NuGOArrayQualityAnalysis for comprehensive quality control, NuGOExpressionFileCreator for import and normalization of data, LimmaAnalysis for identification of differentially expressed genes, TopGoAnalysis for calculation of GO enrichment, and GetResultForGo for retrieval of information on genes associated with specific GO terms. All together, these NuGO modules allow comprehensive, up-to-date, and user friendly analysis of Affymetrix data. A special feature of the NuGO modules is that for analysis they allow the use of either the standard Affymetrix or the MBNI custom CDF-files, which remap probes based on current knowledge. In both cases a .chip-file is created to enable GSEA analysis. The NuGO GenePattern installations are distributed as binary Ubuntu (.deb) packages via the NuGO repository.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We have made a comparison of (a) different surface chemistries of surface plasmon resonance (SPR) sensor chips (such as carboxymethylated dextran and carboxymethylated C1) and (b) of different assay formats (direct, sandwich and subtractive immunoassay) in order to improve the sensitivity of the determination of the model bacteria Acidovorax avenae subsp. citrulli (Aac). The use of the carboxymethylated sensor chip C1 resulted in a better sensitivity than that of carboxymethylated dextran CM5 in all the assay formats. The direct assay format, in turn, exhibits the best sensitivity. Thus, the combination of a carboxymethylated sensor chip C1 with the direct assay format resulted in the highest sensitivity for Aac, with a limit of detection of 1.6x106 CFU mL-1. This SPR immunosensor was applied to the detection of Aac in watermelon leaf extracts spiked with the bacteria, and the lower LOD is 2.2x107 CFU mL-1.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

On multiprocessors with explicitly managed memory hierarchies (EMM), software has the responsibility of moving data in and out of fast local memories. This task can be complex and error-prone even for expert programmers. Before we can allow compilers to handle the complexity for us, we must identify the abstractions that are general enough to allow us to write applications with reasonable effort, yet speci?c enough to exploit the vast on-chip memory bandwidth of EMM multi-processors. To this end, we compare two programming models against hand-tuned codes on the STI Cell, paying attention to programmability and performance. The ?rst programming model, Sequoia, abstracts the memory hierarchy as private address spaces, each corresponding to a parallel task. The second, Cellgen, is a new framework which provides OpenMP-like semantics and the abstraction of a shared address spaces divided into private and shared data. We compare three applications programmed using these models against their hand-optimized counterparts in terms of abstractions, programming complexity, and performance.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A novel cost-effective and low-latency wormhole router for packet-switched NoC designs, tailored for FPGA, is presented. This has been designed to be scalable at system level to fully exploit the characteristics and constraints of FPGA based systems, rather than custom ASIC technology. A key feature is that it achieves a low packet propagation latency of only two cycles per hop including both router pipeline delay and link traversal delay - a significant enhancement over existing FPGA designs - whilst being very competitive in terms of performance and hardware complexity. It can also be configured in various network topologies including 1-D, 2-D, and 3-D. Detailed design-space exploration has been carried for a range of scaling parameters, with the results of various design trade-offs being presented and discussed. By taking advantage of abundant buildin reconfigurable logic and routing resources, we have been able to create a new scalable on-chip FPGA based router that exhibits high dimensionality and connectivity. The architecture proposed can be easily migrated across many FPGA families to provide flexible, robust and cost-effective NoC solutions suitable for the implementation of high-performance FPGA computing systems. © 2011 IEEE.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A binding protein displaying broad-spectrum cross-reactivity within the sulfonamide group was used in conjunction with a sulfonamide specific sensor chip and a surface plasmon resonance biosensor to develop a rapid broad spectrum screening assay for sulfonamides in porcine muscle. Results for 40 samples were available in just over 5 h after the completion of a simple sample preparation protocol. Twenty sulfonamide compounds were detected. Acetylated metabolites were not recognised by the binding protein. Limit of detection (mean-three times standard deviation value when n = 20) was calculated to be 16.9 ng g(-1) in tissue samples. Intra-assay precision (n = 10) was calculated at 4.3 %CV for a sample spiked at 50 ng g(-1) with sulfamethazine, 3.6 %CV for a sample spiked at 100 ng g(-1) with sulfamethazine, 7.2 %CV for a sample spiked at 50 ng g(-1) with sulfadiazine and 3.1 %CV for a sample spiked at 100 ng g-1 with sulfadiazine. Inter-assay precision (n = 3) was calculated at 9.7 %CV for a sample spiked at 50 ng g-1 with sulfamethazine, 3.8 %CV for a sample spiked at 100 ng g(-1) with sulfamethazine, 3.5 %CV for a sample spiked at 50 ng g(-1) with sulfadiazine and 2.8 %CV for a sample spiked at 100 ng g(-1) with sulfadiazine. (C) 2004 Elsevier B.V. All rights reserved.