43 resultados para Interfaccia, integrata, CMOS


Relevância:

10.00% 10.00%

Publicador:

Resumo:

A high performance VLSI architecture to perform combined multiply-accumulate, divide, and square root operations is proposed. The circuit is highly regular, requires only minimal control, and can be pipelined right down to the bit level. The system can also be reconfigured on every cycle to perform one or more of these operations. The throughput rate for each operation is the same and is wordlength independent. This is achieved using redundant arithmetic. With current CMOS technology, throughput rates in excess of 80 million operations per second are expected.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper describes how worst-case error analysis can be applied to solve some of the practical issues in the development and implementation of a low power, high performance radix-4 FFT chip for digital video applications. The chip has been fabricated using a 0.6 µm CMOS technology and can perform a 64 point complex forward or inverse FFT on real-time video at up to 18 Megasamples per second. It comprises 0.5 million transistors in a die area of 7.8×8 mm and dissipates 1 W, leading to a cost-effective silicon solution for high quality video processing applications. The analysis focuses on the effect that different radix-4 architectural configurations and finite wordlengths has on the FFT output dynamic range. These issues are addressed using both mathematical error models and through extensive simulation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A new high performance, programmable image processing chip targeted at video and HDTV applications is described. This was initially developed for image small object recognition but has much broader functional application including 1D and 2D FIR filtering as well as neural network computation. The core of the circuit is made up of an array of twenty one multiplication-accumulation cells based on systolic architecture. Devices can be cascaded to increase the order of the filter both vertically and horizontally. The chip has been fabricated in a 0.6 µ, low power CMOS technology and operates on 10 bit input data at over 54 Megasamples per second. The introduction gives some background to the chip design and highlights that there are few other comparable devices. Section 2 gives a brief introduction to small object detection. The chip architecture and the chip design will be described in detail in the later sections.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Details of a new low power FFT processor for use in digital television applications are presented. This has been fabricated using a 0.6 µm CMOS technology and can perform a 64 point complex forward or inverse FFT on real-rime video at up to 18 Megasamples per second. It comprises 0.5 million transistors in a die area of 7.8×8 mm and dissipates 1 W. Its performance, in terms of computational rate per area per watt, is significantly higher than previously reported devices, leading to a cost-effective silicon solution for high quality video processing applications. This is the result of using a novel VLSI architecture which has been derived from a first principles factorisation of the DFT matrix and tailored to a direct silicon implementation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we present a design methodology for algorithm/architecture co-design of a voltage-scalable, process variation aware motion estimator based on significance driven computation. The fundamental premise of our approach lies in the fact that all computations are not equally significant in shaping the output response of video systems. We use a statistical technique to intelligently identify these significant/not-so-significant computations at the algorithmic level and subsequently change the underlying architecture such that the significant computations are computed in an error free manner under voltage over-scaling. Furthermore, our design includes an adaptive quality compensation (AQC) block which "tunes" the algorithm and architecture depending on the magnitude of voltage over-scaling and severity of process variations. Simulation results show average power savings of similar to 33% for the proposed architecture when compared to conventional implementation in the 90 nm CMOS technology. The maximum output quality loss in terms of Peak Signal to Noise Ratio (PSNR) was similar to 1 dB without incurring any throughput penalty.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The end of Dennard scaling has pushed power consumption into a first order concern for current systems, on par with performance. As a result, near-threshold voltage computing (NTVC) has been proposed as a potential means to tackle the limited cooling capacity of CMOS technology. Hardware operating in NTV consumes significantly less power, at the cost of lower frequency, and thus reduced performance, as well as increased error rates. In this paper, we investigate if a low-power systems-on-chip, consisting of ARM's asymmetric big.LITTLE technology, can be an alternative to conventional high performance multicore processors in terms of power/energy in an unreliable scenario. For our study, we use the Conjugate Gradient solver, an algorithm representative of the computations performed by a large range of scientific and engineering codes.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

As a post-CMOS technology, the incipient Quantum-dot Cellular Automata technology has various advantages. A key aspect which makes it highly desirable is low power dissipation. One method that is used to analyse power dissipation in QCA circuits is bit erasure analysis. This method has been applied to analyse previously proposed QCA binary adders. However, a number of improved QCA adders have been proposed more recently that have only been evaluated in terms of area and speed. As the three key performance metrics for QCA circuits are speed, area and power, in this paper, a bit erasure analysis of these adders will be presented to determine their power dissipation. The adders to be analysed are the Carry Flow Adder (CFA), Brent-Kung Adder (B-K), Ladner-Fischer Adder (L-F) and a more recently developed area-delay efficient adder. This research will allow for a more comprehensive comparison between the different QCA adder proposals. To the best of the authors' knowledge, this is the first time power dissipation analysis has been carried out on these adders.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Quantum-dot cellular automata (QCA) is potentially a very attractive alternative to CMOS for future digital designs. Circuit designs in QCA have been extensively studied. However, how to properly evaluate the QCA circuits has not been carefully considered. To date, metrics and area-delay cost functions directly mapped from CMOS technology have been used to compare QCA designs, which is inappropriate due to the differences between these two technologies. In this paper, several cost metrics specifically aimed at QCA circuits are studied. It is found that delay, the number of QCA logic gates, and the number and type of crossovers, are important metrics that should be considered when comparing QCA designs. A family of new cost functions for QCA circuits is proposed. As fundamental components in QCA computing arithmetic, QCA adders are reviewed and evaluated with the proposed cost functions. By taking the new cost metrics into account, previous best adders become unattractive and it has been shown that different optimization goals lead to different “best” adders.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The design of a high-performance IIR (infinite impulse response) digital filter is described. The chip architecture operates on 11-b parallel, two's complement input data with a 12-b parallel two's complement coefficient to produce a 14-b two's complement output. The chip is implemented in 1.5-µm, double-layer-metal CMOS technology, consumes 0.5 W, and can operate up to 15 Msample/s. The main component of the system is a fine-grained systolic array that internally is based on a signed binary number representation (SBNR). Issues addressed include testing, clock distribution, and circuitry for conversion between two's complement and SBNR.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Current variation aware design methodologies, tuned for worst-case scenarios, are becoming increasingly pessimistic from the perspective of power and performance. A good example of such pessimism is setting the refresh rate of DRAMs according to the worst-case access statistics, thereby resulting in very frequent refresh cycles, which are responsible for the majority of the standby power consumption of these memories. However, such a high refresh rate may not be required, either due to extremely low probability of the actual occurrence of such a worst-case, or due to the inherent error resilient nature of many applications that can tolerate a certain number of potential failures. In this paper, we exploit and quantify the possibilities that exist in dynamic memory design by shifting to the so-called approximate computing paradigm in order to save power and enhance yield at no cost. The statistical characteristics of the retention time in dynamic memories were revealed by studying a fabricated 2kb CMOS compatible embedded DRAM (eDRAM) memory array based on gain-cells. Measurements show that up to 73% of the retention power can be saved by altering the refresh time and setting it such that a small number of failures is allowed. We show that these savings can be further increased by utilizing known circuit techniques, such as body biasing, which can help, not only in extending, but also in preferably shaping the retention time distribution. Our approach is one of the first attempts to access the data integrity and energy tradeoffs achieved in eDRAMs for utilizing them in error resilient applications and can prove helpful in the anticipated shift to approximate computing.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Static timing analysis provides the basis for setting the clock period of a microprocessor core, based on its worst-case critical path. However, depending on the design, this critical path is not always excited and therefore dynamic timing margins exist that can theoretically be exploited for the benefit of better speed or lower power consumption (through voltage scaling). This paper introduces predictive instruction-based dynamic clock adjustment as a technique to trim dynamic timing margins in pipelined microprocessors. To this end, we exploit the different timing requirements for individual instructions during the dynamically varying program execution flow without the need for complex circuit-level measures to detect and correct timing violations. We provide a design flow to extract the dynamic timing information for the design using post-layout dynamic timing analysis and we integrate the results into a custom cycle-accurate simulator. This simulator allows annotation of individual instructions with their impact on timing (in each pipeline stage) and rapidly derives the overall code execution time for complex benchmarks. The design methodology is illustrated at the microarchitecture level, demonstrating the performance and power gains possible on a 6-stage OpenRISC in-order general purpose processor core in a 28nm CMOS technology. We show that employing instruction-dependent dynamic clock adjustment leads on average to an increase in operating speed by 38% or to a reduction in power consumption by 24%, compared to traditional synchronous clocking, which at all times has to respect the worst-case timing identified through static timing analysis.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The area and power consumption of low-density parity check (LDPC) decoders are typically dominated by embedded memories. To alleviate such high memory costs, this paper exploits the fact that all internal memories of a LDPC decoder are frequently updated with new data. These unique memory access statistics are taken advantage of by replacing all static standard-cell based memories (SCMs) of a prior-art LDPC decoder implementation by dynamic SCMs (D-SCMs), which are designed to retain data just long enough to guarantee reliable operation. The use of D-SCMs leads to a 44% reduction in silicon area of the LDPC decoder compared to the use of static SCMs. The low-power LDPC decoder architecture with refresh-free D-SCMs was implemented in a 90nm CMOS process, and silicon measurements show full functionality and an information bit throughput of up to 600 Mbps (as required by the IEEE 802.11n standard).

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Objective: To determine what, how, for whom, why, and in what circumstances educational interventions to improve the delivery of nutrition care by doctors and other healthcare professionals work?

Design: Realist synthesis following a published protocol and reported following Realist and Meta-narrative Evidence Synthesis: Evolving Standards (RAMESES) guidelines. A multidisciplinary team searched Medline, CINAHL, ERIC, EMBASE, PsyINFO, Sociological Abstracts, Web of Science, Google Scholar, and Science Direct for published and unpublished (grey) literature. The team identified studies with varied designs; appraised their ability to answer the review question; identified relationships between contexts, mechanisms, and outcomes (CMOs); and entered them into a spreadsheet configured for the purpose. The final synthesis identified commonalities across CMO configurations.

Results: Over half of the 46 studies from which we extracted data originated from the US. Interventions that improved the delivery of nutrition care improved skills and attitudes rather than just knowledge; provided opportunities for superiors to model nutrition care; removed barriers to nutrition care in health systems; provided participants with local, practically relevant tools and messages; and incorporated non-traditional, innovative teaching strategies. Operating in contexts where student and qualified healthcare professionals provided nutrition care in both developed and developing countries, these interventions yielded health outcomes by triggering a range of mechanisms, which included: feeling competent; feeling confident and comfortable; having greater self-efficacy; being less inhibited by barriers in healthcare systems; and feeling that nutrition care was accepted and recognised.

Conclusion: These findings show how important it is to move education for nutrition care beyond the simple acquisition of knowledge. They show how educational interventions embedded within systems of healthcare can improve patients’ health by helping health students and professionals to appreciate the importance of delivering nutrition care and feel competent to deliver it.