Digital Library

43 results for Interfaccia, integrata, CMOS

VLSI processor for high-performance arithmetic computations

Relevance: 10.00%

Abstract:

A high performance VLSI architecture to perform combined multiply-accumulate, divide, and square root operations is proposed. The circuit is highly regular, requires only minimal control, and can be pipelined right down to the bit level. The system can also be reconfigured on every cycle to perform one or more of these operations. The throughput rate for each operation is the same and is wordlength independent. This is achieved using redundant arithmetic. With current CMOS technology, throughput rates in excess of 80 million operations per second are expected.

Error analysis of FFT architectures for digital video applications

Relevance: 10.00%

Abstract:

This paper describes how worst-case error analysis can be applied to solve some of the practical issues in the development and implementation of a low-power, high-performance radix-4 FFT chip for digital video applications. The chip has been fabricated using a 0.6 µm CMOS technology and can perform a 64-point complex forward or inverse FFT on real-time video at up to 18 Megasamples per second. It comprises 0.5 million transistors in a die area of 7.8×8 mm and dissipates 1 W, leading to a cost-effective silicon solution for high quality video processing applications. The analysis focuses on the effect that different radix-4 architectural configurations and finite wordlengths have on the FFT output dynamic range. These issues are addressed both through mathematical error models and through extensive simulation.
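
For readers unfamiliar with the radix-4 decomposition mentioned in this abstract, the following is a minimal textbook sketch of a recursive radix-4 decimation-in-time FFT, checked against a direct DFT on a 64-point input. It is an illustration only: the function names are hypothetical, it uses floating-point arithmetic rather than the finite wordlengths the paper analyses, and it does not represent the chip's architecture.

```python
import cmath


def dft_naive(x):
    """Direct O(n^2) DFT, used only as a reference for comparison."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    # Split into four interleaved subsequences and transform each one.
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    q = n // 4
    out = [0j] * n
    for k in range(q):
        # Multiply the r-th sub-transform by the twiddle factor W_n^(r*k).
        t = [subs[r][k] * cmath.exp(-2j * cmath.pi * r * k / n) for r in range(4)]
        # Radix-4 butterfly: a 4-point DFT of the twiddled values.
        out[k] = t[0] + t[1] + t[2] + t[3]
        out[k + q] = t[0] - 1j * t[1] - t[2] + 1j * t[3]
        out[k + 2 * q] = t[0] - t[1] + t[2] - t[3]
        out[k + 3 * q] = t[0] + 1j * t[1] - t[2] - 1j * t[3]
    return out


if __name__ == "__main__":
    samples = [complex(i % 7, -(i % 5)) for i in range(64)]  # arbitrary test data
    fast = fft_radix4(samples)
    slow = dft_naive(samples)
    print(max(abs(a - b) for a, b in zip(fast, slow)))
```

A 64-point transform needs three radix-4 stages (64 = 4^3), which is why the length suits this decomposition.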

Programmable image processing chip

Relevance: 10.00%

Abstract:

A new high-performance, programmable image processing chip targeted at video and HDTV applications is described. It was initially developed for small object recognition in images but has much broader functional application, including 1D and 2D FIR filtering as well as neural network computation. The core of the circuit is made up of an array of twenty-one multiplication-accumulation cells based on a systolic architecture. Devices can be cascaded to increase the order of the filter both vertically and horizontally. The chip has been fabricated in a 0.6 µm, low-power CMOS technology and operates on 10-bit input data at over 54 Megasamples per second. The introduction gives some background to the chip design and highlights that there are few other comparable devices. Section 2 gives a brief introduction to small object detection. The chip architecture and the chip design are described in detail in the later sections.
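
To make the multiply-accumulate view of FIR filtering concrete, here is a minimal sketch of a 1D direct-form FIR filter in which each tap performs one multiply-accumulate step. The function name is hypothetical, and the mapping of taps to the chip's twenty-one cells is not stated in the abstract, so none is assumed here.

```python
def fir_filter(samples, coeffs):
    """1D direct-form FIR: y[n] = sum over k of coeffs[k] * samples[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(coeffs):
            if n - k >= 0:  # samples before the start of the input count as zero
                acc += h * samples[n - k]  # one multiply-accumulate per tap
        out.append(acc)
    return out


if __name__ == "__main__":
    # An impulse at the input reproduces the coefficients at the output.
    print(fir_filter([1, 0, 0, 0], [1, 2, 3]))
```

In a systolic realisation the inner loop is unrolled in hardware, with one cell per tap and partial sums passed between neighbouring cells.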

New FFT architecture and chip design for motion compensation based on phase correlation

Relevance: 10.00%

Abstract:

Details of a new low-power FFT processor for use in digital television applications are presented. This has been fabricated using a 0.6 µm CMOS technology and can perform a 64-point complex forward or inverse FFT on real-time video at up to 18 Megasamples per second. It comprises 0.5 million transistors in a die area of 7.8×8 mm and dissipates 1 W. Its performance, in terms of computational rate per area per watt, is significantly higher than that of previously reported devices, leading to a cost-effective silicon solution for high quality video processing applications. This is the result of using a novel VLSI architecture which has been derived from a first principles factorisation of the DFT matrix and tailored to a direct silicon implementation.

Significance Driven Computation: A Voltage-Scalable, Variation-Aware, Quality-Tuning Motion Estimator

Relevance: 10.00%

Abstract:

In this paper we present a design methodology for algorithm/architecture co-design of a voltage-scalable, process-variation-aware motion estimator based on significance driven computation. The fundamental premise of our approach lies in the fact that not all computations are equally significant in shaping the output response of video systems. We use a statistical technique to intelligently identify these significant/not-so-significant computations at the algorithmic level and subsequently change the underlying architecture such that the significant computations are computed in an error-free manner under voltage over-scaling. Furthermore, our design includes an adaptive quality compensation (AQC) block which "tunes" the algorithm and architecture depending on the magnitude of voltage over-scaling and severity of process variations. Simulation results show average power savings of approximately 33% for the proposed architecture when compared to a conventional implementation in 90 nm CMOS technology. The maximum output quality loss in terms of Peak Signal to Noise Ratio (PSNR) was approximately 1 dB without incurring any throughput penalty.
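
Since the quality loss above is reported as PSNR, a short sketch of the standard PSNR definition may help in interpreting the figure. The function name and the default peak value of 255 (8-bit samples) are assumptions for illustration; the abstract does not state the sample depth used.

```python
import math


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    # Mean squared error between the reference and the degraded samples.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: no noise
    return 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    print(psnr([10, 20, 30, 40], [12, 19, 33, 40]))
```

Under this definition, a 1 dB drop in PSNR corresponds to an increase in mean squared error by a factor of about 1.26.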

Evaluating Asymmetric Multicore Systems-on-Chip using Iso-Metrics

Relevance: 10.00%

Abstract:

The end of Dennard scaling has made power consumption a first-order concern for current systems, on par with performance. As a result, near-threshold voltage computing (NTVC) has been proposed as a potential means to tackle the limited cooling capacity of CMOS technology. Hardware operating in NTV consumes significantly less power, at the cost of lower frequency, and thus reduced performance, as well as increased error rates. In this paper, we investigate whether a low-power system-on-chip based on ARM's asymmetric big.LITTLE technology can be an alternative to conventional high performance multicore processors in terms of power/energy in an unreliable scenario. For our study, we use the Conjugate Gradient solver, an algorithm representative of the computations performed by a large range of scientific and engineering codes.

Bit erasure analysis of binary adders in quantum-dot cellular automata

Relevance: 10.00%

Abstract:

As a post-CMOS technology, the incipient Quantum-dot Cellular Automata technology has various advantages. A key aspect which makes it highly desirable is low power dissipation. One method that is used to analyse power dissipation in QCA circuits is bit erasure analysis. This method has been applied to analyse previously proposed QCA binary adders. However, a number of improved QCA adders have been proposed more recently that have only been evaluated in terms of area and speed. As the three key performance metrics for QCA circuits are speed, area and power, in this paper, a bit erasure analysis of these adders will be presented to determine their power dissipation. The adders to be analysed are the Carry Flow Adder (CFA), Brent-Kung Adder (B-K), Ladner-Fischer Adder (L-F) and a more recently developed area-delay efficient adder. This research will allow for a more comprehensive comparison between the different QCA adder proposals. To the best of the authors' knowledge, this is the first time power dissipation analysis has been carried out on these adders.

A First Step Towards Cost Functions for Quantum-dot Cellular Automata Designs

Relevance: 10.00%

Abstract:

Quantum-dot cellular automata (QCA) is potentially a very attractive alternative to CMOS for future digital designs. Circuit designs in QCA have been extensively studied. However, how to properly evaluate the QCA circuits has not been carefully considered. To date, metrics and area-delay cost functions directly mapped from CMOS technology have been used to compare QCA designs, which is inappropriate due to the differences between these two technologies. In this paper, several cost metrics specifically aimed at QCA circuits are studied. It is found that delay, the number of QCA logic gates, and the number and type of crossovers, are important metrics that should be considered when comparing QCA designs. A family of new cost functions for QCA circuits is proposed. As fundamental components in QCA computing arithmetic, QCA adders are reviewed and evaluated with the proposed cost functions. By taking the new cost metrics into account, previous best adders become unattractive and it has been shown that different optimization goals lead to different “best” adders.

A high performance IIR digital filter chip

Relevance: 10.00%

Abstract:

The design of a high-performance IIR (infinite impulse response) digital filter is described. The chip architecture operates on 11-b parallel, two's complement input data with a 12-b parallel two's complement coefficient to produce a 14-b two's complement output. The chip is implemented in 1.5-µm, double-layer-metal CMOS technology, consumes 0.5 W, and can operate up to 15 Msample/s. The main component of the system is a fine-grained systolic array that internally is based on a signed binary number representation (SBNR). Issues addressed include testing, clock distribution, and circuitry for conversion between two's complement and SBNR.
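
As an illustration of what conversion between two's complement and a signed binary number representation involves, the sketch below uses digits from {-1, 0, 1} and relies on the fact that the most significant two's complement bit carries negative weight. The function names are hypothetical, and the chip's actual conversion circuitry is not described in the abstract.

```python
def twos_complement_to_sbnr(value, width):
    """Return SBNR digits (most significant first) for a signed integer.

    The two's complement bit pattern is reused directly: the top bit has
    weight -2**(width - 1), so it becomes the digit -1 when it is set.
    """
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise ValueError("value does not fit in the given width")
    # Python's >> on negative ints is arithmetic, so this yields the
    # two's complement bits, most significant first.
    bits = [(value >> i) & 1 for i in range(width - 1, -1, -1)]
    return [-bits[0]] + bits[1:]


def sbnr_to_int(digits):
    """Evaluate a signed-digit string (most significant first) as an integer."""
    total = 0
    for d in digits:
        total = 2 * total + d
    return total


if __name__ == "__main__":
    digits = twos_complement_to_sbnr(-3, 4)  # two's complement pattern 1101
    print(digits, sbnr_to_int(digits))
```

SBNR is redundant, so a value generally has several digit strings; the one produced here is simply the most direct mapping from the two's complement bits.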

Energy versus data integrity trade-offs in embedded high-density logic compatible dynamic memories

Relevance: 10.00%

Abstract:

Current variation-aware design methodologies, tuned for worst-case scenarios, are becoming increasingly pessimistic from the perspective of power and performance. A good example of such pessimism is setting the refresh rate of DRAMs according to the worst-case access statistics, thereby resulting in very frequent refresh cycles, which are responsible for the majority of the standby power consumption of these memories. However, such a high refresh rate may not be required, either due to the extremely low probability of the actual occurrence of such a worst case, or due to the inherent error-resilient nature of many applications that can tolerate a certain number of potential failures. In this paper, we exploit and quantify the possibilities that exist in dynamic memory design by shifting to the so-called approximate computing paradigm in order to save power and enhance yield at no cost. The statistical characteristics of the retention time in dynamic memories were revealed by studying a fabricated 2kb CMOS-compatible embedded DRAM (eDRAM) memory array based on gain-cells. Measurements show that up to 73% of the retention power can be saved by altering the refresh time and setting it such that a small number of failures is allowed. We show that these savings can be further increased by utilizing known circuit techniques, such as body biasing, which can help not only in extending, but also in favourably shaping the retention time distribution. Our approach is one of the first attempts to assess the data integrity and energy tradeoffs achieved in eDRAMs for utilizing them in error-resilient applications and can prove helpful in the anticipated shift to approximate computing.
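
The refresh trade-off described here can be sketched as choosing a refresh period from a set of per-cell retention times while tolerating a fixed number of failing cells. The sketch is hypothetical: the function names and sample data are invented for illustration, and it assumes that refresh power is inversely proportional to the refresh period, which the abstract does not state.

```python
def refresh_period(retention_times, allowed_failures):
    """Longest refresh period that lets at most `allowed_failures` cells fail.

    A cell is treated as failing when its retention time is shorter than
    the refresh period.
    """
    ordered = sorted(retention_times)
    if not 0 <= allowed_failures < len(ordered):
        raise ValueError("allowed_failures must be in [0, number of cells)")
    return ordered[allowed_failures]


def relative_power_saving(retention_times, allowed_failures):
    """Fraction of refresh power saved versus the worst-case setting.

    Assumes refresh power scales as 1 / refresh period (illustrative only).
    """
    worst_case = refresh_period(retention_times, 0)
    relaxed = refresh_period(retention_times, allowed_failures)
    return 1 - worst_case / relaxed


if __name__ == "__main__":
    times_us = [10, 50, 40, 45, 60]  # made-up retention times in microseconds
    print(refresh_period(times_us, 1), relative_power_saving(times_us, 1))
```

With one weak cell dominating the worst case, tolerating a single failure lengthens the refresh period considerably, which is the effect the measurements above quantify on real silicon.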

Exploiting dynamic timing margins in microprocessors for frequency-over-scaling with instruction-based clock adjustment

Relevance: 10.00%

Abstract:

Static timing analysis provides the basis for setting the clock period of a microprocessor core, based on its worst-case critical path. However, depending on the design, this critical path is not always excited and therefore dynamic timing margins exist that can theoretically be exploited for the benefit of better speed or lower power consumption (through voltage scaling). This paper introduces predictive instruction-based dynamic clock adjustment as a technique to trim dynamic timing margins in pipelined microprocessors. To this end, we exploit the different timing requirements for individual instructions during the dynamically varying program execution flow without the need for complex circuit-level measures to detect and correct timing violations. We provide a design flow to extract the dynamic timing information for the design using post-layout dynamic timing analysis and we integrate the results into a custom cycle-accurate simulator. This simulator allows annotation of individual instructions with their impact on timing (in each pipeline stage) and rapidly derives the overall code execution time for complex benchmarks. The design methodology is illustrated at the microarchitecture level, demonstrating the performance and power gains possible on a 6-stage OpenRISC in-order general purpose processor core in a 28nm CMOS technology. We show that employing instruction-dependent dynamic clock adjustment leads on average to an increase in operating speed by 38% or to a reduction in power consumption by 24%, compared to traditional synchronous clocking, which at all times has to respect the worst-case timing identified through static timing analysis.
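
The idea of instruction-dependent clock adjustment can be illustrated with a toy timing model that compares a fixed worst-case clock against a clock whose period in each cycle is set by the slowest instruction currently in the pipeline. Everything here is an assumption for illustration: the function name, the instruction delays, the single delay per instruction (the paper annotates timing per pipeline stage), and the ideal stall-free in-order pipeline.

```python
def execution_time(program, delay_ps, stages):
    """Return (static_time, dynamic_time) in picoseconds.

    static_time uses the worst-case period in every cycle; dynamic_time
    uses, per cycle, the largest delay among the instructions in flight.
    """
    worst = max(delay_ps.values())
    cycles = len(program) + stages - 1  # fill + drain of an ideal pipeline
    static_time = cycles * worst
    dynamic_time = 0
    for c in range(cycles):
        # Instruction i occupies stage c - i, so indices c-stages+1 .. c are in flight.
        in_flight = program[max(0, c - stages + 1): c + 1]
        dynamic_time += max(delay_ps[op] for op in in_flight)
    return static_time, dynamic_time


if __name__ == "__main__":
    delays = {"add": 700, "load": 900, "mul": 1000}  # hypothetical delays
    code = ["add", "add", "load", "add", "mul", "add", "add", "add"]
    static_t, dynamic_t = execution_time(code, delays, stages=6)
    print(static_t, dynamic_t)
```

The gain in this model depends entirely on how often slow instructions are in flight, which is why the paper extracts real per-instruction timing from post-layout analysis rather than assuming it.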

Refresh-free dynamic standard-cell based memories: Application to a QC-LDPC decoder

Relevance: 10.00%

Abstract:

The area and power consumption of low-density parity check (LDPC) decoders are typically dominated by embedded memories. To alleviate such high memory costs, this paper exploits the fact that all internal memories of an LDPC decoder are frequently updated with new data. These unique memory access statistics are taken advantage of by replacing all static standard-cell based memories (SCMs) of a prior-art LDPC decoder implementation by dynamic SCMs (D-SCMs), which are designed to retain data just long enough to guarantee reliable operation. The use of D-SCMs leads to a 44% reduction in silicon area of the LDPC decoder compared to the use of static SCMs. The low-power LDPC decoder architecture with refresh-free D-SCMs was implemented in a 90 nm CMOS process, and silicon measurements show full functionality and an information bit throughput of up to 600 Mbps (as required by the IEEE 802.11n standard).

A realist synthesis of educational interventions to improve nutrition care competencies and delivery by doctors and other healthcare professionals

Relevance: 10.00%

Abstract:

Objective: To determine which educational interventions to improve the delivery of nutrition care by doctors and other healthcare professionals work, how, for whom, why, and in what circumstances.

Design: Realist synthesis following a published protocol and reported following Realist and Meta-narrative Evidence Synthesis: Evolving Standards (RAMESES) guidelines. A multidisciplinary team searched Medline, CINAHL, ERIC, EMBASE, PsycINFO, Sociological Abstracts, Web of Science, Google Scholar, and Science Direct for published and unpublished (grey) literature. The team identified studies with varied designs; appraised their ability to answer the review question; identified relationships between contexts, mechanisms, and outcomes (CMOs); and entered them into a spreadsheet configured for the purpose. The final synthesis identified commonalities across CMO configurations.

Results: Over half of the 46 studies from which we extracted data originated from the US. Interventions that improved the delivery of nutrition care improved skills and attitudes rather than just knowledge; provided opportunities for superiors to model nutrition care; removed barriers to nutrition care in health systems; provided participants with local, practically relevant tools and messages; and incorporated non-traditional, innovative teaching strategies. Operating in contexts where student and qualified healthcare professionals provided nutrition care in both developed and developing countries, these interventions yielded health outcomes by triggering a range of mechanisms, which included: feeling competent; feeling confident and comfortable; having greater self-efficacy; being less inhibited by barriers in healthcare systems; and feeling that nutrition care was accepted and recognised.

Conclusion: These findings show how important it is to move education for nutrition care beyond the simple acquisition of knowledge. They show how educational interventions embedded within systems of healthcare can improve patients’ health by helping health students and professionals to appreciate the importance of delivering nutrition care and feel competent to deliver it.
