93 resultados para Low-power Design


Relevância:

90.00% 90.00%

Publicador:

Resumo:

A major concern of embedded system architects is the design for low power. We address one aspect of the problem in this paper, namely the effect of executable code compression. There are two benefits of code compression – firstly, a reduction in the memory footprint of embedded software, and secondly, potential reduction in memory bus traffic and power consumption. Since decompression has to be performed at run time it is achieved by hardware. We describe a tool called COMPASS which can evaluate a range of strategies for any given set of benchmarks and display compression ratios. Also, given an execution trace, it can compute the effect on bus toggles, and cache misses for a range of compression strategies. The tool is interactive and allows the user to vary a set of parameters, and observe their effect on performance. We describe an implementation of the tool and demonstrate its effectiveness. To the best of our knowledge this is the first tool proposed for such a purpose.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Frequency multiplication (FM) can be used to design low power frequency synthesizers. This is achieved by running the VCO at a much reduced frequency, while employing a power efficient frequency multiplier, and also thereby eliminating the first few dividers. Quadrature signals can be generated by frequency- multiplying low frequency I/Q signals, however this also multiplies the quadrature error of these signals. Another way is generating additional edges from the low-frequency oscillator (LFO) and develop a quadrature FM. This makes the I-Q precision heavily dependent on process mismatches in the ring oscillator. In this paper we examine the use of fewer edges from LFO and a single stage polyphase filter to generate approximate quadrature signals, which is then followed by an injection-locked quadrature VCO to generate high- precision I/Q signals. Simulation comparisons with the existing approach shows that the proposed method offers very good phase accuracy of 0.5deg with only a modest increase in power dissipation for 2.4 GHz IEEE 802.15.4 standard using UMC 0.13 mum RFCMOS technology.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

REDEFINE is a reconfigurable SoC architecture that provides a unique platform for high performance and low power computing by exploiting the synergistic interaction between coarse grain dynamic dataflow model of computation (to expose abundant parallelism in applications) and runtime composition of efficient compute structures (on the reconfigurable computation resources). We propose and study the throttling of execution in REDEFINE to maximize the architecture efficiency. A feature specific fast hybrid (mixed level) simulation framework for early in design phase study is developed and implemented to make the huge design space exploration practical. We do performance modeling in terms of selection of important performance criteria, ranking of the explored throttling schemes and investigate effectiveness of the design space exploration using statistical hypothesis testing. We find throttling schemes which give appreciable (24.8%) overall performance gain in the architecture and 37% resource usage gain in the throttling unit simultaneously.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Conventional Random access scan (RAS) for testing has lower test application time, low power dissipation, and low test data volume compared to standard serial scan chain based design In this paper, we present two cluster based techniques, namely, Serial Input Random Access Scan and Variable Word Length Random Access Scan to reduce test application time even further by exploiting the parallelism among the clusters and performing write operations on multiple bits Experimental results on benchmarks circuits show on an average 2-3 times speed up in test write time and average 60% reduction in write test data volume

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Biomedical engineering solutions like surgical simulators need High Performance Computing (HPC) to achieve real-time performance. Graphics Processing Units (GPUs) offer HPC capabilities at low cost and low power consumption. In this work, it is demonstrated that a liver which is discretized by about 2500 finite element nodes, can be graphically simulated in realtime, by making use of a GPU. Present work takes into consideration the time needed for the data transfer from CPU to GPU and back from GPU to CPU. Although behaviour of liver is very complicated, present computer simulation assumes linear elastostatics. One needs to use the commercial software ANSYS to obtain the global stiffness matrix of the liver. Results show that GPUs are useful for the real-time graphical simulation of liver, which in turn is needed in simulators that are used for training surgeons in laparoscopic surgery. Although the computer simulation should involve rendering also, neither rendering, nor the time needed for rendering and displaying the liver on a screen, is considered in the present work. The present work is just a demonstration of a concept; the concept is not really implemented and validated. Future work is to develop software which can accomplish real-time and very realistic graphical simulation of liver, with rendered image of liver on the screen changing in real-time according to the position of the surgical tool tip approximated as the mouse cursor in 3D.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Chronic recording of neural signals is indispensable in designing efficient brain–machine interfaces and to elucidate human neurophysiology. The advent of multichannel micro-electrode arrays has driven the need for electronics to record neural signals from many neurons. The dynamic range of the system can vary over time due to change in electrode–neuron distance and background noise. We propose a neural amplifier in UMC 130 nm, 1P8M complementary metal–oxide–semiconductor (CMOS) technology. It can be biased adaptively from 200 nA to 2 $mu{rm A}$, modulating input referred noise from 9.92 $mu{rm V}$ to 3.9 $mu{rm V}$. We also describe a low noise design technique which minimizes the noise contribution of the load circuitry. Optimum sizing of the input transistors minimizes the accentuation of the input referred noise of the amplifier and obviates the need of large input capacitance. The amplifier achieves a noise efficiency factor of 2.58. The amplifier can pass signal from 5 Hz to 7 kHz and the bandwidth of the amplifier can be tuned for rejecting low field potentials (LFP) and power line interference. The amplifier achieves a mid-band voltage gain of 37 dB. In vitro experiments are performed to validate the applicability of the neural low noise amplifier in neural recording systems.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

High voltage power supplies for radar applications are investigated which are subjected to pulsed load with stringent specifications. In the proposed solution, power conversion is done in two stages. A low power-high frequency converter modulates the input voltage of a high power-low frequency converter. This method satisfies all the performance specifications and takes care of the critical aspects of HV transformer.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Instruction reuse is a microarchitectural technique that improves the execution time of a program by removing redundant computations at run-time. Although this is the job of an optimizing compiler, they do not succeed many a time due to limited knowledge of run-time data. In this paper we examine instruction reuse of integer ALU and load instructions in network processing applications. Specifically, this paper attempts to answer the following questions: (1) How much of instruction reuse is inherent in network processing applications?, (2) Can reuse be improved by reducing interference in the reuse buffer?, (3) What characteristics of network applications can be exploited to improve reuse?, and (4) What is the effect of reuse on resource contention and memory accesses? We propose an aggregation scheme that combines the high-level concept of network traffic i.e. "flows" with a low level microarchitectural feature of programs i.e. repetition of instructions and data along with an architecture that exploits temporal locality in incoming packet data to improve reuse. We find that for the benchmarks considered, 1% to 50% of instructions are reused while the speedup achieved varies between 1% and 24%. As a side effect, instruction reuse reduces memory traffic and can therefore be considered as a scheme for low power.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Clustered architecture processors are preferred for embedded systems because centralized register file architectures scale poorly in terms of clock rate, chip area, and power consumption. Although clustering helps by improving the clock speed, reducing the energy consumption of the logic, and making the design simpler, it introduces extra overheads by way of inter-cluster communication. This communication happens over long global wires having high load capacitance which leads to delay in execution and significantly high energy consumption. Inter-cluster communication also introduces many short idle cycles, thereby significantly increasing the overall leakage energy consumption in the functional units. The trend towards miniaturization of devices (and associated reduction in threshold voltage) makes energy consumption in interconnects and functional units even worse, and limits the usability of clustered architectures in smaller technologies. However, technological advancements now permit the design of interconnects and functional units with varying performance and power modes. In this paper, we propose scheduling algorithms that aggregate the scheduling slack of instructions and communication slack of data values to exploit the low-power modes of functional units and interconnects. Finally, we present a synergistic combination of these algorithms that simultaneously saves energy in functional units and interconnects to improves the usability of clustered architectures by achieving better overall energy-performance trade-offs. Even with conservative estimates of the contribution of the functional units and interconnects to the overall processor energy consumption, the proposed combined scheme obtains on average 8% and 10% improvement in overall energy-delay product with 3.5% and 2% performance degradation for a 2-clustered and a 4-clustered machine, respectively. We present a detailed experimental evaluation of the proposed schemes. Our test bed uses the Trimaran compiler infrastructure. (C) 2012 Elsevier Inc. All rights reserved.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A circuit topology based on accumulate-and-use philosophy has been developed to harvest RF energy from ambient radiations such as those from cellular towers. Main functional units of this system are antenna, tuned rectifier, supercapacitor, a gated boost converter and the necessary power management circuits. Various RF aspects of the design philosophy for maximizing the conversion efficiency at an input power level of 15 mu W are presented here. The system is characterized in an anechoic chamber and it has been established that this topology can harvest RF power densities as low as 180 mu W/m(2) and can adaptively operate the load depending on the incident radiation levels. The output of this system can be easily configured at a desired voltage in the range 2.2-4.5 V. A practical CMOS load - a low power wireless radio module has been demonstrated to operate intermittently by this approach. This topology can be easily modified for driving other practical loads, from harvested RF energy at different frequencies and power levels.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This study reports the development and performance evaluation of prototypes of biogas-fuelled stationary power generators in the range of 1 kW. Strategies to achieve high engine efficiency namely pulsed manifold injection, electronic throttle control and dual spark plugs, have been incorporated in the prototype. A complete closed-loop control of the engine operation to maintain a steady engine speed of 3000 rpm (+/- 5%) across the entire load range while maintaining an optimum fuel-air equivalence ratio is made possible by an electronic control unit (ECU) controlling the injection duration, ignition timing and throttle position. This study specifically focuses on the response of the generator to transient loads, and the overall efficiency obtained. The results obtained from testing the prototype have been found to be satisfactory and show that biogas power generators for low power applications can be made efficient (overall efficiency of 19% at electrical load of 640 W) using the strategies of biogas fuel injection.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Body Area Network, a new wireless networking paradigm, promises to revolutionize the healthcare applications. A number of tiny sensor nodes are strategically placed in and around the human body to obtain physiological information. The sensor nodes are connected to a coordinator or a data collector to form a Body Area Network. The tiny devices may sense physiological parameters of emergency in nature (e.g. abnormality in heart bit rate, increase of glucose level above the threshold etc.) that needs immediate attention of a physician. Due to ultra low power requirement of wireless body area network, most of the time, the coordinator and devices are expected to be in the dormant mode, categorically when network is not operational. This leads to an open question, how to handle and meet the QoS requirement of emergency data when network is not operational? Emergency handling becomes more challenging at the MAC layer, if the channel access related information is unknown to the device with emergency message. The aforementioned scenarios are very likely scenarios in a MICS (Medical Implant Communication Service, 402-405 MHz) based healthcare systems. This paper proposes a mechanism for timely and reliable transfer of emergency data in a MICS based Body Area Network. We validate our protocol design with simulation in a C++ framework. Our simulation results show that more than 99 p ercentage of the time emergency messages are reached at the coordinator with a delay of 400ms.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

High sensitivity gas sensors are typically realized using metal catalysts and nanostructured materials, utilizing non-conventional synthesis and processing techniques, incompatible with on-chip integration of sensor arrays. In this work, we report a new device architecture, suspended core-shell Pt-PtOx nanostructure that is fully CMOS-compatible. The device consists of a metal gate core, embedded within a partially suspended semiconductor shell with source and drain contacts in the anchored region. The reduced work function in suspended region, coupled with builtin electric field of metal-semiconductor junction, enables the modulation of drain current, due to room temperature Redox reactions on exposure to gas. The device architecture is validated using Pt-PtO2 suspended nanostructure for sensing H-2 down to 200 ppb under room temperature. By exploiting catalytic activity of PtO2, in conjunction with its p-type semiconducting behavior, we demonstrate about two orders of magnitude improvement in sensitivity and limit of detection, compared to the sensors reported in recent literature. Pt thin film, deposited on SiO2, is lithographically patterned and converted into suspended Pt-PtO2 sensor, in a single step isotropic SiO2 etching. An optimum design space for the sensor is elucidated with the initial Pt film thickness ranging between 10 nm and 30 nm, for low power (< 5 mu W), room temperature operation. (C) 2015 AIP Publishing LLC.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper, we study breakdown characteristics in shallow-trench isolation (STI)-type drain-extended MOSFETs (DeMOS) fabricated using a low-power 65-nm triple-well CMOS process with a thin gate oxide. Experimental data of p-type STI-DeMOS device showed distinct two-stage behavior in breakdown characteristics in both OFF-and ON-states, unlike the n-type device, causing a reduction in the breakdown voltage and safe operating area. The first-stage breakdown occurs due to punchthrough in the vertical structure formed by p-well, deep n-well, and p-substrate, whereas the second-stage breakdown occurs due to avalanche breakdown of lateral n-well/p-well junction. The breakdown characteristics are also compared with the STI-DeNMOS device structure. Using the experimental results and advanced TCAD simulations, a complete understanding of breakdown mechanisms is provided in this paper for STI-DeMOS devices in advanced CMOS processes.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Clock synchronization is highly desirable in distributed systems, including many applications in the Internet of Things and Humans. It improves the efficiency, modularity, and scalability of the system, and optimizes use of event triggers. For IoTH, BLE - a subset of the recent Bluetooth v4.0 stack - provides a low-power and loosely coupled mechanism for sensor data collection with ubiquitous units (e.g., smartphones and tablets) carried by humans. This fundamental design paradigm of BLE is enabled by a range of broadcast advertising modes. While its operational benefits are numerous, the lack of a common time reference in the broadcast mode of BLE has been a fundamental limitation. This article presents and describes CheepSync, a time synchronization service for BLE advertisers, especially tailored for applications requiring high time precision on resource constrained BLE platforms. Designed on top of the existing Bluetooth v4.0 standard, the CheepSync framework utilizes low-level time-stamping and comprehensive error compensation mechanisms for overcoming uncertainties in message transmission, clock drift, and other system-specific constraints. CheepSync was implemented on custom designed nRF24Cheep beacon platforms (as broadcasters) and commercial off-the-shelf Android ported smartphones (as passive listeners). We demonstrate the efficacy of CheepSync by numerous empirical evaluations in a variety of experimental setups, and show that its average (single-hop) time synchronization accuracy is in the 10 mu s range.