113 results for chip
in Indian Institute of Science - Bangalore - India
Abstract:
We present a technique for an all-digital on-chip delay measurement system that measures the skews in a clock distribution network. It uses the principle of sub-sampling. Measurements from a prototype fabricated in a 65 nm industrial process indicate the ability to measure delays with a resolution of 0.5 ps and a differential nonlinearity (DNL) of 1.2 ps.
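The sub-sampling principle can be illustrated with a small, idealized simulation. This is only a sketch under stated assumptions, not the authors' circuit: two clocks of equal period are sampled by a slightly slower sampling clock, and the delay between them is recovered by counting samples between the rising edges of the two resulting beat signals. The function names, the 1 ns clock period, and the choice of a 0.5 ps period offset are all hypothetical, and jitter and noise are ignored. Times are integers in femtoseconds to avoid floating-point rounding.

```python
def clock_level(t_fs, period_fs, delay_fs=0):
    """Level (0 or 1) of an ideal 50% duty-cycle clock at time t_fs."""
    phase = (t_fs - delay_fs) % period_fs  # Python's % is non-negative here
    return 1 if phase < period_fs // 2 else 0


def first_rising_edge(samples):
    """Index of the first 0 -> 1 transition in a list of samples."""
    for i in range(1, len(samples)):
        if samples[i - 1] == 0 and samples[i] == 1:
            return i
    raise ValueError("no rising edge found")


def measure_delay_fs(period_fs, true_delay_fs, step_fs):
    """Estimate the delay between two clocks by sub-sampling.

    The sampling period is period_fs + step_fs (an assumption of this
    sketch), so each sample slides by step_fs relative to the clocks.
    The estimate is quantized to multiples of step_fs.
    """
    if period_fs % step_fs != 0:
        raise ValueError("this sketch assumes step_fs divides period_fs")
    beat_len = period_fs // step_fs          # samples per beat period
    sample_period = period_fs + step_fs
    times = [k * sample_period for k in range(2 * beat_len + 1)]

    ref = [clock_level(t, period_fs) for t in times]
    dly = [clock_level(t, period_fs, true_delay_fs) for t in times]

    # Edge distance in samples, wrapped into one beat period.
    n = (first_rising_edge(dly) - first_rising_edge(ref)) % beat_len
    return n * step_fs


# 1 ns clocks, 37.3 ps true delay, 0.5 ps sampling offset (all illustrative).
estimate = measure_delay_fs(1_000_000, 37_300, 500)
print(estimate / 1000, "ps")
```

In this idealized model the estimate differs from the true delay by less than one step, which is why the period offset sets the measurement resolution.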
Abstract:
Chips were produced by orthogonal cutting of a cast pure magnesium billet on a lathe with three different tool rake angles: -15°, -5° and +15°. The chips were consolidated by a solid-state recycling technique involving cold compaction followed by hot extrusion. The extruded products were characterized for microstructure and mechanical properties. Chip-consolidated products from -15° rake angle tools showed a 19% increase in tensile strength, a 60% reduction in grain size and a 12% increase in hardness compared to the +15° rake chip-consolidated product, indicating better chip bonding and grain refinement. The microstructure of the fracture specimen supports this finding. Overall, the present work highlights the importance of tool rake angle in determining the quality of chip-consolidated products. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
A large part of today's multi-core chips is interconnect. Increasing communication complexity has made new interconnect strategies, such as Network on Chip, essential. Power dissipation in interconnects has become a substantial part of the total power dissipation, so techniques to reduce interconnect power have become a necessity. In this paper, we present a design methodology that gives the bus width of interconnect links and the operating frequency of routers in a Network on Chip that satisfy the required throughput while dissipating minimal switching power. We develop closed-form analytical expressions for the power dissipation, with bus width and frequency as variables, and then use the Lagrange multiplier method to arrive at the optimal values. We present a 4-port router in a 90 nm technology library as a case study and discuss the results obtained from the analysis.
Abstract:
RECONNECT is a Network-on-Chip using a honeycomb topology. In this paper we focus on the properties of general rules applicable to a variety of routing algorithms for the NoC, which take into account the links that the honeycomb topology lacks when compared to a mesh. We also extend the original proposal [5] and show a method to insert data into and extract data from the network. Access Routers at the boundary of the execution fabric establish connections to multiple periphery modules and create a torus to decrease the node distances. Our approach is scalable and ensures homogeneity among the compute elements in the NoC. We synthesized and evaluated the proposed enhancement in terms of power dissipation and area. Our results indicate that the impact of the necessary alterations to the fabric is negligible and affects the data transfer between the fabric and the periphery only marginally.
Abstract:
Relentless CMOS scaling coupled with lower design tolerances is making ICs increasingly susceptible to wear-out related permanent faults and transient faults, necessitating on-chip fault tolerance in future chip multiprocessors (CMPs). In this paper we introduce a new energy-efficient fault-tolerant CMP architecture known as Redundant Execution using Critical Value Forwarding (RECVF). RECVF is based on two observations: (i) forwarding critical instruction results from the leading core to the trailing core enables the latter to execute faster, and (ii) this speedup can be exploited to reduce energy consumption by operating the trailing core at a lower voltage-frequency level. Our evaluation shows that RECVF consumes 37% less energy than conventional dual modular redundant (DMR) execution of a program. It consumes only 1.26 times the energy of a non-fault-tolerant baseline and has a performance overhead of just 1.2%.
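As a quick consistency check of the two energy figures quoted above, the short calculation below derives the implied DMR energy relative to the baseline. It assumes both percentages are averages over the same workloads and refer to the same non-fault-tolerant baseline; the symbols $E_{\text{base}}$, $E_{\text{DMR}}$ and $E_{\text{RECVF}}$ are introduced here for illustration only.

```latex
\begin{align*}
E_{\text{RECVF}} &= 1.26\,E_{\text{base}}, \\
E_{\text{RECVF}} &= (1 - 0.37)\,E_{\text{DMR}} = 0.63\,E_{\text{DMR}}, \\
E_{\text{DMR}} &= \frac{1.26}{0.63}\,E_{\text{base}} = 2.0\,E_{\text{base}}.
\end{align*}
```

Under that assumption, the implied DMR cost of about twice the baseline energy matches what one would expect from executing the program on two cores.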
Abstract:
A microchip thermocycler, fabricated from silicon and Pyrex #7740 glass, is described. The usual resistive heating has been replaced by induction heating, leading to much simpler fabrication steps. Heating and cooling rates of 6.5 and 4.2 °C/s, respectively, have been achieved by optimising the heater dimensions and heating frequency (~200 kHz). Four devices are mounted on a heater, resulting in low power consumption (~1.4 W per device on average). Using simple on-off electronic temperature control, a temperature stability within ±0.2 °C is achieved. Features such as induction heating, good temperature control, battery operation, and low power consumption make the device suitable for portable applications, particularly in polymerase chain reaction (PCR) systems. (C) 2002 Elsevier Science B.V. All rights reserved.
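The on-off control scheme mentioned above can be sketched with a toy simulation. This is a minimal illustration, not the device's actual controller: it assumes constant heating and cooling rates equal to the 6.5 and 4.2 °C/s quoted in the abstract, a hypothetical 10 ms control interval, and an illustrative 94 °C setpoint (a typical PCR denaturation temperature). Real cooling rates depend on temperature, and sensor lag is ignored.

```python
HEAT_RATE = 6.5  # °C/s, heating rate quoted in the abstract
COOL_RATE = 4.2  # °C/s, cooling rate quoted in the abstract


def simulate_on_off(setpoint, start_temp, duration_s, dt=0.01):
    """Simulate simple on-off (bang-bang) temperature control.

    dt is the control interval in seconds (an assumption of this sketch).
    Returns the list of temperatures after each control step.
    """
    temp = start_temp
    trace = []
    for _ in range(int(round(duration_s / dt))):
        heater_on = temp < setpoint          # on below setpoint, off otherwise
        rate = HEAT_RATE if heater_on else -COOL_RATE
        temp += rate * dt
        trace.append(temp)
    return trace


trace = simulate_on_off(setpoint=94.0, start_temp=25.0, duration_s=30.0)
settled = trace[-500:]                       # last 5 s, well after settling
ripple = max(abs(t - 94.0) for t in settled)
print(f"worst-case deviation after settling: {ripple:.3f} °C")
```

With these assumed numbers the ripple is bounded by the temperature change in one control interval, so a faster control loop gives tighter stability.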
Abstract:
We describe a SystemC-based framework that we are developing to explore the impact of various architectural and microarchitectural parameters of on-chip interconnection network elements on power and performance. The framework enables one to choose from a variety of architectural options such as topology and routing policy, and allows experimentation with various microarchitectural options for the individual links, such as length, wire width, pitch, pipelining, supply voltage and frequency. The framework also supports a flexible traffic generation and communication model. We provide preliminary results of using this framework to study the power, latency and throughput of a 4x4 multi-core processing array using mesh, torus and folded torus topologies, for two different communication patterns of dense and sparse linear algebra. The traffic consists of both Request-Response messages (mimicking cache accesses) and One-Way messages. We find that the average latency can be reduced by increasing the pipeline depth, as it enables higher link frequencies. We also find that there exists an optimum degree of pipelining which minimizes the energy-delay product.
Abstract:
This paper describes the design of a power-efficient microarchitecture for transient fault detection in chip multiprocessors (CMPs). We introduce a new per-core dynamic voltage and frequency scaling (DVFS) algorithm for our architecture that significantly reduces power dissipation for redundant execution with a minimal performance overhead. Using cycle-accurate simulation combined with a simple first-order power model, we estimate that our architecture reduces dynamic power dissipation in the redundant core by a mean value of 79% and a maximum of 85%, with an associated mean performance overhead of only 1.2%.
Abstract:
Scalable Networks-on-Chip (NoCs) are needed to match the ever-increasing communication demands of large-scale Multi-Processor Systems-on-Chip (MPSoCs) for multimedia communication applications. The heterogeneous nature of application-specific on-chip cores, along with the specific communication requirements among the cores, calls for the design of application-specific NoCs for improved performance in terms of communication energy, latency, and throughput. In this work, we propose a methodology for the design of customized irregular networks-on-chip. The proposed method exploits a priori knowledge of the application's communication characteristics to generate an optimized network topology and the corresponding routing tables.
Abstract:
The memory subsystem is a major contributor to the performance, power, and area of complex SoCs used in feature-rich multimedia products. Hence, the memory architecture of the embedded DSP is complex and usually custom designed, with multiple banks of single-ported or dual-ported on-chip scratch pad memory and multiple banks of off-chip memory. Building software for such large, complex memories, with many of the software components delivered as individually optimized software IPs, is a big challenge. In order to obtain good performance and a reduction in memory stalls, the data buffers of the application need to be placed carefully in different types of memory. In this paper we present a unified framework (MODLEX) that combines different data layout optimizations to address complex DSP memory architectures. Our method models the data layout problem as a multi-objective genetic algorithm (GA) problem, with performance and power being the objectives, and presents a set of solution points, which is attractive from a platform design viewpoint. While most of the work in the literature assumes that performance and power are non-conflicting objectives, our work demonstrates that a significant trade-off (up to 70%) is possible between power and performance.
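The idea of reporting a set of solution points rather than a single optimum can be illustrated with a Pareto filter over candidate data layouts. The sketch below is not MODLEX and does not implement a genetic algorithm; it only shows the non-dominated filtering step that multi-objective methods typically rely on. The function name and the (memory stall cycles, power) values are hypothetical, and both objectives are assumed to be minimized.

```python
def pareto_front(points):
    """Return the non-dominated points from a list of (stalls, power) tuples.

    A point p is dominated if some other point q is no worse than p in
    both objectives and differs from p. Both objectives are minimized.
    """
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1]
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical candidate layouts: (memory stall cycles, power in mW).
candidates = [
    (100, 5.0),
    (80, 7.0),
    (120, 4.0),
    (110, 6.0),   # dominated by (100, 5.0)
    (80, 8.0),    # dominated by (80, 7.0)
]

for stalls, power in pareto_front(candidates):
    print(f"stalls={stalls}, power={power} mW")
```

Each surviving point represents a different power/performance trade-off, so a platform designer can pick the one that suits a given product.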
Abstract:
Today's feature-rich multimedia products require embedded system solutions with a complex System-on-Chip (SoC) to meet market expectations of high performance at low cost and low energy consumption. The memory architecture of the embedded system strongly influences these parameters. Hence the embedded system designer performs a complete memory architecture exploration. This is a multi-objective optimization problem and can be tackled as a two-level optimization problem. The outer level explores various memory architectures, while the inner level explores the placement of data sections (the data layout problem) to minimize memory stalls. Further, the designer would be interested in multiple optimal design points to address various market segments. However, tight time-to-market constraints enforce a short design cycle time. In this paper we address the multi-level multi-objective memory architecture exploration problem through a combination of a multi-objective genetic algorithm (for memory architecture exploration) and an efficient heuristic data placement algorithm. At the outer level, the memory architecture exploration is done by picking memory modules directly from an ASIC memory library. This helps in performing the memory architecture exploration in an integrated framework, where memory allocation, memory exploration and data layout work in a tightly coupled way to yield optimal design points with respect to area, power and performance. We evaluated our approach on three embedded applications; it explores several thousand memory architectures for each application, yielding a few hundred optimal design points in a few hours of computation time on a standard desktop.
Abstract:
Relentless CMOS scaling coupled with lower design tolerances is making ICs increasingly susceptible to wear-out related permanent faults and transient faults, necessitating on-chip fault tolerance in future chip multiprocessors (CMPs). In this paper, we describe a power-efficient architecture for redundant execution on CMPs which, when coupled with our per-core dynamic voltage and frequency scaling (DVFS) algorithm, significantly reduces the energy overhead of redundant execution without sacrificing performance. Our evaluation shows that this architecture has a performance overhead of only 0.3% and consumes only 1.48 times the energy of a non-fault-tolerant baseline.