113 resultados para chip
Resumo:
Earlier studies have exploited statistical multiplexing of flows in the core of the Internet to reduce the buffer requirement in routers. Reducing the memory requirement of routers is important as it enables an improvement in performance and at the same time a decrease in the cost. In this paper, we observe that the links in the core of the Internet are typically over-provisioned and this can be exploited to reduce the buffering requirement in routers. The small on-chip memory of a network processor (NP) can be effectively used to buffer packets during most regimes of traffic. We propose a dynamic buffering strategy which buffers packets in the receive and transmit buffers of a NP when the memory requirement is low. When the buffer requirement increases due to bursts in the traffic, memory is allocated to packets in the off-chip DRAM. This scheme effectively mitigates the DRAM access bottleneck, as only a part of the traffic is stored in the DRAM. We build a Petri net model and evaluate the proposed scheme with core Internet like traffic. At 77% link utilization, the dynamic buffering scheme has a drop rate of just 0.65%, whereas the traditional DRAM buffering has 4.64% packet drop rate. Even with a high link utilization of 90%, which rarely happens in the core, our dynamic buffering results in a packet drop rate of only 2.17%, while supporting a throughput of 7.39 Gbps. We study the proposed scheme under different conditions to understand the provisioning of processing threads and to determine the queue length at which packets must be buffered in the DRAM. We show that the proposed dynamic buffering strategy drastically reduces the buffering requirement while still maintaining low packet drop rates.
Resumo:
In this paper we explore an implementation of a high-throughput, streaming application on REDEFINE-v2, which is an enhancement of REDEFINE. REDEFINE is a polymorphic ASIC combining the flexibility of a programmable solution with the execution speed of an ASIC. In REDEFINE Compute Elements are arranged in an 8x8 grid connected via a Network on Chip (NoC) called RECONNECT, to realize the various macrofunctional blocks of an equivalent ASIC. For a 1024-FFT we carry out an application-architecture design space exploration by examining the various characterizations of Compute Elements in terms of the size of the instruction store. We further study the impact by using application specific, vectorized FUs. By setting up different partitions of the FFT algorithm for persistent execution on REDEFINE-v2, we derive the benefits of setting up pipelined execution for higher performance. The impact of the REDEFINE-v2 micro-architecture for any arbitrary N-point FFT (N > 4096) FFT is also analyzed. We report the various algorithm-architecture tradeoffs in terms of area and execution speed with that of an ASIC implementation. In addition we compare the performance gain with respect to a GPP.
Resumo:
The sum capacity on a symbol-synchronous CDMA system having processing gain N and supporting K power constrained users is achieved by employing any set of N orthogonal sequences if a few users are allowed to signal along multiple dimensions. Analogously, the minimum received power (energy-per-chip) on the symbolsynchronous CDMA system supporting K users that demand specified data rates is attained by employing any set of N orthogonal sequences. At most (N - 1) users need to be split and if there are no oversized users, these split users need to signal only in two dimensions each. These results show that sum capacity or minimum sum power can be achieved with minimal downlink signaling.
Resumo:
A low-power frequency multiplication technique, developed for ZigBee (IEEE 802.15.4) like applications is presented. We have provided an estimate for the power consumption for a given output voltage swing using our technique. The advantages and disadvantages which determine the application areas of the technique are discussed. The issues related to design, layout and process variation are also addressed. Finally, a design is presented for operation in 2.405-2.485-GHz band of ZigBee receiver. SpectreRF simulations show 30% improvement in efficiency for our circuit with regard to conversion of DC bias current to output amplitude, against a LC-VCO. To establish the low-power credentials, we have compared our circuit with an existing technique; our circuit performs better with just 1/3 of total current from supply, and uses one inductor as against three in the latter case. A test chip was implemented in UMC 0.13-mum RF process with spiral on-chip inductors and MIM (metal-insulator-metal) capacitor option.
INTACTE: An Interconnect Area, Delay, and Energy Estimation Tool for Microarchitectural Explorations
Resumo:
Prior work on modeling interconnects has focused on optimizing the wire and repeater design for trading off energy and delay, and is largely based on low level circuit parameters. Hence these models are hard to use directly to make high level microarchitectural trade-offs in the initial exploration phase of a design. In this paper, we propose INTACTE, a tool that can be used by architects toget reasonably accurate interconnect area, delay, and power estimates based on a few architecture level parameters for the interconnect such as length, width (in number of bits), frequency, and latency for a specified technology and voltage. The tool uses well known models of interconnect delay and energy taking into account the wire pitch, repeater size, and spacing for a range of voltages and technologies.It then solves an optimization problem of finding the lowest energy interconnect design in terms of the low level circuit parameters, which meets the architectural constraintsgiven as inputs. In addition, the tool also provides the area, energy, and delay for a range of supply voltages and degrees of pipelining, which can be used for micro-architectural exploration of a chip. The delay and energy models used by the tool have been validated against low level circuit simulations. We discuss several potential applications of the tool and present an example of optimizing interconnect design in the context of clustered VLIW architectures. Copyright 2007 ACM.
Resumo:
The similar to 1300-km-long rupture zone of the 2004 Andaman-Sumatra megathrust earthquake continues to generate a mix of thrust, normal, and strike-slip faulting events. The 12 June 2010 M(w) 7.5 event on the subducting plate is the most recent large earthquake on the Nicobar segment. The left-lateral faulting mechanism of this event is unusual for the outer-rise region, considering the stress transfer processes that follow great underthrusting earthquakes. Another earthquake (M(w) 7.2) with a similar mechanism occurred very close to this event on 24 July 2005. These earthquakes and most of their aftershocks on the subducting plate were generated by left-lateral strike-slip faulting on north-northeast-south-southwest oriented near-vertical faults, in response to north-northwest-south-southeast directed compression. Pre-2004 earthquake faulting mechanisms on the subducting oceanic plate are consistent with this pattern. Post-2004, left-lateral faulting on the subducting oceanic plate clusters between 5 degrees N and 9 degrees N, where the 90 degrees E ridge impinges the trench axis. Our study observes that the subducting plate off the Sumatra and Nicobar segments behaves similarly to a chip of the India-Australia plate, deforming in response to a generally northwest-southeast oriented compression, an aspect that must be factored into the plate deformation models.
Resumo:
An all-digital on-chip clock skew measurement system via subsampling is presented. The clock nodes are sub-sampled with a near-frequency asynchronous sampling clock to result in beat signals which are themselves skewed in the same proportion but on a larger time scale. The beat signals are then suitably masked to extract only the skews of the rising edges of the clock signals. We propose a histogram of the arithmetic difference of the beat signals which decouples the relationship of clock jitter to the minimum measurable skew, and allows skews arbitrarily close to zero to be measured with a precision limited largely by measurement time, unlike the conventional XOR based histogram approach. We also analytically show that the proposed approach leads to an unbiased estimate of skew. The measured results from a 65 nm delay measurement front-end indicate that for an input skew range of +/- 1 fan-out-of-4 (FO4) delay, +/- 3 sigma resolution of 0.84 ps can be obtained with an integral error of 0.65 ps. We also experimentally demonstrate that a frequency modulation on a sampling clock maintains precision, indicating the robustness of the technique to jitter. We also show how FM modulation helps in restoring precision in case of rationally related clocks.
Resumo:
The success of an ABV IP depends highly on the associated debugging environment. An efficient debugging environment helps the user to find out the exact location of the failure. Moreover, it provides information to the user in a refined detail of abstraction and permit adequate interaction. It has also been realized that adequate visualization support helps in tracking the behavioral aspects of the Design Under Test (DUT). Currently, the debugging tools provide information in the signal level and do not provide any information about the high-level behavior of the DUT. We present a debugging framework that takes the design specification, assertions and the user intent in a simple format and provides detailed information by processing the design trace on-line, or off-line. We also present a visualization framework to ease the debugging procedure. We have experimented with industrial standard on-chip bus protocols that ensure that this utility can be incorporated successfully in the present functional verification flow.
Resumo:
A generalized power tracking algorithm that minimizes power consumption of digital circuits by dynamic control of supply voltage and the body bias is proposed. A direct power monitoring scheme is proposed that does not need any replica and hence can sense total power consumed by load circuit across process, voltage, and temperature corners. Design details and performance of power monitor and tracking algorithm are examined by a simulation framework developed using UMC 90-nm CMOS triple well process. The proposed algorithm with direct power monitor achieves a power savings of 42.2% for activity of 0.02 and 22.4% for activity of 0.04. Experimental results from test chip fabricated in AMS 350 nm process shows power savings of 46.3% and 65% for load circuit operating in super threshold and near sub-threshold region, respectively. Measured resolution of power monitor is around 0.25 mV and it has a power overhead of 2.2% of die power. Issues with loop convergence and design tradeoff for power monitor are also discussed in this paper.
Resumo:
A method of precise measurement of on-chip analog voltages in a mostly-digital manner, with minimal overhead, is presented. A pair of clock signals is routed to the node of an analog voltage. This analog voltage controls the delay between this pair of clock signals, which is then measured in an all-digital manner using the technique of sub-sampling. This sub-sampling technique, having measurement time and accuracy trade-off, is well suited for low bandwidth signals. This concept is validated by designing delay cells, using current starved inverters in UMC 130nm CMOS process. Sub-mV accuracy is demonstrated for a measurement time of few seconds.
Resumo:
Continuous advances in VLSI technology have made implementation of very complicated systems possible. Modern System-on -Chips (SoCs) have many processors, IP cores and other functional units. As a result, complete verification of whole systems before implementation is becoming infeasible; hence it is likely that these systems may have some errors after manufacturing. This increases the need to find design errors in chips after fabrication. The main challenge for post-silicon debug is the observability of the internal signals. Post-silicon debug is the problem of determining what's wrong when the fabricated chip of a new design behaves incorrectly. This problem now consumes over half of the overall verification effort on large designs, and the problem is growing worse.Traditional post-silicon debug methods concentrate on functional parts of systems and provide mechanisms to increase the observability of internal state of systems. Those methods may not be sufficient as modern SoCs have lots of blocks (processors, IP cores, etc.) which are communicating with one another and communication is another source of design errors. This tutorial will be provide an insight into various observability enhancement techniques, on chip instrumentation techniques and use of high level models to support the debug process targeting both inside blocks and communication among them. It will also cover the use of formal methods to help debug process.
Resumo:
A generalized power tracking algorithm that minimizes power consumption of digital circuits by dynamic control of supply voltage and the body bias is proposed. A direct power monitoring scheme is proposed that does not need any replica and hence can sense total power consumed by load circuit across process, voltage, and temperature corners. Design details and performance of power monitor and tracking algorithm are examined by a simulation framework developed using UMC 90-nm CMOS triple well process. The proposed algorithm with direct power monitor achieves a power savings of 42.2% for activity of 0.02 and 22.4% for activity of 0.04. Experimental results from test chip fabricated in AMS 350 nm process shows power savings of 46.3% and 65% for load circuit operating in super threshold and near sub-threshold region, respectively. Measured resolution of power monitor is around 0.25 mV and it has a power overhead of 2.2% of die power. Issues with loop convergence and design tradeoff for power monitor are also discussed in this paper.
Resumo:
Chips produced by turning a commercial grade pure magnesium billet were consolidated by solid state recycling technique of cold compaction followed by hot extrusion. The cold compacted billets were extruded at four different temperatures: 250 degrees C, 300 degrees C, 350 degrees C and 400 degrees C. For the purpose of comparison, cast magnesium (pure) billets were extruded under similar conditions. Extruded products were characterized for damping properties. Damping capacity and dynamic modulus was measured as a function of time and temperature at a fixed frequency of 5 Hz 10 to 14% increase in damping capacity was observed in chip consolidated products compared to reference material. Microstructural changes after the temperature sweep tests were examined. Chip boundaries present in consolidated products were observed to suppress grain coarsening which otherwise was significant in reference material. The present work is significant from the viewpoint of recycling of machined chips and development of sustainable manufacturing processes. (C) 2012 Elsevier B.V. All rights reserved.
Resumo:
Unlike most eukaryotes, a kinetochore is fully assembled early in the cell cycle in budding yeasts Saccharomyces cerevisiae and Candida albicans. These kinetochores are clustered together throughout the cell cycle. Kinetochore assembly on point centromeres of S. cerevisiae is considered to be a step-wise process that initiates with binding of inner kinetochore proteins on specific centromere DNA sequence motifs. In contrast, kinetochore formation in C. albicans, that carries regional centromeres of 3-5 kb long, has been shown to be a sequence independent but an epigenetically regulated event. In this study, we investigated the process of kinetochore assembly/disassembly in C. albicans. Localization dependence of various kinetochore proteins studied by confocal microscopy and chromatin immunoprecipitation (ChIP) assays revealed that assembly of a kinetochore is a highly coordinated and interdependent event. Partial depletion of an essential kinetochore protein affects integrity of the kinetochore cluster. Further protein depletion results in complete collapse of the kinetochore architecture. In addition, GFP-tagged kinetochore proteins confirmed similar time-dependent disintegration upon gradual depletion of an outer kinetochore protein (Dam1). The loss of integrity of a kinetochore formed on centromeric chromatin was demonstrated by reduced binding of CENP-A and CENP-C at the centromeres. Most strikingly, Western blot analysis revealed that gradual depletion of any of these essential kinetochore proteins results in concomitant reduction in cellular protein levels of CENP-A. We further demonstrated that centromere bound CENP-A is protected from the proteosomal mediated degradation. Based on these results, we propose that a coordinated interdependent circuitry of several evolutionarily conserved essential kinetochore proteins ensures integrity of a kinetochore formed on the foundation of CENP-A containing centromeric chromatin.
Resumo:
An all-digital technique is proposed for generating an accurate delay irrespective of the inaccuracies of a controllable delay line. A subsampling technique-based delay measurement unit (DMU) capable of measuring delays accurately for the full period range is used as the feedback element to build accurate fractional period delays based on input digital control bits. The proposed delay generation system periodically measures and corrects the error and maintains it at the minimum value without requiring any special calibration phase. Up to 40x improvement in accuracy is demonstrated for a commercial programmable delay generator chip. The time-precision trade-off feature of the DMU is utilized to reduce the locking time. Loop dynamics are adjusted to stabilize the delay after the minimum error is achieved, thus avoiding additional jitter. Measurement results from a high-end oscilloscope also validate the effectiveness of the proposed system in improving accuracy.