51 results for Instruction set

at Indian Institute of Science - Bangalore - India


Relevance:

70.00%

Publisher:

Abstract:

In this paper we present a framework for realizing arbitrary instruction set extensions (IE) that are identified post-silicon. The proposed framework has two components, viz., an IE synthesis methodology and the architecture of a reconfigurable data-path for realization of such IEs. The IE synthesis methodology ensures maximal utilization of resources on the reconfigurable data-path. In this context we present the techniques used to realize IEs for applications that demand high throughput or those that must process data streams. The reconfigurable hardware, called HyperCell, comprises a reconfigurable execution fabric. The fabric is a collection of interconnected compute units. A typical use case of HyperCell is where it acts as a co-processor with a host and accelerates execution of IEs that are defined post-silicon. We demonstrate the effectiveness of our approach by evaluating the performance of some well-known integer kernels that are realized as IEs on HyperCell. Our methodology for realizing IEs through HyperCells permits overlapping of potentially all memory transactions with computations. We show significant improvement in performance for streaming applications over general-purpose processor-based solutions, by fully pipelining the data-path. (C) 2014 Elsevier B.V. All rights reserved.

Relevance:

60.00%

Publisher:

Abstract:

Emerging embedded applications are based on evolving standards (e.g., MPEG2/4, H.264/265, IEEE802.11a/b/g/n). Since most of these applications run on handheld devices, there is an increasing need for a single chip solution that can dynamically interoperate between different standards and their derivatives. In order to achieve high resource utilization and low power dissipation, we propose REDEFINE, a polymorphic ASIC in which specialized hardware units are replaced with basic hardware units that can create the same functionality by runtime re-composition. It is a "future-proof" custom hardware solution for multiple applications and their derivatives in a domain. In this article, we describe a compiler framework and supporting hardware comprising compute, storage, and communication resources. Applications described in a high-level language (e.g., C) are compiled into application substructures. For each application substructure, a set of compute elements (CEs) on the hardware are interconnected during runtime to form a pattern that closely matches the communication pattern of that particular application. The advantage is that the bounded CEs are neither processor cores nor logic elements as in FPGAs. Hence, REDEFINE offers the power and performance advantage of an ASIC and the hardware reconfigurability and programmability of an FPGA/instruction set processor. In addition, the hardware supports custom instruction pipelining. Existing instruction-set extensible processors determine a sequence of instructions that repeatedly occur within the application to create custom instructions at design time to speed up the execution of this sequence. We extend this scheme further, where a kernel is compiled into custom instructions that bear a strong producer-consumer relationship (and are not limited to frequently occurring sequences of instructions).
Custom instructions, realized as hardware compositions effected at runtime, allow several instances of the same to be active in parallel. A key distinguishing factor in the majority of the emerging embedded applications is stream processing. To reduce the overheads of data transfer between custom instructions, direct communication paths are employed among custom instructions. In this article, we present an overview of the hardware-aware compiler framework, which determines the NoC-aware schedule of transports of the data exchanged between the custom instructions on the interconnect. The results for the FFT kernel indicate a 25% reduction in the number of loads/stores, and throughput improves by log(n) for n-point FFT when compared to a sequential implementation. Overall, REDEFINE offers flexibility and runtime reconfigurability at the expense of 1.16x in power and 8x in area when compared to an ASIC. The REDEFINE implementation consumes 0.1x the power of an FPGA implementation. In addition, the configuration overhead of the FPGA implementation is 1,000x more than that of REDEFINE.

Relevance:

60.00%

Publisher:

Abstract:

ASICs offer the best realization of DSP algorithms in terms of performance, but the cost is prohibitive, especially when the volumes involved are low. However, if the architecture synthesis trajectory for such algorithms is such that the target architecture can be identified as an interconnection of elementary parameterized computational structures, then it is possible to attain a close match, both in terms of performance and power with respect to an ASIC, for any algorithmic parameters of the given algorithm. Such an architecture is weakly programmable (configurable) and can be viewed as an application specific instruction-set processor (ASIP). In this work, we present a methodology to synthesize ASIPs for DSP algorithms.

Relevance:

20.00%

Publisher:

Abstract:

A new approach is proposed to solve for the growth as well as the movement of hydrogen bubbles during solidification in aluminum castings. A level-set methodology has been adopted to handle this multiphase phenomenon. A microscale domain is considered, and the growth and movement of hydrogen bubbles in this domain have been studied. The growth characteristics of hydrogen bubbles have been evaluated under free growth conditions in a melt having a hydrogen input caused by solidification occurring around the microdomain.

Relevance:

20.00%

Publisher:

Abstract:

This paper describes an algorithm to compute the union, intersection and difference of two polygons using a scan-grid approach. Basically, in this method, the screen is divided into cells and the algorithm is applied to each cell in turn. The output from all the cells is integrated to yield a representation of the output polygon. In most cells, no computation is required and thus the algorithm is a fast one. The algorithm has been implemented for polygons but can be extended to polyhedra as well. The algorithm is shown to take O(N) time in the average case where N is the total number of edges of the two input polygons.
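
The cell-by-cell idea can be illustrated with a small Python sketch. It is not the paper's algorithm: the function names, the uniform square grid, and the bounding-box bucketing are all assumptions made for illustration. The sketch only identifies the cells that hold edges of both polygons, which are the cells where edge-edge intersection work can arise; every other cell is skipped at this stage.

```python
from collections import defaultdict
from math import floor


def polygon_edges(vertices):
    """Return the closed list of edges of a polygon given as (x, y) vertices."""
    n = len(vertices)
    return [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]


def cells_touched(edge, cell):
    """Grid cells overlapped by the edge's bounding box.

    This is a conservative superset of the cells the edge really crosses
    (an assumption of this sketch, chosen for simplicity).
    """
    (x0, y0), (x1, y1) = edge
    cx0, cx1 = sorted((floor(x0 / cell), floor(x1 / cell)))
    cy0, cy1 = sorted((floor(y0 / cell), floor(y1 / cell)))
    return {(cx, cy) for cx in range(cx0, cx1 + 1) for cy in range(cy0, cy1 + 1)}


def busy_cells(poly_a, poly_b, cell):
    """Cells containing edges of BOTH polygons (hypothetical helper)."""
    owners = defaultdict(set)
    for label, poly in (("A", poly_a), ("B", poly_b)):
        for edge in polygon_edges(poly):
            for c in cells_touched(edge, cell):
                owners[c].add(label)
    return {c for c, labels in owners.items() if labels == {"A", "B"}}


square_a = [(0, 0), (4, 0), (4, 4), (0, 4)]
square_b = [(2, 2), (6, 2), (6, 6), (2, 6)]
print(busy_cells(square_a, square_b, cell=1))
```

For the two overlapping squares above, only the two cells around the boundary crossings are flagged, which matches the abstract's remark that most cells need no computation.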

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we consider the bi-criteria single machine scheduling problem of n jobs with a learning effect. The two objectives considered are the total completion time (TC) and the total absolute differences in completion times (TADC). The aim is to find a sequence that performs well with respect to both objectives. In an earlier study, a method of solving the bi-criteria transportation problem was presented. In this paper, we apply the methodology for solving the bi-criteria transportation problem to our bi-criteria single machine scheduling problem with a learning effect, and obtain the set of optimal sequences. Numerical examples are presented to illustrate the applicability and ease of understanding.
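
The two objectives can be made concrete with a short Python sketch. Several things here are assumptions rather than details from the abstract: the learning effect is modeled in the common position-based form (a job with base time p placed in position r takes p * r**a, with a <= 0), the function names are hypothetical, and the non-dominated sequences are found by brute-force enumeration instead of the transportation-problem methodology the paper uses.

```python
from itertools import permutations


def completion_times(sequence, learning_index):
    """Completion times under an ASSUMED position-based learning effect."""
    elapsed = 0.0
    times = []
    for position, base_time in enumerate(sequence, start=1):
        elapsed += base_time * position ** learning_index
        times.append(elapsed)
    return times


def total_completion_time(times):
    """TC: sum of all completion times."""
    return sum(times)


def total_abs_diff(times):
    """TADC: sum of |C_i - C_j| over all unordered pairs of jobs."""
    return sum(abs(ci - cj) for i, ci in enumerate(times) for cj in times[i + 1:])


def nondominated_sequences(jobs, learning_index):
    """Brute-force Pareto set over (TC, TADC); only viable for tiny n."""
    scored = []
    for seq in permutations(jobs):
        c = completion_times(seq, learning_index)
        scored.append((seq, total_completion_time(c), total_abs_diff(c)))
    return [
        seq
        for seq, tc, tadc in scored
        if not any(
            o_tc <= tc and o_tadc <= tadc and (o_tc < tc or o_tadc < tadc)
            for _, o_tc, o_tadc in scored
        )
    ]


# With learning_index = 0.0 the learning effect is switched off.
print(nondominated_sequences([1, 2, 3], learning_index=0.0))
```

Enumeration grows factorially with n, so this is only a way to check small numerical examples by hand, not a substitute for the method in the paper.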

Relevance:

20.00%

Publisher:

Abstract:

We report here that the structural origin of an easily reversible Ge15Te83Si2 glass can be a promising candidate for phase change random access memories. In situ Raman scattering studies on Ge15Te83Si2 sample, undertaken during the amorphous set and reset processes, indicate that the degree of disorder in the glass is reduced from off to set state. It is also found that the local structure of the sample under reset condition is similar to that in the amorphous off state. Electron microscopic studies on switched samples indicate the formation of nanometric sized particles of c-SiTe2 structure. ©2009 American Institute of Physics

Relevance:

20.00%

Publisher:

Abstract:

The current-biased single electron transistor (SET) (CBS) is an integral part of almost all hybrid CMOS SET circuits. In this paper, for the first time, the effects of energy quantization on the performance of CBS-based circuits are studied through analytical modeling and Monte Carlo simulations. It is demonstrated that energy quantization has no impact on the gain of the CBS characteristics, although it changes the output voltage levels and oscillation periodicity. The effects of energy quantization are further studied for two circuits: negative differential resistance (NDR) and neuron cell, which use the CBS. A new model for the conductance of NDR characteristics is also formulated that includes the energy quantization term.

Relevance:

20.00%

Publisher:

Abstract:

The maximum independent set problem is NP-complete even when restricted to planar graphs, cubic planar graphs or triangle-free graphs. The problem of finding an absolute approximation still remains NP-complete. Various polynomial-time approximation algorithms for planar graphs have been proposed that guarantee a fixed worst-case ratio between the size of the independent set obtained and the size of the maximum independent set. We present in this paper a simple and efficient O(|V|) algorithm that guarantees a ratio of 1/2 for planar triangle-free graphs. The algorithm differs completely from other approaches in that it collects groups of independent vertices at a time. Certain bounds we obtain in this paper relate to some interesting questions in the theory of extremal graphs.

Relevance:

20.00%

Publisher:

Abstract:

It is shown that a method based on the principle of analytic continuation can be used to solve a set of inhomogeneous infinite simultaneous equations encountered in the analysis of surface acoustic wave propagation along the periodically perturbed surface of a piezoelectric medium.

Relevance:

20.00%

Publisher:

Abstract:

It is shown that a method based on the principle of analytic continuation can be used to solve a set of infinite simultaneous equations encountered in solving for the electric field of a periodic electrode structure.

Relevance:

20.00%

Publisher:

Abstract:

A compact model for noise margin (NM) of single-electron transistor (SET) logic is developed, which is a function of device capacitances and background charge (ζ). Noise margin is then used as a metric to evaluate the robustness of SET logic against background charge, temperature, and variation of SET gate and tunnel junction capacitances (CG and CT). It is shown that choosing α = CT/CG = 1/3 maximizes the NM. An estimate of the maximum tolerable ζ is shown to be equal to ±0.03 e. Finally, the effect of mismatch in device parameters on the NM is studied through exhaustive simulations, which indicate that α ∈ [0.3, 0.4] provides maximum robustness. It is also observed that mismatch can have a significant impact on static power dissipation.

Relevance:

20.00%

Publisher:

Abstract:

Bulk Ge15Te83Si2 glass has been found to exhibit memory-type switching for 1 mA current with a threshold electric field of 7.3 kV/cm. The electrical set and reset processes have been achieved with triangular and rectangular pulses, respectively, of 1 mA amplitude. In situ Raman scattering studies indicate that the degree of disorder in Ge15Te83Si2 glass is reduced from off to set state. The local structure of the sample under reset condition is similar to that in the off state. The Raman results are consistent with the switching results which indicate that the Ge15Te83Si2 glass can be set and reset easily. (C) 2007 American Institute of Physics.

Relevance:

20.00%

Publisher:

Abstract:

Traditionally, an instruction decoder is designed as a monolithic structure that inhibits leakage energy optimization. In this paper, we consider a split instruction decoder that enables leakage energy optimization. We also propose a compiler scheduling algorithm that exploits instruction slack to increase the simultaneous active and idle durations in the instruction decoder. The proposed compiler-assisted scheme obtains a further 14.5% reduction in the energy consumption of the instruction decoder over a hardware-only scheme for a VLIW architecture. The benefits are 17.3% and 18.7% in the context of a 2-clustered and a 4-clustered VLIW architecture, respectively.

Relevance:

20.00%

Publisher:

Abstract:

In this paper, the static noise margin for SET (single electron transistor) logic is defined, and compact models for the noise margin are developed by making use of the MIB (Mahapatra-Ionescu-Banerjee) model. The variation of the noise margin with temperature and background charge is also studied. A chain of SET inverters is simulated to validate the definition of various logic levels (like VIH, VOH, etc.) and noise margin. Finally, the noise immunity of SET logic is compared with current CMOS logic.