6 results for accelerator

at Indian Institute of Science - Bangalore - India


Relevance:

20.00%

Publisher:

Abstract:

A Field Programmable Gate Array (FPGA) based hardware accelerator for multi-conductor parasitic capacitance extraction, using the Method of Moments (MoM), is presented in this paper. Because solving the dense algebraic system formed by MoM is prohibitively expensive, linear-complexity fast solver algorithms have been developed in the past to expedite the matrix-vector product computation in a Krylov subspace based iterative solver framework. However, as the number of conductors in a system increases, leading to a corresponding increase in the number of right-hand-side (RHS) vectors, the computational cost of multiple matrix-vector products presents a time bottleneck, especially for ill-conditioned system matrices. In this work, an FPGA based hardware implementation is proposed to parallelize the iterative matrix solution for multiple RHS vectors in a low-rank compression based fast solver scheme. The method is applied to accelerate electrostatic parasitic capacitance extraction of multiple conductors in a Ball Grid Array (BGA) package. Speed-ups of up to 13x for dense matrix-vector products and 12x for QR compressed matrix-vector products, over an equivalent software implementation on an Intel Core i5 processor, are achieved using a Virtex-6 XC6VLX240T FPGA on Xilinx's ML605 board.
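The benefit of a low-rank (QR compressed) block for several RHS vectors can be illustrated with a minimal sketch. It assumes a block A that factors exactly as Q R with small rank k; the helper names (`matmul`, `lowrank_apply`, `flops`) and the tiny matrices are hypothetical and are not taken from the paper, which targets an FPGA rather than Python.

```python
def matmul(A, B):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def lowrank_apply(Q, R, X):
    """Apply the compressed block Q @ R to every RHS column of X.

    The dense m x n block is never formed: R @ X is computed first,
    then Q is applied to the small k x rhs result.
    """
    return matmul(Q, matmul(R, X))


def flops(m, n, k, rhs):
    """Multiply-add counts: (dense product, factored product)."""
    return m * n * rhs, k * (m + n) * rhs


# Hypothetical rank-1 block (k = 1) and three RHS vectors as columns of X.
Q = [[1.0], [2.0], [3.0], [4.0]]      # 4 x 1
R = [[1.0, 0.0, -1.0, 2.0]]           # 1 x 4
X = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 1.0],
     [1.0, 1.0, 0.0],
     [2.0, 0.0, 1.0]]                 # 4 x 3

dense = matmul(matmul(Q, R), X)       # reference: form the block explicitly
fast = lowrank_apply(Q, R, X)         # compressed product, same result
dense_ops, fast_ops = flops(4, 4, 1, 3)
```

Each RHS column is processed independently, which is the property a hardware implementation can exploit by handling columns in parallel or in a pipeline.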

Relevance:

10.00%

Publisher:

Abstract:

The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general purpose multicore architectures. The StreamIt graphs describe task, data and pipeline parallelism which can be exploited on accelerators such as Graphics Processing Units (GPUs) or CellBE which support abundant parallelism in hardware. In this paper, we describe a novel method to orchestrate the execution of a StreamIt program on a multicore platform equipped with an accelerator. The proposed approach identifies, using profiling, the relative benefits of executing a task on the superscalar CPU cores and the accelerator. We formulate the problem of partitioning the work between the CPU cores and the GPU, taking into account the latencies for data transfers and the required buffer layout transformations associated with the partitioning, as an integrated Integer Linear Program (ILP) which can then be solved by an ILP solver. We also propose an efficient heuristic algorithm for the work-partitioning between the CPU and the GPU, which provides solutions that are within 9.05% of the optimal solution on average across the benchmark suite. The partitioned tasks are then software pipelined to execute on the multiple CPU cores and the Streaming Multiprocessors (SMs) of the GPU. The software pipelining algorithm orchestrates the execution between CPU cores and the GPU by emitting the code for the CPU and the GPU, and the code for the required data transfers. Our experiments on a platform with 8 CPU cores and a GeForce 8800 GTS 512 GPU show a geometric mean speedup of 6.94X with a maximum of 51.96X over a single threaded CPU execution across the StreamIt benchmarks. This is an 18.9% improvement over a partitioning strategy that maps only the filters that cannot be executed on the GPU (the filters with state that is persistent across firings) onto the CPU.
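The partitioning trade-off described above can be sketched with a toy model. This is not the authors' ILP or their heuristic: it is an exhaustive search over a four-filter chain under an assumed, simplified cost (the busier device's load plus a fixed penalty for every edge that crosses the CPU/GPU boundary). All timings, transfer costs, and function names are hypothetical.

```python
from itertools import product


def partition_cost(assign, cpu_t, gpu_t, edges):
    """Assumed cost model: the busier device bounds the steady state,
    and each edge whose endpoints sit on different devices adds its
    transfer cost."""
    cpu_load = sum(cpu_t[i] for i, dev in enumerate(assign) if dev == "cpu")
    gpu_load = sum(gpu_t[i] for i, dev in enumerate(assign) if dev == "gpu")
    transfer = sum(c for u, v, c in edges if assign[u] != assign[v])
    return max(cpu_load, gpu_load) + transfer


def best_partition(cpu_t, gpu_t, edges):
    """Try every CPU/GPU assignment; feasible only for tiny graphs."""
    candidates = product(("cpu", "gpu"), repeat=len(cpu_t))
    return min(candidates,
               key=lambda a: partition_cost(a, cpu_t, gpu_t, edges))


# Hypothetical profiled times per filter on each device.
cpu_t = [4, 10, 10, 3]
gpu_t = [6, 2, 2, 5]
# Chain of filters: (producer, consumer, transfer cost if split).
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1)]

best = best_partition(cpu_t, gpu_t, edges)
best_cost = partition_cost(best, cpu_t, gpu_t, edges)
```

In this toy instance the two filters that profit most from the GPU are placed there while the cheap endpoints stay on the CPU, even though that costs two transfers. Exhaustive search grows as 2^n, which is why an ILP solver or a heuristic is needed for real stream graphs.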

Relevance:

10.00%

Publisher:

Abstract:

The present paper deals with the study of the effects of electron (8 MeV) irradiation on the dielectric and ferroelectric properties of PbZrO3 thin films grown by the sol-gel technique. The films (0.62 μm thick) were subjected to electron irradiation using a Microtron accelerator (delivered dose 80, 100, 120 kGy). The films were well crystallized prior to and after electron irradiation. However, local amorphization was observed after irradiation. There is an appreciable change in the dielectric constant after irradiation with different delivered doses. The dielectric loss showed significant frequency dispersion for both unirradiated and electron irradiated films. T_c was found to shift towards higher temperature with increasing delivered dose. The radiation induced increase of ε'(T) is related to an internal bias field, which is caused by radiation induced charges trapped at grain boundaries. The double butterfly loop is retained even after electron irradiation to the different delivered doses. The broader hysteresis loop seems to be related to radiation induced charges causing an enhanced space charge polarization. Radiation-induced oxygen vacancies do not change the general shape of the AFE hysteresis loop but they increase P_s of the hysteresis at the electric field forced AFE to FE phase transition. We attribute the changes in the dielectric properties to structural defects such as oxygen vacancies and radiation induced charges. The shift in T_c, increase in dielectric constant, broader hysteresis loop, and increase in P_r can be related to radiation induced charges causing space charge polarization. Double butterfly and hysteresis loops were retained, indicative of the AFE nature of the films.

Relevance:

10.00%

Publisher:

Abstract:

Nano-sized copper chromite, which is used as a burn rate accelerator for solid propellants, was synthesized by the solution combustion process using citric acid and glycine as fuel. Pure spinel phase copper chromite (CuCr2O4) was synthesized, and the effects of different ratios of Cu-Cr ions in the initial reactant and of various calcination temperatures on the final properties of the material were examined. The reaction time for the synthesis with glycine was lower than that with citric acid. The synthesized samples from both fuel cycles were characterized by X-ray diffraction (XRD), X-ray photoelectron spectroscopy (XPS), BET surface area analysis, and scanning electron microscopy (SEM). Commercial copper chromite that is currently used in solid propellant formulation was also characterized by the same techniques. XRD analysis shows that the pure spinel phase compound is formed by calcination at 700 °C for the glycine fuel cycle and between 750 and 800 °C for the citric acid cycle. XPS results indicate that the oxidation state of copper in the final compound varies with the Cu-Cr mole ratio. SEM images confirm the formation of nano-sized spherical particles. The variation of BET surface area with calcination temperature was studied for the solution combusted catalyst. Burn rate evaluation of the synthesized catalyst was carried out and compared with the commercial catalyst. The comparison between BET surface area and burn rate indicates that the difference in surface area caused the variation in burn rate between samples. The reason behind the reduction in surface area and the required modifications in the process are also described.

Relevance:

10.00%

Publisher:

Abstract:

In this paper we present a hardware-software hybrid technique for modular multiplication over large binary fields. The technique involves application of the Karatsuba-Ofman algorithm for polynomial multiplication and a novel technique for reduction. The proposed reduction technique is based on the popular repeated multiplication technique and Barrett reduction. We propose a new design of a parallel polynomial multiplier that serves as a hardware accelerator for large field multiplications. We show that the proposed reduction technique, accelerated using the modified polynomial multiplier, achieves significantly higher performance compared to a purely software technique and other hybrid techniques. We also show that the hybrid accelerated approach to modular field multiplication is significantly faster than the Montgomery algorithm based integrated multiplication approach.
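As a rough software illustration of the two stages named above, the sketch below multiplies binary polynomials (stored as Python integers, bit i holding the coefficient of x^i) with Karatsuba-Ofman and then reduces the product modulo an irreducible polynomial. The reduction here is a plain shift-and-XOR loop used as a stand-in; it is not the paper's repeated-multiplication/Barrett scheme, and the function names and the `threshold` parameter are assumptions for this example.

```python
def clmul_school(a, b):
    """Schoolbook carry-less multiplication in GF(2)[x]."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def karatsuba_gf2(a, b, threshold=8):
    """Karatsuba-Ofman product in GF(2)[x]; threshold must be >= 1."""
    n = max(a.bit_length(), b.bit_length())
    if n <= threshold:
        return clmul_school(a, b)
    h = n // 2
    mask = (1 << h) - 1
    a0, a1 = a & mask, a >> h          # a = a0 + a1 * x^h
    b0, b1 = b & mask, b >> h          # b = b0 + b1 * x^h
    lo = karatsuba_gf2(a0, b0, threshold)
    hi = karatsuba_gf2(a1, b1, threshold)
    # Three sub-products instead of four; subtraction is XOR in GF(2).
    mid = karatsuba_gf2(a0 ^ a1, b0 ^ b1, threshold) ^ lo ^ hi
    return lo ^ (mid << h) ^ (hi << (2 * h))


def reduce_gf2(p, modulus):
    """Naive reduction: cancel the leading term until deg(p) < deg(modulus)."""
    deg = modulus.bit_length() - 1
    while p.bit_length() > deg:
        p ^= modulus << (p.bit_length() - 1 - deg)
    return p


def field_mul(a, b, modulus, threshold=8):
    """Multiply two field elements of GF(2^m) defined by `modulus`."""
    return reduce_gf2(karatsuba_gf2(a, b, threshold), modulus)


# Small check in GF(2^8) with the AES polynomial x^8 + x^4 + x^3 + x + 1.
product_aes = field_mul(0x57, 0x83, 0x11B, threshold=2)  # 0xC1
```

The field here is deliberately tiny so the result can be checked by hand; the fields targeted in the abstract are much larger, where the saving of one sub-product per recursion level matters.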

Relevance:

10.00%

Publisher:

Abstract:

In this article, a Field Programmable Gate Array (FPGA)-based hardware accelerator for 3D electromagnetic extraction, using the Method of Moments (MoM), is presented. As the number of nets or ports in a system increases, leading to a corresponding increase in the number of right-hand-side (RHS) vectors, the computational cost for multiple matrix-vector products presents a time bottleneck in a linear-complexity fast solver framework. In this work, an FPGA-based hardware implementation is proposed toward a two-level parallelization scheme: (i) matrix-level parallelization for a single RHS and (ii) pipelining for multiple RHS. The method is applied to accelerate electrostatic parasitic capacitance extraction of multiple nets in a Ball Grid Array (BGA) package. The acceleration is shown to be linearly scalable with FPGA resources, and speed-ups of over 10x against an equivalent software implementation on a 2.4 GHz Intel Core i5 processor are achieved using a Virtex-6 XC6VLX240T FPGA on Xilinx's ML605 board, with the implemented design operating at a 200 MHz clock frequency. (c) 2016 Wiley Periodicals, Inc. Microwave Opt Technol Lett 58:776-783, 2016