174 resultados para Heterogeneous Regressions Algorithms
Resumo:
We propose a data flow based run time system as an efficient tool for supporting execution of parallel code on heterogeneous architectures hosting both multicore CPUs and GPUs. We discuss how the proposed run time system may be the target of both structured parallel applications developed using algorithmic skeletons/parallel design patterns and also more "domain specific" programming models. Experimental results demonstrating the feasibility of the approach are presented. © 2012 World Scientific Publishing Company.
Stochastic Analysis of Seepage under Hydraulic Structures Resting on Anisotropic Heterogeneous Soils
Resumo:
Local computation in join trees or acyclic hypertrees has been shown to be linked to a particular algebraic structure, called valuation algebra.There are many models of this algebraic structure ranging from probability theory to numerical analysis, relational databases and various classical and non-classical logics. It turns out that many interesting models of valuation algebras may be derived from semiring valued mappings. In this paper we study how valuation algebras are induced by semirings and how the structure of the valuation algebra is related to the algebraic structure of the semiring. In particular, c-semirings with idempotent multiplication induce idempotent valuation algebras and therefore permit particularly efficient architectures for local computation. Also important are semirings whose multiplicative semigroup is embedded in a union of groups. They induce valuation algebras with a partially defined division. For these valuation algebras, the well-known architectures for Bayesian networks apply. We also extend the general computational framework to allow derivation of bounds and approximations, for when exact computation is not feasible.
Resumo:
Recently, a number of most significant digit (msd) first bit parallel multipliers for recursive filtering have been reported. However, the design approach which has been used has, in general, been heuristic and consequently, optimality has not always been assured. In this paper, msd first multiply accumulate algorithms are described and important relationships governing the dependencies between latency, number representations, etc are derived. A more systematic approach to designing recursive filters is illustrated by applying the algorithms and associated relationships to the design of cascadable modules for high sample rate IIR filtering and wave digital filtering.
Resumo:
Real time digital signal processing demands high performance implementations of division and square root. This can only be achieved by the design of fast and efficient arithmetic algorithms which address practical VLSI architectural design issues. In this paper, new algorithms for division and square root are described. The new schemes are based on pre-scaling the operands and modifying the classical SRT method such that the result digits and the remainders are computed concurrently and the computations in adjacent rows are overlapped. Consequently, their performance exceeds that of the SRT methods. The hardware cost for higher radices is considerably more than that of the SRT methods but for many applications, this is not prohibitive. A system of equations is presented which enables both an analysis of the method for any radix and the parameters of implementations to be easily determined. This is illustrated for the case of radix 2 and radix 4. In addition, a highly regular array architecture combining the division and square root method is described. © 1994 Kluwer Academic Publishers.
Resumo:
A methodology has been developed which allows a non-specialist to rapidly design silicon wavelet transform cores for a variety of specifications. The cores include both forward and inverse orthonormal wavelet transforms. This methodology is based on efficient, modular and scaleable architectures utilising time-interleaved coefficients for the wavelet transform filters. The cores are parameterized in terms of wavelet type and data and coefficient word lengths. The designs have been captured in VHDL and are hence portable across a range of silicon foundries as well as FPGA and PLD implementations.
Resumo:
In real time digital signal processing, high performance modules for division and square root are essential if many powerful algorithms are to be implemented. In this paper, a new radix 2 algorithms for SRT division and square root are developed. For these new schemes, the result digits and the residuals are computed concurrently and the computations in adjacent rows are overlapped. Consequently, their performance should exceed that of the radix 2 SRT methods. VLSI array architectures to implement the new division and square root schemes are also presented.
Resumo:
A genetic algorithm (GA) was adopted to optimise the response of a composite laminate subject to impact. Two different impact scenarios are presented: low-velocity impact of a slender laminated strip and high-velocity impact of a rectangular plate by a spherical impactor. In these cases, the GA's objective was to, respectively, minimise the peak deflection and minimise penetration by varying the ply angles.
The GA was coupled to a commercial finite-element (FE) package LS DYNA to perform the impact analyses. A comparison with a commercial optimisation package, LS OPT, was also made. The results showed that the GA was a robust, capable optimisation tool that produced near optimal designs, and performed well with respect to LS OPT for the more complex high-velocity impact scenario tested.
Resumo:
The treatment of the Random-Phase Approximation Hamiltonians, encountered in different frameworks, like time-dependent density functional theory or Bethe-Salpeter equation, is complicated by their non-Hermicity. Compared to their Hermitian Hamiltonian counterparts, computational methods for the treatment of non-Hermitian Hamiltonians are often less efficient and less stable, sometimes leading to the breakdown of the method. Recently [Gruning et al. Nano Lett. 8 (2009) 28201, we have identified that such Hamiltonians are usually pseudo-Hermitian. Exploiting this property, we have implemented an algorithm of the Lanczos type for Random-Phase Approximation Hamiltonians that benefits from the same stability and computational load as its Hermitian counterpart, and applied it to the study of the optical response of carbon nanotubes. We present here the related theoretical grounds and technical details, and study the performance of the algorithm for the calculation of the optical absorption of a molecule within the Bethe-Salpeter equation framework. (C) 2011 Elsevier B.V. All rights reserved.