Digital Library

12 results for QR Microbiología

at Indian Institute of Science - Bangalore - India

Design space exploration of systolic realization of QR factorization on a runtime reconfigurable platform

Relevance: 20.00%

Abstract:

In high performance computing, considerable effort has gone into accelerating Numerical Linear Algebra (NLA) kernels such as QR Decomposition (QRD), with the added advantages of reconfigurability and scalability. While popular custom hardware solutions in the form of systolic arrays can deliver high performance, they are not scalable and hence not commercially viable. In this paper, we show how systolic solutions of QRD can be realized efficiently on REDEFINE, a scalable runtime reconfigurable hardware platform. We propose various enhancements to REDEFINE to meet the custom needs of accelerating NLA kernels. We further explore the design space of the proposed solution for an arbitrary application of size n × n. We determine the right size of the sub-array in accordance with the optimal pipeline depth of the core execution units and the number of such units to be used per sub-array.

Least squares QR-based decomposition provides an efficient way of computing optimal regularization parameter in photoacoustic tomography

Relevance: 20.00%

Abstract:

A computationally efficient approach that computes the optimal regularization parameter for the Tikhonov-minimization scheme is developed for photoacoustic imaging. This approach is based on the least-squares QR decomposition, which is a well-known dimensionality reduction technique for large systems of equations. It is shown that the proposed framework is effective in terms of quantitative and qualitative reconstructions of the initial pressure distribution, enabled by finding an optimal regularization parameter. The computational efficiency and performance of the proposed method are shown using a test case of a numerical blood vessel phantom, where the initial pressure is exactly known for quantitative comparison. © 2013 Society of Photo-Optical Instrumentation Engineers (SPIE)

Efficient QR Decomposition Using Low Complexity Column-wise Givens Rotation (CGR)

Relevance: 20.00%

Abstract:

QR decomposition (QRD) is a widely used Numerical Linear Algebra (NLA) kernel with applications ranging from SONAR beamforming to wireless MIMO receivers. In this paper, we propose a novel Givens Rotation (GR) based QRD (GR QRD) in which we reduce the computational complexity of GR and exploit a higher degree of parallelism. This low complexity Column-wise GR (CGR) can annihilate multiple elements of a column of a matrix simultaneously. The algorithm is first realized on a Two-Dimensional (2D) systolic array and then implemented on REDEFINE, which is a Coarse Grained run-time Reconfigurable Architecture (CGRA). We benchmark the proposed implementation against state-of-the-art implementations and report better throughput, convergence and scalability.
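
For orientation, the baseline that CGR improves upon can be sketched as follows. This is a minimal pure-Python illustration of classical Givens-rotation QR, which annihilates one sub-diagonal element per rotation; it is not the column-wise variant proposed in the paper, and the function name `givens_qr` is a hypothetical choice made for this sketch.

```python
import math

def givens_qr(a):
    """Classical Givens QR (illustrative baseline, not the paper's CGR).

    Returns (q, r) as lists of lists such that a = q * r, with q orthogonal
    and r upper triangular.
    """
    m, n = len(a), len(a[0])
    r = [list(map(float, row)) for row in a]   # working copy, becomes R
    q = [[float(i == j) for j in range(m)] for i in range(m)]  # accumulates Q
    for col in range(n):
        # Zero the entries below the diagonal bottom-up, one per rotation.
        for row in range(m - 1, col, -1):
            x, y = r[row - 1][col], r[row][col]
            if y == 0.0:
                continue                        # already zero, nothing to do
            h = math.hypot(x, y)
            c, s = x / h, y / h
            # Apply the rotation G to rows (row-1, row) of R.
            for k in range(n):
                u, v = r[row - 1][k], r[row][k]
                r[row - 1][k] = c * u + s * v
                r[row][k] = -s * u + c * v
            # Apply the transpose of G to columns (row-1, row) of Q, so that a = q * r holds.
            for k in range(m):
                u, v = q[k][row - 1], q[k][row]
                q[k][row - 1] = c * u + s * v
                q[k][row] = -s * u + c * v
    return q, r
```

Each inner rotation touches only two rows, which is the property that systolic realizations typically exploit; the sequential element-by-element order shown here is what a column-wise scheme would aim to shorten.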

Optimized Policies for the Retransmission Probabilities in Slotted Aloha

Relevance: 10.00%

Abstract:

In this paper, we study the behaviour of the slotted Aloha multiple access scheme with a finite number of users under different traffic loads and optimize the retransmission probability q_r for various settings, cost objectives and policies. First, we formulate the problem as a parameter optimization problem and use certain efficient smoothed functional algorithms for finding the optimal retransmission probability parameter. Next, we propose two classes of multi-level closed-loop feedback policies (for finding in each case the retransmission probability q_r, which now depends on the current system state) and apply the above algorithms for finding an optimal policy within each class of policies. While one of the policy classes depends on the number of backlogged nodes in the system, the other depends on the number of time slots since the last successful transmission. The latter policies are more realistic, as it is difficult to keep track of the number of backlogged nodes at each instant. We investigate the effect of increasing the number of levels in the feedback policies. We also investigate the effects of using different cost functions (with and without penalization) in our algorithms and the corresponding change in throughput and delay. Both of our algorithms use two-timescale stochastic approximation. One of the algorithms uses one simulation while the other uses two simulations of the system. The two-simulation algorithm is seen to perform better than the other algorithm. Optimal multi-level closed-loop policies are seen to perform better than optimal open-loop policies. The performance further improves when more levels are used in the feedback policies.
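
To see why the best retransmission probability depends on the system state, consider a deliberately simplified model: n backlogged nodes each retransmit independently with probability q in a slot, and new arrivals are ignored. A slot succeeds when exactly one node transmits. The sketch below evaluates that success probability and finds the best fixed q by grid search. It is only an illustration under these assumptions, not the smoothed functional algorithm used in the paper, and the function names are hypothetical.

```python
def success_probability(n, q):
    """P(exactly one of n backlogged nodes transmits), ignoring new arrivals."""
    return n * q * (1 - q) ** (n - 1)

def best_fixed_q(n, grid=10_000):
    """Grid search for the q in (0, 1] that maximizes the success probability."""
    candidates = (k / grid for k in range(1, grid + 1))
    return max(candidates, key=lambda q: success_probability(n, q))

if __name__ == "__main__":
    for n in (2, 5, 10, 50):
        q = best_fixed_q(n)
        print(n, q, success_probability(n, q))
```

In this simplified setting the maximizer is q = 1/n, so a policy that knows (or estimates) the backlog can adapt q accordingly, which is consistent with the abstract's observation that closed-loop policies outperform open-loop ones.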

Parallel computing concepts and methods for Floquet analysis of helicopter trim and stability

Relevance: 10.00%

Abstract:

Floquet analysis is widely used for small-order systems (say, order M < 100) to find trim results of control inputs and periodic responses, and stability results of damping levels and frequencies. Presently, however, it is practical neither for design applications nor for comprehensive analysis models that lead to large systems (M > 100); the run time on a sequential computer is simply prohibitive. Accordingly, a massively parallel Floquet analysis is developed with emphasis on large systems, and it is implemented on two SIMD (single-instruction, multiple-data) computers with 4096 and 8192 processors. The focus of this development is a parallel shooting method with damped Newton iteration to generate trim results; the Floquet transition matrix (FTM) comes out as a byproduct. The eigenvalues and eigenvectors of the FTM are computed by a parallel QR method, and thereby stability results are generated. For illustration, flap and flap-lag stability of isolated rotors are treated by the parallel analysis and by a corresponding sequential analysis with the conventional shooting and QR methods; linear quasisteady airfoil aerodynamics and a finite-state three-dimensional wake model are used. Computational reliability is quantified by the condition numbers of the Jacobian matrices in Newton iteration, the condition numbers of the eigenvalues and the residual errors of the eigenpairs, and reliability figures are comparable in both the parallel and sequential analyses. Compared to the sequential analysis, the parallel analysis reduces the run time of large systems dramatically, and the reduction increases with increasing system order; this finding offers considerable promise for design and comprehensive-analysis applications.
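
The stability step described above (eigenvalues of the FTM) can be illustrated on a toy problem. The sketch below assumes a linear 2-state system x' = A(t) x with period T, integrates the transition matrix over one period with a fixed-step RK4 scheme, and reads the Floquet multipliers from the 2x2 characteristic polynomial. It does not reproduce the paper's shooting method, Newton iteration, or parallel QR eigen-solver; all names are hypothetical.

```python
import cmath
import math

def transition_matrix(a_of_t, period, steps=2000):
    """Integrate Phi' = A(t) Phi, Phi(0) = I, over one period (2x2 systems, RK4)."""
    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    def add_scaled(x, s, y):  # returns x + s * y as a new matrix
        return [[x[i][j] + s * y[i][j] for j in range(2)] for i in range(2)]

    phi = [[1.0, 0.0], [0.0, 1.0]]
    h = period / steps
    for n in range(steps):
        t = n * h
        k1 = matmul(a_of_t(t), phi)
        k2 = matmul(a_of_t(t + h / 2), add_scaled(phi, h / 2, k1))
        k3 = matmul(a_of_t(t + h / 2), add_scaled(phi, h / 2, k2))
        k4 = matmul(a_of_t(t + h), add_scaled(phi, h, k3))
        for i in range(2):
            for j in range(2):
                phi[i][j] += h / 6 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
    return phi

def floquet_multipliers(phi):
    """Eigenvalues of a 2x2 FTM from its trace and determinant."""
    tr = phi[0][0] + phi[1][1]
    det = phi[0][0] * phi[1][1] - phi[0][1] * phi[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def is_stable(phi):
    """Asymptotically stable when every multiplier lies strictly inside the unit circle."""
    return max(abs(mu) for mu in floquet_multipliers(phi)) < 1.0
```

For example, passing `lambda t: [[0.0, 1.0], [-(1.0 + 0.2 * math.cos(t)), -0.2]]` with `period=2 * math.pi` models a damped oscillator with periodically varying stiffness. For large systems the FTM is a dense M x M matrix, which is where the QR eigenvalue method mentioned in the abstract would take over from the closed-form 2x2 formula used here.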

Solid Oxide-Ion Electrolytes

Relevance: 10.00%

Abstract:

Solid oxide-ion electrolytes find application in oxygen sensors, oxygen pumps and in high-temperature electrolyser-fuel-cell hybrid systems. All the solid electrolytes known so far, however, exhibit low oxide-ion conductivities below 973 K. Therefore, there is a need for fast oxide-ion conductors operative at temperatures around 673 K. Recently, efforts have been directed towards developing such materials. This article summarizes various types of oxide-ion electrolytes reported in the literature and outlines a strategy for the identification/synthesis of improved materials.

Mechanisms of summer intraseasonal sea surface temperature oscillations in the Bay of Bengal

Relevance: 10.00%

Abstract:

Intraseasonal variation (ISV) of sea surface temperature (SST) in the Bay of Bengal (BoB) is highest in its northwestern part. An Indian Ocean model forced by QuikSCAT winds and climatological river discharge (QR run) reproduces the ISV of SST, albeit with weaker magnitude. Air-sea fluxes, in the presence of a shallow mixed layer, efficiently effect intraseasonal SST fluctuations. Warming during intraseasonal events is smaller (<1°C) for the June-July period and larger (1.5° to 2°C) during September, the latter due to a thinner mixed layer. To examine the effect of salinity on ISV, the model was run by artificially increasing the salinity (NORR run) and by decreasing it (MAHA10 run). In NORR, both rainfall and river discharge were switched off, and in MAHA10 the discharge of the river Mahanadi was increased tenfold. The spatial pattern of ISV as well as its periodicity was similar in QR, NORR and MAHA10. The ISV was stronger in NORR and weaker in MAHA10, compared to QR. In NORR, both intraseasonal warming and cooling were higher than in QR, the former due to reduced air-sea heat loss as the mean SST was lower, and the latter due to enhanced subsurface processes resulting from weaker stratification. In MAHA10, both warming and cooling were lower than in QR, the former due to higher air-sea heat loss owing to higher mean SST, and the latter due to weak subsurface processes resulting from stronger stratification. These model experiments suggest that salinity effects are crucial in determining amplitudes of intraseasonal SST variations in the BoB.

A LSQR-type method provides a computationally efficient automated optimal choice of regularization parameter in diffuse optical tomography

Relevance: 10.00%

Abstract:

Purpose: Developing a computationally efficient automated method for the optimal choice of regularization parameter in diffuse optical tomography. Methods: The least-squares QR (LSQR)-type method that uses Lanczos bidiagonalization is known to be computationally efficient in performing the reconstruction procedure in diffuse optical tomography. The same is effectively deployed via an optimization procedure that uses the simplex method to find the optimal regularization parameter. The proposed LSQR-type method is compared with traditional methods such as the L-curve, generalized cross-validation (GCV), and the recently proposed minimal residual method (MRM)-based choice of regularization parameter, using numerical and experimental phantom data. Results: The results indicate that the performance of the proposed LSQR-type and MRM-based methods in terms of reconstructed image quality is similar, and superior to that of the L-curve and GCV-based methods. The computational complexity of the proposed method is at least five times lower than that of the MRM-based method, making it an optimal technique. Conclusions: The LSQR-type method was able to overcome the inherent limitation of the computationally expensive MRM-based automated way of finding the optimal regularization parameter in diffuse optical tomographic imaging, making this method more suitable for real-time deployment. © 2013 American Association of Physicists in Medicine. [http://dx.doi.org/10.1118/1.4792459]
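
The reason Lanczos bidiagonalization makes the parameter search cheap can be written out in a few lines. The notation below (A for the system matrix, b for the data, λ for the regularization parameter, k for the number of bidiagonalization steps) is assumed for this sketch and is not taken from the paper; the relations themselves are the standard ones underlying LSQR.

```latex
% Tikhonov problem and its closed-form solution
x_\lambda = \arg\min_x \; \|Ax - b\|_2^2 + \lambda \|x\|_2^2
          = (A^{T}A + \lambda I)^{-1} A^{T} b .

% After k Lanczos bidiagonalization steps, with \beta_1 = \|b\|_2 and b = \beta_1 u_1,
% U_{k+1} and V_k have orthonormal columns and B_k is (k+1) x k lower bidiagonal:
A V_k = U_{k+1} B_k , \qquad x = V_k y .

% Restricting x to span(V_k) therefore gives a small (k+1) x k problem:
y_\lambda = \arg\min_y \; \|B_k y - \beta_1 e_1\|_2^2 + \lambda \|y\|_2^2 .
```

Because B_k is small when k is much less than the number of unknowns, each trial value of λ examined by an outer search (such as the simplex method named in the abstract) only requires solving the reduced problem rather than the full one.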

Construction of Block Orthogonal STBCs and Reducing Their Sphere Decoding Complexity

Relevance: 10.00%

Abstract:

Construction of high rate Space Time Block Codes (STBCs) with low decoding complexity has been studied widely using techniques such as sphere decoding and non Maximum-Likelihood (ML) decoders such as the QR decomposition decoder with M paths (QRDM decoder). Recently, Ren et al. presented a new class of STBCs known as the block orthogonal STBCs (BOSTBCs), which could be exploited by the QRDM decoders to achieve significant decoding complexity reduction without performance loss. The block orthogonal property of the codes constructed was, however, only shown via simulations. In this paper, we give analytical proofs for the block orthogonal structure of various existing codes in the literature, including the codes constructed in the paper by Ren et al. We show that codes formed as the sum of Clifford Unitary Weight Designs (CUWDs) or Coordinate Interleaved Orthogonal Designs (CIODs) exhibit block orthogonal structure. We also provide a new construction of block orthogonal codes from Cyclic Division Algebras (CDAs) and Crossed-Product Algebras (CPAs). In addition, we show how the block orthogonal property of the STBCs can be exploited to reduce the decoding complexity of a sphere decoder using a depth first search approach. Simulation results of the decoding complexity show a 30% reduction in the number of floating point operations (FLOPS) of BOSTBCs as compared to STBCs without the block orthogonal structure.

Basis pursuit deconvolution for improving model-based reconstructed images in photoacoustic tomography

Relevance: 10.00%

Abstract:

The model-based image reconstruction approaches in photoacoustic tomography have a distinct advantage over traditional analytical methods for cases where limited data is available. These methods typically deploy a Tikhonov-based regularization scheme to reconstruct the initial pressure from the boundary acoustic data. The model-resolution for these cases represents the blur induced by the regularization scheme. A method that utilizes this blurring model and performs basis pursuit deconvolution to improve the quantitative accuracy of the reconstructed photoacoustic image is proposed and shown to be superior to other traditional methods via three numerical experiments. Moreover, this deconvolution, including the building of an approximate blur matrix, is achieved via Lanczos bidiagonalization (least-squares QR), making this approach attractive for real-time use. © 2014 Optical Society of America

Hardware Accelerator for 3D Method of Moments based Parasitic Extraction

Relevance: 10.00%

Abstract:

A Field Programmable Gate Array (FPGA) based hardware accelerator for multi-conductor parasitic capacitance extraction, using the Method of Moments (MoM), is presented in this paper. Due to the prohibitive cost of solving a dense algebraic system formed by MoM, linear complexity fast solver algorithms have been developed in the past to expedite the matrix-vector product computation in a Krylov subspace based iterative solver framework. However, as the number of conductors in a system increases, leading to a corresponding increase in the number of right-hand-side (RHS) vectors, the computational cost of multiple matrix-vector products presents a time bottleneck, especially for ill-conditioned system matrices. In this work, an FPGA based hardware implementation is proposed to parallelize the iterative matrix solution for multiple RHS vectors in a low-rank compression based fast solver scheme. The method is applied to accelerate electrostatic parasitic capacitance extraction of multiple conductors in a Ball Grid Array (BGA) package. Speed-ups of up to 13x over an equivalent software implementation on an Intel Core i5 processor for dense matrix-vector products, and 12x for QR compressed matrix-vector products, are achieved using a Virtex-6 XC6VLX240T FPGA on Xilinx's ML605 board.

On Fast Bilateral Filtering Using Fourier Kernels

Relevance: 10.00%

Abstract:

It was demonstrated in earlier work that, by approximating its range kernel using shiftable functions, the nonlinear bilateral filter can be computed using a series of fast convolutions. Previous approaches based on shiftable approximation have, however, been restricted to Gaussian range kernels. In this work, we propose a novel approximation that can be applied to any range kernel, provided it has a pointwise-convergent Fourier series. More specifically, we propose to approximate the Gaussian range kernel of the bilateral filter using a Fourier basis, where the coefficients of the basis are obtained by solving a series of least-squares problems. The coefficients can be efficiently computed using a recursive form of the QR decomposition. By controlling the cardinality of the Fourier basis, we can obtain a good tradeoff between the run-time and the filtering accuracy. In particular, we are able to guarantee subpixel accuracy for the overall filtering, which is not provided by most existing methods for fast bilateral filtering. We present simulation results to demonstrate the speed and accuracy of the proposed algorithm.
