226 resultados para VGS algorithm
Resumo:
A simple and efficient algorithm for the bandwidth reduction of sparse symmetric matrices is proposed. It involves column-row permutations and is well-suited to map onto the linear array topology of the SIMD architectures. The efficiency of the algorithm is compared with the other existing algorithms. The interconnectivity and the memory requirement of the linear array are discussed and the complexity of its layout area is derived. The parallel version of the algorithm mapped onto the linear array is then introduced and is explained with the help of an example. The optimality of the parallel algorithm is proved by deriving the time complexities of the algorithm on a single processor and the linear array.
Resumo:
For the specific case of binary stars, this paper presents signal-to-noise ratio (SNR) calculations for the detection of the parity (the side of the brighter component) of the binary using the double correlation method. This double correlation method is a focal plane version of the well-known Knox-Thompson method used in speckle interferometry. It is shown that SNR for parity detection using double correlation depends linearly on binary separation. This new result was entirely missed by previous analytical calculations dealing with a point source. It is concluded that, for magnitudes relevant to the present day speckle interferometry and for binary separations close to the diffraction limit, speckle masking has better SNR for parity detection.
Resumo:
The K-means algorithm for clustering is very much dependent on the initial seed values. We use a genetic algorithm to find a near-optimal partitioning of the given data set by selecting proper initial seed values in the K-means algorithm. Results obtained are very encouraging and in most of the cases, on data sets having well separated clusters, the proposed scheme reached a global minimum.
Resumo:
In this paper we develop a multithreaded VLSI processor linear array architecture to render complex environments based on the radiosity approach. The processing elements are identical and multithreaded. They work in Single Program Multiple Data (SPMD) mode. A new algorithm to do the radiosity computations based on the progressive refinement approach[2] is proposed. Simulation results indicate that the architecture is latency tolerant and scalable. It is shown that a linear array of 128 uni-threaded processing elements sustains a throughput close to 0.4 million patches/sec.
Resumo:
Presented here, in a vector formulation, is an O(mn2) direct concise algorithm that prunes/identifies the linearly dependent (ld) rows of an arbitrary m X n matrix A and computes its reflexive type minimum norm inverse A(mr)-, which will be the true inverse A-1 if A is nonsingular and the Moore-Penrose inverse A+ if A is full row-rank. The algorithm, without any additional computation, produces the projection operator P = (I - A(mr)- A) that provides a means to compute any of the solutions of the consistent linear equation Ax = b since the general solution may be expressed as x = A(mr)+b + Pz, where z is an arbitrary vector. The rank r of A will also be produced in the process. Some of the salient features of this algorithm are that (i) the algorithm is concise, (ii) the minimum norm least squares solution for consistent/inconsistent equations is readily computable when A is full row-rank (else, a minimum norm solution for consistent equations is obtainable), (iii) the algorithm identifies ld rows, if any, and reduces concerned computation and improves accuracy of the result, (iv) error-bounds for the inverse as well as the solution x for Ax = b are readily computable, (v) error-free computation of the inverse, solution vector, rank, and projection operator and its inherent parallel implementation are straightforward, (vi) it is suitable for vector (pipeline) machines, and (vii) the inverse produced by the algorithm can be used to solve under-/overdetermined linear systems.
Resumo:
We develop in this article the first actor-critic reinforcement learning algorithm with function approximation for a problem of control under multiple inequality constraints. We consider the infinite horizon discounted cost framework in which both the objective and the constraint functions are suitable expected policy-dependent discounted sums of certain sample path functions. We apply the Lagrange multiplier method to handle the inequality constraints. Our algorithm makes use of multi-timescale stochastic approximation and incorporates a temporal difference (TD) critic and an actor that makes a gradient search in the space of policy parameters using efficient simultaneous perturbation stochastic approximation (SPSA) gradient estimates. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal policy. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
The source localization algorithms in the earlier works, mostly used non-planar arrays. If we consider scenarios like human-computer communication, or human-television communication where the microphones need to be placed on the computer monitor or television front panel, i.e we need to use the planar arrays. The algorithm proposed in 1], is a Linear Closed Form source localization algorithm (LCF algorithm) which is based on Time Difference of Arrivals (TDOAs) that are obtained from the data collected using the microphones. It assumes non-planar arrays. The LCF algorithm is applied to planar arrays in the current work. The relationship between the error in the source location estimate and the perturbation in the TDOAs is derived using first order perturbation analysis and validated using simulations. If the TDOAs are erroneous, both the coefficient matrix and the data matrix used for obtaining source location will be perturbed. So, the Total least squares solution for source localization is proposed in the current work. The sensitivity analysis of the source localization algorithm for planar arrays and non-planar arrays is done by introducing perturbation in the TDOAs and the microphone locations. It is shown that the error in the source location estimate is less when we use planar array instead of the particular non-planar array considered for same perturbation in the TDOAs or microphone location. The location of the reference microphone is proved to be important for getting an accurate source location estimate if we are using the LCF algorithm.
Resumo:
The aim of this paper is to develop a computationally efficient decentralized rendezvous algorithm for a group of autonomous agents. The algorithm generalizes the notion of sensor domain and decision domain of agents to enable implementation of simple computational algorithms. Specifically, the algorithm proposed in this paper uses a rectilinear decision domain (RDD) as against the circular decision domain assumed in earlier work. Because of this, the computational complexity of the algorithm reduces considerably and, when compared to the standard Ando's algorithm available in the literature, the RDD algorithm shows very significant improvement in convergence time performance. Analytical results to prove convergence and supporting simulation results are presented in the paper.
Resumo:
In this paper, we are concerned with low-complexity detection in large multiple-input multiple-output (MIMO) systems with tens of transmit/receive antennas. Our new contributions in this paper are two-fold. First, we propose a low-complexity algorithm for large-MIMO detection based on a layered low-complexity local neighborhood search. Second, we obtain a lower bound on the maximum-likelihood (ML) bit error performance using the local neighborhood search. The advantages of the proposed ML lower bound are i) it is easily obtained for MIMO systems with large number of antennas because of the inherent low complexity of the search algorithm, ii) it is tight at moderate-to-high SNRs, and iii) it can be tightened at low SNRs by increasing the number of symbols in the neighborhood definition. Interestingly, the proposed detection algorithm based on the layered local search achieves bit error performances which are quite close to this lower bound for large number of antennas and higher-order QAM. For e. g., in a 32 x 32 V-BLAST MIMO system, the proposed detection algorithm performs close to within 1.7 dB of the proposed ML lower bound at 10(-3) BER for 16-QAM (128 bps/Hz), and close to within 4.5 dB of the bound for 64-QAM (192 bps/Hz).
Resumo:
In this paper a pipelined ring algorithm is presented for efficient computation of one and two dimensional Fast Fourier Transform (FFT) on a message passing multiprocessor. The algorithm has been implemented on a transputer based system and experiments reveal that the algorithm is very efficient. A model for analysing the performance of the algorithm is developed from its computation-communication characteristics. Expressions for execution time, speedup and efficiency are obtained and these expressions are validated with experimental results obtained on a four transputer system. The analytical model is then used to estimate the performance of the algorithm for different number of processors, and for different sizes of the input data.
Resumo:
Presented here is a stable algorithm that uses Zohar's formulation of Trench's algorithm and computes the inverse of a symmetric Toeplitz matrix including those with vanishing or nearvanishing leading minors. The algorithm is based on a diagonal modification of the matrix, and exploits symmetry and persymmetry properties of the inverse matrix.
Resumo:
This paper presents a fast algorithm for data exchange in a network of processors organized as a reconfigurable tree structure. For a given data exchange table, the algorithm generates a sequence of tree configurations in which the data exchanges are to be executed. A significant feature of the algorithm is that each exchange is executed in a tree configuration in which the source and destination nodes are adjacent to each other. It has been proved in a theorem that for every pair of nodes in the reconfigurable tree structure, there always exists two and only two configurations in which these two nodes are adjacent to each other. The algorithm utilizes this fact and determines the solution so as to optimize both the number of configurations required and the time to perform the data exchanges. Analysis of the algorithm shows that it has linear time complexity, and provides a large reduction in run-time as compared to a previously proposed algorithm. This is well-confirmed from the experimental results obtained by executing a large number of randomly-generated data exchange tables. Another significant feature of the algorithm is that the bit-size of the routing information code is always two bits, irrespective of the number of nodes in the tree. This not only increases the speed of the algorithm but also results in simpler hardware inside each node.
Resumo:
A parallel matrix multiplication algorithm is presented, and studies of its performance and estimation are discussed. The algorithm is implemented on a network of transputers connected in a ring topology. An efficient scheme for partitioning the input matrices is introduced which enables overlapping computation with communication. This makes the algorithm achieve near-ideal speed-up for reasonably large matrices. Analytical expressions for the execution time of the algorithm have been derived by analysing its computation and communication characteristics. These expressions are validated by comparing the theoretical results of the performance with the experimental values obtained on a four-transputer network for both square and irregular matrices. The analytical model is also used to estimate the performance of the algorithm for a varying number of transputers and varying problem sizes. Although the algorithm is implemented on transputers, the methodology and the partitioning scheme presented in this paper are quite general and can be implemented on other processors which have the capability of overlapping computation with communication. The equations for performance prediction can also be extended to other multiprocessor systems.
Resumo:
Although the recently proposed single-implicit-equation-based input voltage equations (IVEs) for the independent double-gate (IDG) MOSFET promise faster computation time than the earlier proposed coupled-equations-based IVEs, it is not clear how those equations could be solved inside a circuit simulator as the conventional Newton-Raphson (NR)-based root finding method will not always converge due to the presence of discontinuity at the G-zero point (GZP) and nonremovable singularities in the trigonometric IVE. In this paper, we propose a unique algorithm to solve those IVEs, which combines the Ridders algorithm with the NR-based technique in order to provide assured convergence for any bias conditions. Studying the IDG MOSFET operation carefully, we apply an optimized initial guess to the NR component and a minimized solution space to the Ridders component in order to achieve rapid convergence, which is very important for circuit simulation. To reduce the computation budget further, we propose a new closed-form solution of the IVEs in the near vicinity of the GZP. The proposed algorithm is tested with different device parameters in the extended range of bias conditions and successfully implemented in a commercial circuit simulator through its Verilog-A interface.
Resumo:
The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two time Scale stochastic approximation. Convergence analysis, approximation issues and an example are studied.