925 results for Algorithm clustering
Abstract:
With the advent of VLSI, it has become possible to map parallel algorithms for compute-bound problems directly onto silicon. Systolic architecture is a very good candidate for VLSI implementation because of its regular, simple design and regular communication pattern. In this paper, a systolic algorithm and a corresponding systolic architecture, a linear systolic array, are proposed for the scanline-based hidden surface removal problem in three-dimensional computer graphics. The algorithm is based on the concept of sample spans or intervals. The worst-case time taken by the algorithm is O(n), n being the number of segments in a scanline. The time taken for a given scene depends on the scene itself, and on average a considerable improvement over the worst-case behaviour is expected. A pipeline scheme for handling the I/O process, suitable for VLSI implementation of the algorithm, is also proposed.
Abstract:
We present a low-complexity algorithm based on reactive tabu search (RTS) for near maximum likelihood (ML) detection in large-MIMO systems. The conventional RTS algorithm achieves near-ML performance for 4-QAM in large-MIMO systems, but its performance for higher-order QAM is far from ML performance. Here, we propose a random-restart RTS (R3TS) algorithm which achieves significantly better bit error rate (BER) performance than the conventional RTS algorithm in higher-order QAM. The key idea is to run multiple tabu searches, each starting from a random initial vector, and to choose the best among the resulting solution vectors. A criterion to limit the number of searches is also proposed. Computer simulations show that the R3TS algorithm achieves almost ML performance in a 16 x 16 V-BLAST MIMO system with 16-QAM and 64-QAM at significantly lower complexity than the sphere decoder. Also, in a 32 x 32 V-BLAST MIMO system, R3TS performs within 1.6 dB of the ML lower bound for 16-QAM (128 bps/Hz), and within 2.4 dB for 64-QAM (192 bps/Hz), at $10^{-3}$ BER.
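The random-restart idea in this abstract can be illustrated with a minimal sketch. It is not the authors' R3TS: a plain greedy single-symbol-flip descent stands in for each reactive tabu search, the symbols are assumed to be BPSK (±1) rather than QAM, and a fixed number of restarts replaces the stopping criterion the abstract mentions. All function names are hypothetical.

```python
import random

def ml_cost(H, y, x):
    """ML metric: squared Euclidean distance ||y - Hx||^2."""
    return sum((yi - sum(h * xj for h, xj in zip(row, x))) ** 2
               for row, yi in zip(H, y))

def local_search(H, y, x):
    """Greedy single-symbol-flip descent (assumed stand-in for one tabu search)."""
    x = list(x)
    best = ml_cost(H, y, x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            x[i] = -x[i]              # tentatively flip symbol i
            cost = ml_cost(H, y, x)
            if cost < best:
                best, improved = cost, True
            else:
                x[i] = -x[i]          # undo a flip that did not help
    return x, best

def random_restart_detect(H, y, restarts=10, seed=0):
    """Run several searches from random start vectors and keep the best result."""
    rng = random.Random(seed)
    n = len(H[0])
    best_x, best_cost = None, float("inf")
    for _ in range(restarts):
        start = [rng.choice((-1, 1)) for _ in range(n)]
        x, cost = local_search(H, y, start)
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x, best_cost
```

For a noiseless toy channel such as `H` equal to the 4 x 4 identity and `y = [1, -1, 1, 1]`, every restart converges to the transmitted vector; restarts matter only when the channel couples symbols and a single descent can stall in a local minimum.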
Abstract:
A new fast and efficient marching algorithm is introduced to solve the basic quasilinear hyperbolic partial differential equations describing unsteady flow in conduits by the method of characteristics. The details of the marching method are presented with an illustration of the waterhammer problem in a simple piping system, for both the frictional and frictionless cases. It is shown that, for the same accuracy, the new marching method requires fewer computational steps and less computer memory and time.
Abstract:
A simple and efficient algorithm for the bandwidth reduction of sparse symmetric matrices is proposed. It involves column-row permutations and is well suited to mapping onto the linear array topology of SIMD architectures. The efficiency of the algorithm is compared with that of other existing algorithms. The interconnectivity and the memory requirement of the linear array are discussed, and the complexity of its layout area is derived. The parallel version of the algorithm mapped onto the linear array is then introduced and explained with the help of an example. The optimality of the parallel algorithm is proved by deriving the time complexities of the algorithm on a single processor and on the linear array.
Abstract:
For the specific case of binary stars, this paper presents signal-to-noise ratio (SNR) calculations for the detection of the parity (the side of the brighter component) of the binary using the double correlation method. This double correlation method is a focal plane version of the well-known Knox-Thompson method used in speckle interferometry. It is shown that the SNR for parity detection using double correlation depends linearly on binary separation. This new result was entirely missed by previous analytical calculations dealing with a point source. It is concluded that, for magnitudes relevant to present-day speckle interferometry and for binary separations close to the diffraction limit, speckle masking has better SNR for parity detection.
Abstract:
In this paper we develop a multithreaded VLSI processor linear array architecture to render complex environments based on the radiosity approach. The processing elements are identical and multithreaded, and they work in Single Program Multiple Data (SPMD) mode. A new algorithm for the radiosity computations, based on the progressive refinement approach [2], is proposed. Simulation results indicate that the architecture is latency tolerant and scalable. It is shown that a linear array of 128 uni-threaded processing elements sustains a throughput close to 0.4 million patches/sec.
Abstract:
Presented here, in a vector formulation, is an $O(mn^2)$ direct concise algorithm that prunes/identifies the linearly dependent (ld) rows of an arbitrary $m \times n$ matrix $A$ and computes its reflexive type minimum norm inverse $A_{mr}^{-}$, which will be the true inverse $A^{-1}$ if $A$ is nonsingular and the Moore-Penrose inverse $A^{+}$ if $A$ is full row-rank. The algorithm, without any additional computation, produces the projection operator $P = (I - A_{mr}^{-} A)$, which provides a means to compute any of the solutions of the consistent linear equation $Ax = b$, since the general solution may be expressed as $x = A_{mr}^{-} b + Pz$, where $z$ is an arbitrary vector. The rank $r$ of $A$ is also produced in the process. Some of the salient features of this algorithm are that (i) the algorithm is concise, (ii) the minimum norm least squares solution for consistent/inconsistent equations is readily computable when $A$ is full row-rank (else, a minimum norm solution for consistent equations is obtainable), (iii) the algorithm identifies ld rows, if any, which reduces the computation concerned and improves the accuracy of the result, (iv) error bounds for the inverse as well as for the solution $x$ of $Ax = b$ are readily computable, (v) error-free computation of the inverse, solution vector, rank, and projection operator, and its inherent parallel implementation, are straightforward, (vi) it is suitable for vector (pipeline) machines, and (vii) the inverse produced by the algorithm can be used to solve under-/overdetermined linear systems.
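The claim that the general solution has the form "inverse times b, plus P times an arbitrary z" can be checked in a few lines. The sketch below writes the reflexive minimum norm inverse as $A^{-}$ and relies only on the standard generalized-inverse property $A A^{-} A = A$ together with consistency of the system (some $x_0$ with $A x_0 = b$); the symbol $x_0$ is introduced here for illustration.

```latex
\begin{align*}
  P   &= I - A^{-}A, \qquad x = A^{-}b + Pz, \\
  A x &= A A^{-} b + \left(A - A A^{-} A\right) z \\
      &= A A^{-} A x_0 + 0 \cdot z
         && \text{(using } b = A x_0 \text{ and } A A^{-} A = A\text{)} \\
      &= A x_0 = b .
\end{align*}
```

So every choice of $z$ yields a solution, and $z = 0$ gives the particular solution $A^{-}b$.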
Abstract:
We develop in this article the first actor-critic reinforcement learning algorithm with function approximation for a problem of control under multiple inequality constraints. We consider the infinite horizon discounted cost framework in which both the objective and the constraint functions are suitable expected policy-dependent discounted sums of certain sample path functions. We apply the Lagrange multiplier method to handle the inequality constraints. Our algorithm makes use of multi-timescale stochastic approximation and incorporates a temporal difference (TD) critic and an actor that makes a gradient search in the space of policy parameters using efficient simultaneous perturbation stochastic approximation (SPSA) gradient estimates. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal policy. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
Earlier work on source localization mostly used non-planar arrays. In scenarios such as human-computer or human-television communication, however, the microphones need to be placed on the computer monitor or television front panel, i.e., planar arrays must be used. The algorithm proposed in [1] is a Linear Closed Form source localization algorithm (LCF algorithm) based on the Time Differences of Arrival (TDOAs) obtained from the data collected by the microphones. It assumes non-planar arrays. In the current work, the LCF algorithm is applied to planar arrays. The relationship between the error in the source location estimate and the perturbation in the TDOAs is derived using first-order perturbation analysis and validated using simulations. If the TDOAs are erroneous, both the coefficient matrix and the data matrix used for obtaining the source location are perturbed, so a total least squares solution for source localization is proposed. The sensitivity of the source localization algorithm for planar and non-planar arrays is analysed by introducing perturbations in the TDOAs and the microphone locations. It is shown that, for the same perturbation in the TDOAs or microphone locations, the error in the source location estimate is smaller with a planar array than with the particular non-planar array considered. The location of the reference microphone is shown to be important for obtaining an accurate source location estimate with the LCF algorithm.
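To make the measured quantity concrete, the following sketch computes ideal (noise-free) TDOAs for a planar array relative to a reference microphone. It illustrates only the definition of a TDOA, not the LCF algorithm itself; the speed of sound, the microphone layout and the function name `tdoas` are assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)

def tdoas(source, mics, ref=0, c=SPEED_OF_SOUND):
    """Ideal TDOAs (seconds) of each non-reference microphone relative to mics[ref]."""
    d_ref = math.dist(source, mics[ref])
    return [(math.dist(source, m) - d_ref) / c
            for i, m in enumerate(mics) if i != ref]

# Hypothetical planar array: four microphones in the z = 0 plane (metres).
planar_mics = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]

# A source directly above the centre is equidistant from all four microphones,
# so every TDOA is zero; moving the source off-centre makes them non-zero.
centred = tdoas((0.5, 0.5, 1.0), planar_mics)
off_centre = tdoas((0.0, 0.0, 1.0), planar_mics)
```

Because every TDOA is measured against the reference microphone, moving that microphone changes all of the measurements at once, which is consistent with the abstract's remark that its location matters.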
Abstract:
The aim of this paper is to develop a computationally efficient decentralized rendezvous algorithm for a group of autonomous agents. The algorithm generalizes the notions of the sensor domain and decision domain of agents to enable the implementation of simple computational algorithms. Specifically, the algorithm proposed in this paper uses a rectilinear decision domain (RDD) rather than the circular decision domain assumed in earlier work. As a result, the computational complexity of the algorithm is reduced considerably and, compared with the standard Ando's algorithm available in the literature, the RDD algorithm shows a very significant improvement in convergence time. Analytical results proving convergence and supporting simulation results are presented in the paper.
Abstract:
In this paper, we are concerned with low-complexity detection in large multiple-input multiple-output (MIMO) systems with tens of transmit/receive antennas. Our new contributions in this paper are two-fold. First, we propose a low-complexity algorithm for large-MIMO detection based on a layered low-complexity local neighborhood search. Second, we obtain a lower bound on the maximum-likelihood (ML) bit error performance using the local neighborhood search. The advantages of the proposed ML lower bound are that i) it is easily obtained for MIMO systems with a large number of antennas because of the inherent low complexity of the search algorithm, ii) it is tight at moderate-to-high SNRs, and iii) it can be tightened at low SNRs by increasing the number of symbols in the neighborhood definition. Interestingly, the proposed detection algorithm based on the layered local search achieves bit error performance quite close to this lower bound for a large number of antennas and higher-order QAM. For example, in a 32 x 32 V-BLAST MIMO system, the proposed detection algorithm performs within 1.7 dB of the proposed ML lower bound at $10^{-3}$ BER for 16-QAM (128 bps/Hz), and within 4.5 dB of the bound for 64-QAM (192 bps/Hz).
Abstract:
In this paper a pipelined ring algorithm is presented for efficient computation of one- and two-dimensional Fast Fourier Transforms (FFTs) on a message-passing multiprocessor. The algorithm has been implemented on a transputer-based system, and experiments reveal that it is very efficient. A model for analysing the performance of the algorithm is developed from its computation-communication characteristics. Expressions for execution time, speedup and efficiency are obtained, and these expressions are validated with experimental results obtained on a four-transputer system. The analytical model is then used to estimate the performance of the algorithm for different numbers of processors and for different sizes of the input data.
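The speedup and efficiency figures mentioned in this abstract follow the usual definitions, which can be sketched as below. The execution-time model `parallel_time` (computation divided evenly across processors plus a fixed communication cost) is a deliberately crude assumption for illustration and is not the paper's analytical model; the timings are made up.

```python
def speedup(t_serial, t_parallel):
    """Speedup: single-processor time divided by multiprocessor time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, processors):
    """Efficiency: speedup per processor (1.0 would be ideal)."""
    return speedup(t_serial, t_parallel) / processors

def parallel_time(t_compute, t_comm, processors):
    """Assumed toy model: evenly divided computation plus a fixed communication cost."""
    return t_compute / processors + t_comm

# Hypothetical numbers: 8 s of computation, 0.5 s of communication, 4 processors.
t_p = parallel_time(8.0, 0.5, 4)   # 2.5 s
s = speedup(8.0, t_p)              # about 3.2
e = efficiency(8.0, t_p, 4)        # about 0.8
```

Even in this toy model the communication term keeps efficiency below 1, which is why a performance model has to account for both computation and communication.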
Abstract:
Here we rederive the hierarchy of equations for the evolution of distribution functions of various orders using a convenient parameterization. We use this to obtain equations for two- and three-point correlation functions in powers of a small parameter, viz., the initial density contrast. The correspondence of the lowest-order solutions of these equations to the results from the linear theory of density perturbations is shown for an $\Omega = 1$ universe. These equations are then used to calculate, to the lowest order, the induced three-point correlation function that arises from Gaussian initial conditions in an $\Omega = 1$ universe. We obtain an expression which explicitly exhibits the spatial structure of the induced three-point correlation function. It is seen that the spatial structure of this quantity is independent of the value of $\Omega$. We also calculate the triplet momentum. We find that the induced three-point correlation function does not have the "hierarchical" form often assumed. We discuss possibilities of using the induced three-point correlation to interpret observational data. The formalism developed here can also be used to test the validity of different schemes to close the
Abstract:
Presented here is a stable algorithm that uses Zohar's formulation of Trench's algorithm and computes the inverse of a symmetric Toeplitz matrix, including those with vanishing or near-vanishing leading minors. The algorithm is based on a diagonal modification of the matrix, and exploits symmetry and persymmetry properties of the inverse matrix.
Abstract:
We address the problem of designing codes for specific applications using deterministic annealing. Designing a block code over any finite-dimensional space may be thought of as forming the corresponding number of clusters over that space. We have shown that the total distortion incurred in encoding a training set is related to the probability of correct reception over a symmetric channel. While conventional deterministic annealing makes use of the squared Euclidean distance measure, we have developed an algorithm that can be used for clustering with Hamming distance as the distance measure, which is required in the error-correcting scenario.
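The view of code design as clustering under Hamming distance can be illustrated with a small sketch. This is not deterministic annealing: it uses hard nearest-codeword assignments and a bitwise majority vote (a k-modes-style iteration) purely to show what a Hamming-distance cluster centre looks like. The function names and the toy training set are assumptions.

```python
def hamming(a, b):
    """Number of positions in which two equal-length binary words differ."""
    return sum(x != y for x, y in zip(a, b))

def majority_codeword(words):
    """Bitwise majority vote; minimises total Hamming distance to the words."""
    n = len(words)
    return tuple(1 if 2 * sum(col) > n else 0 for col in zip(*words))

def hamming_cluster(words, init_codewords, iterations=20):
    """Alternate nearest-codeword assignment and majority-vote update."""
    codewords = [tuple(c) for c in init_codewords]
    for _ in range(iterations):
        clusters = [[] for _ in codewords]
        for w in words:
            nearest = min(range(len(codewords)),
                          key=lambda k: hamming(w, codewords[k]))
            clusters[nearest].append(w)
        # Keep the old codeword when a cluster receives no training words.
        updated = [majority_codeword(c) if c else codewords[k]
                   for k, c in enumerate(clusters)]
        if updated == codewords:
            break
        codewords = updated
    return codewords

# Toy training set: noisy copies of 00000 and 11111.
training = [(0, 0, 0, 0, 0), (1, 0, 0, 0, 0), (0, 0, 0, 1, 0),
            (1, 1, 1, 1, 1), (1, 1, 0, 1, 1), (0, 1, 1, 1, 1)]
code = hamming_cluster(training, [(0, 0, 0, 0, 1), (1, 1, 1, 1, 0)])
```

On this training set the iteration settles on the two repetition codewords. An annealed method would replace the hard assignments with soft, temperature-controlled ones, which this sketch does not attempt.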