69 resultados para LDPC, CUDA, GPGPU, computing, GPU, DVB, S2, SDR


Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we explore the use of LDPC codes for nonuniform sources under distributed source coding paradigm. Our analysis reveals that several capacity approaching LDPC codes indeed do approach the Slepian-Wolf bound for nonuniform sources as well. The Monte Carlo simulation results show that highly biased sources can be compressed to 0.049 bits/sample away from Slepian-Wolf bound for moderate block lengths.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Information forms the basis of modern technology. To meet the ever-increasing demand for information, means have to be devised for a more efficient and better-equipped technology to intelligibly process data. Advances in photonics have made their impact on each of the four key applications in information processing, i.e., acquisition, transmission, storage and processing of information. The inherent advantages of ultrahigh bandwidth, high speed and low-loss transmission has already established fiber-optics as the backbone of communication technology. However, the optics to electronics inter-conversion at the transmitter and receiver ends severely limits both the speed and bit rate of lightwave communication systems. As the trend towards still faster and higher capacity systems continues, it has become increasingly necessary to perform more and more signal-processing operations in the optical domain itself, i.e., with all-optical components and devices that possess a high bandwidth and can perform parallel processing functions to eliminate the electronic bottleneck.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Fragment Finder 2.0 is a web-based interactive computing server which can be used to retrieve structurally similar protein fragments from 25 and 90% nonredundant data sets. The computing server identifies structurally similar fragments using the protein backbone C alpha angles. In addition, the identified fragments can be superimposed using either of the two structural superposition programs, STAMP and PROFIT, provided in the server. The freely available Java plug-in Jmol has been interfaced with the server for the visualization of the query and superposed fragments. The server is the updated version of a previously developed search engine and employs an in-house-developed fast pattern matching algorithm. This server can be accessed freely over the World Wide Web through the URL http://cluster.physics.iisc.ernet.in/ff/.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Carbon nanotubes dispersed in polymer matrix have been aligned in the form of fibers and interconnects and cured electrically and by UV light. Conductivity and effective semiconductor tunneling against reverse to forward bias field have been designed to have differentiable current-voltage response of each of the fiber/channel. The current-voltage response is a function of the strain applied to the fibers along axial direction. Biaxial and shear strains are correlated by differentiating signals from the aligned fibers/channels. Using a small doping of magnetic nanoparticles in these composite fibers, magneto-resistance properties are realized which are strong enough to use the resulting magnetostriction as a state variable for signal processing and computing. Various basic analog signal processing tasks such as addition, convolution and filtering etc. can be performed. These preliminary study shows promising application of the concept in combined analog-digital computation in carbon nanotube based fibers. Various dynamic effects such as relaxation, electric field dependent nonlinearities and hysteresis on the output signals are studied using experimental data and analytical model.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The Morse-Smale complex is a topological structure that captures the behavior of the gradient of a scalar function on a manifold. This paper discusses scalable techniques to compute the Morse-Smale complex of scalar functions defined on large three-dimensional structured grids. Computing the Morse-Smale complex of three-dimensional domains is challenging as compared to two-dimensional domains because of the non-trivial structure introduced by the two types of saddle criticalities. We present a parallel shared-memory algorithm to compute the Morse-Smale complex based on Forman's discrete Morse theory. The algorithm achieves scalability via synergistic use of the CPU and the GPU. We first prove that the discrete gradient on the domain can be computed independently for each cell and hence can be implemented on the GPU. Second, we describe a two-step graph traversal algorithm to compute the 1-saddle-2-saddle connections efficiently and in parallel on the CPU. Simultaneously, the extremasaddle connections are computed using a tree traversal algorithm on the GPU.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Real-time image reconstruction is essential for improving the temporal resolution of fluorescence microscopy. A number of unavoidable processes such as, optical aberration, noise and scattering degrade image quality, thereby making image reconstruction an ill-posed problem. Maximum likelihood is an attractive technique for data reconstruction especially when the problem is ill-posed. Iterative nature of the maximum likelihood technique eludes real-time imaging. Here we propose and demonstrate a compute unified device architecture (CUDA) based fast computing engine for real-time 3D fluorescence imaging. A maximum performance boost of 210x is reported. Easy availability of powerful computing engines is a boon and may accelerate to realize real-time 3D fluorescence imaging. Copyright 2012 Author(s). This article is distributed under a Creative Commons Attribution 3.0 Unported License. http://dx.doi.org/10.1063/1.4754604]

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we employ message passing algorithms over graphical models to jointly detect and decode symbols transmitted over large multiple-input multiple-output (MIMO) channels with low density parity check (LDPC) coded bits. We adopt a factor graph based technique to integrate the detection and decoding operations. A Gaussian approximation of spatial interference is used for detection. This serves as a low complexity joint detection/decoding approach for large dimensional MIMO systems coded with LDPC codes of large block lengths. This joint processing achieves significantly better performance than the individual detection and decoding scheme.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The Reeb graph of a scalar function tracks the evolution of the topology of its level sets. This paper describes a fast algorithm to compute the Reeb graph of a piecewise-linear (PL) function defined over manifolds and non-manifolds. The key idea in the proposed approach is to maximally leverage the efficient contour tree algorithm to compute the Reeb graph. The algorithm proceeds by dividing the input into a set of subvolumes that have loop-free Reeb graphs using the join tree of the scalar function and computes the Reeb graph by combining the contour trees of all the subvolumes. Since the key ingredient of this method is a series of union-find operations, the algorithm is fast in practice. Experimental results demonstrate that it outperforms current generic algorithms by a factor of up to two orders of magnitude, and has a performance on par with algorithms that are catered to restricted classes of input. The algorithm also extends to handle large data that do not fit in memory.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Low density parity-check (LDPC) codes are a class of linear block codes that are decoded by running belief propagation (BP) algorithm or log-likelihood ratio belief propagation (LLR-BP) over the factor graph of the code. One of the disadvantages of LDPC codes is the onset of an error floor at high values of signal to noise ratio caused by trapping sets. In this paper, we propose a two stage decoder to deal with different types of trapping sets. Oscillating trapping sets are taken care by the first stage of the decoder and the elementary trapping sets are handled by the second stage of the decoder. Simulation results on the regular PEG (504,252,3,6) code and the irregular PEG (1024,518,15,8) code shows that the proposed two stage decoder performs significantly better than the standard decoder.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this letter, we characterize the extrinsic information transfer (EXIT) behavior of a factor graph based message passing algorithm for detection in large multiple-input multiple-output (MIMO) systems with tens to hundreds of antennas. The EXIT curves of a joint detection-decoding receiver are obtained for low density parity check (LDPC) codes of given degree distributions. From the obtained EXIT curves, an optimization of the LDPC code degree profiles is carried out to design irregular LDPC codes matched to the large-MIMO channel and joint message passing receiver. With low complexity joint detection-decoding, these codes are shown to perform better than off-the-shelf irregular codes in the literature by about 1 to 1.5 dB at a coded BER of 10(-5) in 16 x 16, 64 x 64 and 256 x 256 MIMO systems.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

MATLAB is an array language, initially popular for rapid prototyping, but is now being increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control flow dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput oriented accelerators such as graphics processing units(GPUs). Thus, an approach that maps the control flow dominated regions to the CPU and the data parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or GPU so that kernel execution on the CPU and the GPU happens synergistically and the amount of data transfer needed is minimized. In order to ensure required data movement for dependencies across basic blocks, we propose a data flow analysis and edge splitting strategy. Thus our compiler automatically handles composition of kernels, mapping of kernels to CPU and GPU, scheduling and insertion of required data transfer. The proposed compiler was implemented and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data parallel benchmarks over native execution of MATLAB.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A computationally efficient approach that computes the optimal regularization parameter for the Tikhonov-minimization scheme is developed for photoacoustic imaging. This approach is based on the least squares-QR decomposition which is a well-known dimensionality reduction technique for a large system of equations. It is shown that the proposed framework is effective in terms of quantitative and qualitative reconstructions of initial pressure distribution enabled via finding an optimal regularization parameter. The computational efficiency and performance of the proposed method are shown using a test case of numerical blood vessel phantom, where the initial pressure is exactly known for quantitative comparison. (C) 2013 Society of Photo-Optical Instrumentation Engineers (SPIE)

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Delaunay and Gabriel graphs are widely studied geo-metric proximity structures. Motivated by applications in wireless routing, relaxed versions of these graphs known as Locally Delaunay Graphs (LDGs) and Lo-cally Gabriel Graphs (LGGs) have been proposed. We propose another generalization of LGGs called Gener-alized Locally Gabriel Graphs (GLGGs) in the context when certain edges are forbidden in the graph. Unlike a Gabriel Graph, there is no unique LGG or GLGG for a given point set because no edge is necessarily in-cluded or excluded. This property allows us to choose an LGG/GLGG that optimizes a parameter of interest in the graph. We show that computing an edge max-imum GLGG for a given problem instance is NP-hard and also APX-hard. We also show that computing an LGG on a given point set with dilation ≤k is NP-hard. Finally, we give an algorithm to verify whether a given geometric graph G= (V, E) is a valid LGG.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The amplitude-modulation (AM) and phase-modulation (PM) of an amplitude-modulated frequency-modulated (AM-FM) signal are defined as the modulus and phase angle, respectively, of the analytic signal (AS). The FM is defined as the derivative of the PM. However, this standard definition results in a PM with jump discontinuities in cases when the AM index exceeds unity, resulting in an FM that contains impulses. We propose a new approach to define smooth AM, PM, and FM for the AS, where the PM is computed as the solution to an optimization problem based on a vector interpretation of the AS. Our approach is directly linked to the fractional Hilbert transform (FrHT) and leads to an eigenvalue problem. The resulting PM and AM are shown to be smooth, and in particular, the AM turns out to be bipolar. We show an equivalence of the eigenvalue formulation to the square of the AS, and arrive at a simple method to compute the smooth PM. Some examples on synthesized and real signals are provided to validate the theoretical calculations.