985 resultados para Parallel computation


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Bidirectional relaying, where a relay helps two user nodes to exchange equal length binary messages, has been an active area of recent research. A popular strategy involves a modified Gaussian MAC, where the relay decodes the XOR of the two messages using the naturally-occurring sum of symbols simultaneously transmitted by user nodes. In this work, we consider the Gaussian MAC in bidirectional relaying with an additional secrecy constraint for protection against a honest but curious relay. The constraint is that, while the relay should decode the XOR, it should be fully ignorant of the individual messages of the users. We exploit the symbol addition that occurs in a Gaussian MAC to design explicit strategies that achieve perfect independence between the received symbols and individual transmitted messages. Our results actually hold for a more general scenario where the messages at the two user nodes come from a finite Abelian group G, and the relay must decode the sum within G of the two messages. We provide a lattice coding strategy and study optimal rate versus average power trade-offs for asymptotically large dimensions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Acoustic modeling using mixtures of multivariate Gaussians is the prevalent approach for many speech processing problems. Computing likelihoods against a large set of Gaussians is required as a part of many speech processing systems and it is the computationally dominant phase for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We express the likelihood computation as a multiplication of matrices representing augmented feature vectors and Gaussian parameters. The computational gain of this approach over traditional methods is by exploiting the structure of these matrices and efficient implementation of their multiplication. In particular, we explore direct low-rank approximation of the Gaussian parameter matrix and indirect derivation of low-rank factors of the Gaussian parameter matrix by optimum approximation of the likelihood matrix. We show that both the methods lead to similar speedups but the latter leads to far lesser impact on the recognition accuracy. Experiments on 1,138 work vocabulary RM1 task and 6,224 word vocabulary TIMIT task using Sphinx 3.7 system show that, for a typical case the matrix multiplication based approach leads to overall speedup of 46 % on RM1 task and 115 % for TIMIT task. Our low-rank approximation methods provide a way for trading off recognition accuracy for a further increase in computational performance extending overall speedups up to 61 % for RM1 and 119 % for TIMIT for an increase of word error rate (WER) from 3.2 to 3.5 % for RM1 and for no increase in WER for TIMIT. We also express pairwise Euclidean distance computation phase in Dynamic Time Warping (DTW) in terms of matrix multiplication leading to saving of approximately of computational operations. In our experiments using efficient implementation of matrix multiplication, this leads to a speedup of 5.6 in computing the pairwise Euclidean distances and overall speedup up to 3.25 for DTW.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We address the classical problem of delta feature computation, and interpret the operation involved in terms of Savitzky- Golay (SG) filtering. Features such as themel-frequency cepstral coefficients (MFCCs), obtained based on short-time spectra of the speech signal, are commonly used in speech recognition tasks. In order to incorporate the dynamics of speech, auxiliary delta and delta-delta features, which are computed as temporal derivatives of the original features, are used. Typically, the delta features are computed in a smooth fashion using local least-squares (LS) polynomial fitting on each feature vector component trajectory. In the light of the original work of Savitzky and Golay, and a recent article by Schafer in IEEE Signal Processing Magazine, we interpret the dynamic feature vector computation for arbitrary derivative orders as SG filtering with a fixed impulse response. This filtering equivalence brings in significantly lower latency with no loss in accuracy, as validated by results on a TIMIT phoneme recognition task. The SG filters involved in dynamic parameter computation can be viewed as modulation filters, proposed by Hermansky.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Decoherence as an obstacle in quantum computation is viewed as a struggle between two forces [1]: the computation which uses the exponential dimension of Hilbert space, and decoherence which destroys this entanglement by collapse. In this model of decohered quantum computation, a sequential quantum computer loses the battle, because at each time step, only a local operation is carried out but g*(t) number of gates collapse. With quantum circuits computing in parallel way the situation is different- g(t) number of gates can be applied at each time step and number gates collapse because of decoherence. As g(t) ≈ g*(t) competition here is even [1]. Our paper improves on this model by slowing down g*(t) by encoding the circuit in parallel computing architectures and running it in Single Instruction Multiple Data (SIMD) paradigm. We have proposed a parallel ion trap architecture for single-bit rotation of a qubit.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Acoustic modeling using mixtures of multivariate Gaussians is the prevalent approach for many speech processing problems. Computing likelihoods against a large set of Gaussians is required as a part of many speech processing systems and it is the computationally dominant phase for LVCSR systems. We express the likelihood computation as a multiplication of matrices representing augmented feature vectors and Gaussian parameters. The computational gain of this approach over traditional methods is by exploiting the structure of these matrices and efficient implementation of their multiplication.In particular, we explore direct low-rank approximation of the Gaussian parameter matrix and indirect derivation of low-rank factors of the Gaussian parameter matrix by optimum approximation of the likelihood matrix. We show that both the methods lead to similar speedups but the latter leads to far lesser impact on the recognition accuracy. Experiments on a 1138 word vocabulary RM1 task using Sphinx 3.7 system show that, for a typical case the matrix multiplication approach leads to overall speedup of 46%. Both the low-rank approximation methods increase the speedup to around 60%, with the former method increasing the word error rate (WER) from 3.2% to 6.6%, while the latter increases the WER from 3.2% to 3.5%.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we consider a distributed function computation setting, where there are m distributed but correlated sources X1,...,Xm and a receiver interested in computing an s-dimensional subspace generated by [X1,...,Xm]Γ for some (m × s) matrix Γ of rank s. We construct a scheme based on nested linear codes and characterize the achievable rates obtained using the scheme. The proposed nested-linear-code approach performs at least as well as the Slepian-Wolf scheme in terms of sum-rate performance for all subspaces and source distributions. In addition, for a large class of distributions and subspaces, the scheme improves upon the Slepian-Wolf approach. The nested-linear-code scheme may be viewed as uniting under a common framework, both the Korner-Marton approach of using a common linear encoder as well as the Slepian-Wolf approach of employing different encoders at each source. Along the way, we prove an interesting and fundamental structural result on the nature of subspaces of an m-dimensional vector space V with respect to a normalized measure of entropy. Here, each element in V corresponds to a distinct linear combination of a set {Xi}im=1 of m random variables whose joint probability distribution function is given.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Identical parallel-connected converters with unequal load sharing have unequal terminal voltages. The difference in terminal voltages is more pronounced in case of back-to-back connected converters, operated in power-circulation mode for the purpose of endurance tests. In this paper, a synchronous reference frame based analysis is presented to estimate the grid current distortion in interleaved, grid-connected converters with unequal terminal voltages. Influence of carrier interleaving angle on rms grid current ripple is studied theoretically as well as experimentally. Optimum interleaving angle to minimize the rms grid current ripple is investigated for different applications of parallel converters. The applications include unity power factor rectifiers, inverters for renewable energy sources, reactive power compensators, and circulating-power test set-up used for thermal testing of high-power converters. Optimum interleaving angle is shown to be a strong function of the average of the modulation indices of the two converters, irrespective of the application. The findings are verified experimentally on two parallel-connected converters, circulating reactive power of up to 150 kVA between them.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Let X-1,..., X-m be a set of m statistically dependent sources over the common alphabet F-q, that are linearly independent when considered as functions over the sample space. We consider a distributed function computation setting in which the receiver is interested in the lossless computation of the elements of an s-dimensional subspace W spanned by the elements of the row vector X-1,..., X-m]Gamma in which the (m x s) matrix Gamma has rank s. A sequence of three increasingly refined approaches is presented, all based on linear encoders. The first approach uses a common matrix to encode all the sources and a Korner-Marton like receiver to directly compute W. The second improves upon the first by showing that it is often more efficient to compute a carefully chosen superspace U of W. The superspace is identified by showing that the joint distribution of the {X-i} induces a unique decomposition of the set of all linear combinations of the {X-i}, into a chain of subspaces identified by a normalized measure of entropy. This subspace chain also suggests a third approach, one that employs nested codes. For any joint distribution of the {X-i} and any W, the sum-rate of the nested code approach is no larger than that under the Slepian-Wolf (SW) approach. Under the SW approach, W is computed by first recovering each of the {X-i}. For a large class of joint distributions and subspaces W, the nested code approach is shown to improve upon SW. Additionally, a class of source distributions and subspaces are identified, for which the nested-code approach is sum-rate optimal.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We show that every graph of maximum degree 3 can be represented as the intersection graph of axis parallel boxes in three dimensions, that is, every vertex can be mapped to an axis parallel box such that two boxes intersect if and only if their corresponding vertices are adjacent. In fact, we construct a representation in which any two intersecting boxes just touch at their boundaries. Further, this construction can be realized in linear time.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we study the diversity-multiplexing-gain tradeoff (DMT) of wireless relay networks under the half-duplex constraint. It is often unclear what penalty if any, is imposed by the half-duplex constraint on the DMT of such networks. We study two classes of networks; the first class, called KPP(I) networks, is the class of networks with the relays organized in K parallel paths between the source and the destination. While we assume that there is no direct source-destination path, the K relaying paths can interfere with each other. The second class, termed as layered networks, is comprised of relays organized in layers, where links exist only between adjacent layers. We present a communication scheme based on static schedules and amplify-and-forward relaying for these networks. We also show that for KPP(I) networks with K >= 3, the proposed schemes can achieve full-duplex DMT performance, thus demonstrating that there is no performance hit on the DMT due to the half-duplex constraint. We also show that, for layered networks, a linear DMT of d(max)(1 - r)(+) between the maximum diversity d(max) and the maximum MG, r(max) = 1 is achievable. We adapt existing DMT optimal coding schemes to these networks, thus specifying the end-to-end communication strategy explicitly.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The contour tree is a topological abstraction of a scalar field that captures evolution in level set connectivity. It is an effective representation for visual exploration and analysis of scientific data. We describe a work-efficient, output sensitive, and scalable parallel algorithm for computing the contour tree of a scalar field defined on a domain that is represented using either an unstructured mesh or a structured grid. A hybrid implementation of the algorithm using the GPU and multi-core CPU can compute the contour tree of an input containing 16 million vertices in less than ten seconds with a speedup factor of upto 13. Experiments based on an implementation in a multi-core CPU environment show near-linear speedup for large data sets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Nearly pollution-free solutions of the Helmholtz equation for k-values corresponding to visible light are demonstrated and verified through experimentally measured forward scattered intensity from an optical fiber. Numerically accurate solutions are, in particular, obtained through a novel reformulation of the H-1 optimal Petrov-Galerkin weak form of the Helmholtz equation. Specifically, within a globally smooth polynomial reproducing framework, the compact and smooth test functions are so designed that their normal derivatives are zero everywhere on the local boundaries of their compact supports. This circumvents the need for a priori knowledge of the true solution on the support boundary and relieves the weak form of any jump boundary terms. For numerical demonstration of the above formulation, we used a multimode optical fiber in an index matching liquid as the object. The scattered intensity and its normal derivative are computed from the scattered field obtained by solving the Helmholtz equation, using the new formulation and the conventional finite element method. By comparing the results with the experimentally measured scattered intensity, the stability of the solution through the new formulation is demonstrated and its closeness to the experimental measurements verified.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With proliferation of chip multicores (CMPs) on desktops and embedded platforms, multi-threaded programs have become ubiquitous. Existence of multiple threads may cause resource contention, such as, in on-chip shared cache and interconnects, depending upon how they access resources. Hence, we propose a tool - Thread Contention Predictor (TCP) to help quantify the number of threads sharing data and their sharing pattern. We demonstrate its use to predict a more profitable shared, last level on-chip cache (LLC) access policy on CMPs. Our cache configuration predictor is 2.2 times faster compared to the cycle-accurate simulations. We also demonstrate its use for identifying hot data structures in a program which may cause performance degradation due to false data sharing. We fix layout of such data structures and show up-to 10% and 18% improvement in execution time and energy-delay product (EDP), respectively.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a study of the nature of the degrees-of-freedom of spatial manipulators based on the concept of partition of degrees-of-freedom. In particular, the partitioning of degrees-of-freedom is studied in five lower-mobility spatial parallel manipulators possessing different combinations of degrees-of-freedom. An extension of the existing theory is introduced so as to analyse the nature of the gained degree(s)-of-freedom at a gain-type singularity. The gain of one- and two-degrees-of-freedom is analysed in several well-studied, as well as newly developed manipulators. The formulations also present a basis for the analysis of the velocity kinematics of manipulators of any architecture. (C) 2013 Elsevier Ltd. All rights reserved.