68 resultados para symbolic computation
Resumo:
The Morse-Smale complex is a topological structure that captures the behavior of the gradient of a scalar function on a manifold. This paper discusses scalable techniques to compute the Morse-Smale complex of scalar functions defined on large three-dimensional structured grids. Computing the Morse-Smale complex of three-dimensional domains is challenging as compared to two-dimensional domains because of the non-trivial structure introduced by the two types of saddle criticalities. We present a parallel shared-memory algorithm to compute the Morse-Smale complex based on Forman's discrete Morse theory. The algorithm achieves scalability via synergistic use of the CPU and the GPU. We first prove that the discrete gradient on the domain can be computed independently for each cell and hence can be implemented on the GPU. Second, we describe a two-step graph traversal algorithm to compute the 1-saddle-2-saddle connections efficiently and in parallel on the CPU. Simultaneously, the extremasaddle connections are computed using a tree traversal algorithm on the GPU.
Resumo:
The experimental implementation of a quantum algorithm requires the decomposition of unitary operators. Here we treat unitary-operator decomposition as an optimization problem, and use a genetic algorithm-a global-optimization method inspired by nature's evolutionary process-for operator decomposition. We apply this method to NMR quantum information processing, and find a probabilistic way of performing universal quantum computation using global hard pulses. We also demonstrate the efficient creation of the singlet state (a special type of Bell state) directly from thermal equilibrium, using an optimum sequence of pulses. © 2012 American Physical Society.
Resumo:
The experimental implementation of a quantum algorithm requires the decomposition of unitary operators. Here we treat unitary-operator decomposition as an optimization problem, and use a genetic algorithm-a global-optimization method inspired by nature's evolutionary process-for operator decomposition. We apply this method to NMR quantum information processing, and find a probabilistic way of performing universal quantum computation using global hard pulses. We also demonstrate the efficient creation of the singlet state (a special type of Bell state) directly from thermal equilibrium, using an optimum sequence of pulses.
Resumo:
Various logical formalisms with the freeze quantifier have been recently considered to model computer systems even though this is a powerful mechanism that often leads to undecidability. In this article, we study a linear-time temporal logic with past-time operators such that the freeze operator is only used to express that some value from an infinite set is repeated in the future or in the past. Such a restriction has been inspired by a recent work on spatio-temporal logics that suggests such a restricted use of the freeze operator. We show decidability of finitary and infinitary satisfiability by reduction into the verification of temporal properties in Petri nets by proposing a symbolic representation of models. This is a quite surprising result in view of the expressive power of the logic since the logic is closed under negation, contains future-time and past-time temporal operators and can express the nonce property and its negation. These ingredients are known to lead to undecidability with a more liberal use of the freeze quantifier. The article also contains developments about the relationships between temporal logics with the freeze operator and counter automata as well as reductions into first-order logics over data words.
Resumo:
This article does not have an abstract.
Resumo:
This article is concerned with the evolution of haploid organisms that reproduce asexually. In a seminal piece of work, Eigen and coauthors proposed the quasispecies model in an attempt to understand such an evolutionary process. Their work has impacted antiviral treatment and vaccine design strategies. Yet, predictions of the quasispecies model are at best viewed as a guideline, primarily because it assumes an infinite population size, whereas realistic population sizes can be quite small. In this paper we consider a population genetics-based model aimed at understanding the evolution of such organisms with finite population sizes and present a rigorous study of the convergence and computational issues that arise therein. Our first result is structural and shows that, at any time during the evolution, as the population size tends to infinity, the distribution of genomes predicted by our model converges to that predicted by the quasispecies model. This justifies the continued use of the quasispecies model to derive guidelines for intervention. While the stationary state in the quasispecies model is readily obtained, due to the explosion of the state space in our model, exact computations are prohibitive. Our second set of results are computational in nature and address this issue. We derive conditions on the parameters of evolution under which our stochastic model mixes rapidly. Further, for a class of widely used fitness landscapes we give a fast deterministic algorithm which computes the stationary distribution of our model. These computational tools are expected to serve as a framework for the modeling of strategies for the deployment of mutagenic drugs.
Resumo:
Bidirectional relaying, where a relay helps two user nodes to exchange equal length binary messages, has been an active area of recent research. A popular strategy involves a modified Gaussian MAC, where the relay decodes the XOR of the two messages using the naturally-occurring sum of symbols simultaneously transmitted by user nodes. In this work, we consider the Gaussian MAC in bidirectional relaying with an additional secrecy constraint for protection against a honest but curious relay. The constraint is that, while the relay should decode the XOR, it should be fully ignorant of the individual messages of the users. We exploit the symbol addition that occurs in a Gaussian MAC to design explicit strategies that achieve perfect independence between the received symbols and individual transmitted messages. Our results actually hold for a more general scenario where the messages at the two user nodes come from a finite Abelian group G, and the relay must decode the sum within G of the two messages. We provide a lattice coding strategy and study optimal rate versus average power trade-offs for asymptotically large dimensions.
Resumo:
Acoustic modeling using mixtures of multivariate Gaussians is the prevalent approach for many speech processing problems. Computing likelihoods against a large set of Gaussians is required as a part of many speech processing systems and it is the computationally dominant phase for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We express the likelihood computation as a multiplication of matrices representing augmented feature vectors and Gaussian parameters. The computational gain of this approach over traditional methods is by exploiting the structure of these matrices and efficient implementation of their multiplication. In particular, we explore direct low-rank approximation of the Gaussian parameter matrix and indirect derivation of low-rank factors of the Gaussian parameter matrix by optimum approximation of the likelihood matrix. We show that both the methods lead to similar speedups but the latter leads to far lesser impact on the recognition accuracy. Experiments on 1,138 work vocabulary RM1 task and 6,224 word vocabulary TIMIT task using Sphinx 3.7 system show that, for a typical case the matrix multiplication based approach leads to overall speedup of 46 % on RM1 task and 115 % for TIMIT task. Our low-rank approximation methods provide a way for trading off recognition accuracy for a further increase in computational performance extending overall speedups up to 61 % for RM1 and 119 % for TIMIT for an increase of word error rate (WER) from 3.2 to 3.5 % for RM1 and for no increase in WER for TIMIT. We also express pairwise Euclidean distance computation phase in Dynamic Time Warping (DTW) in terms of matrix multiplication leading to saving of approximately of computational operations. In our experiments using efficient implementation of matrix multiplication, this leads to a speedup of 5.6 in computing the pairwise Euclidean distances and overall speedup up to 3.25 for DTW.
Resumo:
Border basis detection (BBD) is described as follows: given a set of generators of an ideal, decide whether that set of generators is a border basis of the ideal with respect to some order ideal. The motivation for this problem comes from a similar problem related to Grobner bases termed as Grobner basis detection (GBD) which was proposed by Gritzmann and Sturmfels (1993). GBD was shown to be NP-hard by Sturmfels and Wiegelmann (1996). In this paper, we investigate the computational complexity of BBD and show that it is NP-complete.
Resumo:
We address the classical problem of delta feature computation, and interpret the operation involved in terms of Savitzky- Golay (SG) filtering. Features such as themel-frequency cepstral coefficients (MFCCs), obtained based on short-time spectra of the speech signal, are commonly used in speech recognition tasks. In order to incorporate the dynamics of speech, auxiliary delta and delta-delta features, which are computed as temporal derivatives of the original features, are used. Typically, the delta features are computed in a smooth fashion using local least-squares (LS) polynomial fitting on each feature vector component trajectory. In the light of the original work of Savitzky and Golay, and a recent article by Schafer in IEEE Signal Processing Magazine, we interpret the dynamic feature vector computation for arbitrary derivative orders as SG filtering with a fixed impulse response. This filtering equivalence brings in significantly lower latency with no loss in accuracy, as validated by results on a TIMIT phoneme recognition task. The SG filters involved in dynamic parameter computation can be viewed as modulation filters, proposed by Hermansky.
Resumo:
Acoustic modeling using mixtures of multivariate Gaussians is the prevalent approach for many speech processing problems. Computing likelihoods against a large set of Gaussians is required as a part of many speech processing systems and it is the computationally dominant phase for LVCSR systems. We express the likelihood computation as a multiplication of matrices representing augmented feature vectors and Gaussian parameters. The computational gain of this approach over traditional methods is by exploiting the structure of these matrices and efficient implementation of their multiplication.In particular, we explore direct low-rank approximation of the Gaussian parameter matrix and indirect derivation of low-rank factors of the Gaussian parameter matrix by optimum approximation of the likelihood matrix. We show that both the methods lead to similar speedups but the latter leads to far lesser impact on the recognition accuracy. Experiments on a 1138 word vocabulary RM1 task using Sphinx 3.7 system show that, for a typical case the matrix multiplication approach leads to overall speedup of 46%. Both the low-rank approximation methods increase the speedup to around 60%, with the former method increasing the word error rate (WER) from 3.2% to 6.6%, while the latter increases the WER from 3.2% to 3.5%.
Resumo:
In this paper, we consider a distributed function computation setting, where there are m distributed but correlated sources X1,...,Xm and a receiver interested in computing an s-dimensional subspace generated by [X1,...,Xm]Γ for some (m × s) matrix Γ of rank s. We construct a scheme based on nested linear codes and characterize the achievable rates obtained using the scheme. The proposed nested-linear-code approach performs at least as well as the Slepian-Wolf scheme in terms of sum-rate performance for all subspaces and source distributions. In addition, for a large class of distributions and subspaces, the scheme improves upon the Slepian-Wolf approach. The nested-linear-code scheme may be viewed as uniting under a common framework, both the Korner-Marton approach of using a common linear encoder as well as the Slepian-Wolf approach of employing different encoders at each source. Along the way, we prove an interesting and fundamental structural result on the nature of subspaces of an m-dimensional vector space V with respect to a normalized measure of entropy. Here, each element in V corresponds to a distinct linear combination of a set {Xi}im=1 of m random variables whose joint probability distribution function is given.
Resumo:
Let X-1,..., X-m be a set of m statistically dependent sources over the common alphabet F-q, that are linearly independent when considered as functions over the sample space. We consider a distributed function computation setting in which the receiver is interested in the lossless computation of the elements of an s-dimensional subspace W spanned by the elements of the row vector X-1,..., X-m]Gamma in which the (m x s) matrix Gamma has rank s. A sequence of three increasingly refined approaches is presented, all based on linear encoders. The first approach uses a common matrix to encode all the sources and a Korner-Marton like receiver to directly compute W. The second improves upon the first by showing that it is often more efficient to compute a carefully chosen superspace U of W. The superspace is identified by showing that the joint distribution of the {X-i} induces a unique decomposition of the set of all linear combinations of the {X-i}, into a chain of subspaces identified by a normalized measure of entropy. This subspace chain also suggests a third approach, one that employs nested codes. For any joint distribution of the {X-i} and any W, the sum-rate of the nested code approach is no larger than that under the Slepian-Wolf (SW) approach. Under the SW approach, W is computed by first recovering each of the {X-i}. For a large class of joint distributions and subspaces W, the nested code approach is shown to improve upon SW. Additionally, a class of source distributions and subspaces are identified, for which the nested-code approach is sum-rate optimal.
Resumo:
GPUs have been used for parallel execution of DOALL loops. However, loops with indirect array references can potentially cause cross iteration dependences which are hard to detect using existing compilation techniques. Applications with such loops cannot easily use the GPU and hence do not benefit from the tremendous compute capabilities of GPUs. In this paper, we present an algorithm to compute at runtime the cross iteration dependences in such loops. The algorithm uses both the CPU and the GPU to compute the dependences. Specifically, it effectively uses the compute capabilities of the GPU to quickly collect the memory accesses performed by the iterations by executing the slice functions generated for the indirect array accesses. Using the dependence information, the loop iterations are levelized such that each level contains independent iterations which can be executed in parallel. Another interesting aspect of the proposed solution is that it pipelines the dependence computation of the future level with the actual computation of the current level to effectively utilize the resources available in the GPU. We use NVIDIA Tesla C2070 to evaluate our implementation using benchmarks from Polybench suite and some synthetic benchmarks. Our experiments show that the proposed technique can achieve an average speedup of 6.4x on loops with a reasonable number of cross iteration dependences.