236 results for 080205 Numerical Computation
Abstract:
Acoustic modeling using mixtures of multivariate Gaussians is the prevalent approach for many speech processing problems. Computing likelihoods against a large set of Gaussians is required in many speech processing systems, and it is the computationally dominant phase for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We express the likelihood computation as a multiplication of matrices representing augmented feature vectors and Gaussian parameters. The computational gain of this approach over traditional methods comes from exploiting the structure of these matrices and from efficient implementation of their multiplication. In particular, we explore direct low-rank approximation of the Gaussian parameter matrix and indirect derivation of low-rank factors of the Gaussian parameter matrix by optimum approximation of the likelihood matrix. We show that both methods lead to similar speedups, but the latter has a far smaller impact on recognition accuracy. Experiments on the 1,138-word vocabulary RM1 task and the 6,224-word vocabulary TIMIT task using the Sphinx 3.7 system show that, for a typical case, the matrix multiplication based approach leads to an overall speedup of 46% on RM1 and 115% on TIMIT. Our low-rank approximation methods provide a way of trading off recognition accuracy for a further increase in computational performance, extending overall speedups up to 61% for RM1 and 119% for TIMIT, for an increase in word error rate (WER) from 3.2% to 3.5% on RM1 and no increase in WER on TIMIT. We also express the pairwise Euclidean distance computation phase in Dynamic Time Warping (DTW) in terms of matrix multiplication, leading to significant savings in computational operations. In our experiments using an efficient implementation of matrix multiplication, this yields a speedup of 5.6x in computing the pairwise Euclidean distances and an overall speedup of up to 3.25x for DTW.
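The abstract does not give the exact matrix layout, but the standard way to cast diagonal-covariance Gaussian log-likelihoods as one matrix product is to pair an augmented feature vector [1, x, x^2] with a packed parameter row per Gaussian. A minimal numpy sketch of that formulation (all names are hypothetical; the paper's low-rank methods would further factor param_matrix):

```python
import numpy as np

def gaussian_param_matrix(means, variances):
    """Pack diagonal-covariance Gaussian parameters into a (G, 2D+1) matrix.

    Row g is [c_g, mu_g/var_g, -0.5/var_g], so the log-likelihood of a
    frame x under Gaussian g is the dot product with [1, x, x**2].
    """
    G, D = means.shape
    const = -0.5 * (D * np.log(2 * np.pi)
                    + np.log(variances).sum(axis=1)
                    + (means**2 / variances).sum(axis=1))
    return np.hstack([const[:, None], means / variances, -0.5 / variances])

def loglik_matrix(frames, param_matrix):
    """All frame-vs-Gaussian log-likelihoods via a single matrix multiply."""
    T, D = frames.shape
    augmented = np.hstack([np.ones((T, 1)), frames, frames**2])  # (T, 2D+1)
    return augmented @ param_matrix.T                            # (T, G)

# quick check against direct per-Gaussian evaluation
rng = np.random.default_rng(0)
T, G, D = 4, 3, 5
x = rng.normal(size=(T, D))
mu = rng.normal(size=(G, D))
var = rng.uniform(0.5, 2.0, size=(G, D))
L = loglik_matrix(x, gaussian_param_matrix(mu, var))
direct = -0.5 * (D * np.log(2 * np.pi) + np.log(var).sum(1)
                 + ((x[:, None, :] - mu)**2 / var).sum(2))
assert np.allclose(L, direct)
```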
Abstract:
Laminar two-dimensional sudden-expansion flow of different nanofluids is studied numerically. The governing equations are solved using the stream function-vorticity method. The effects of nanoparticle volume fraction and nanoparticle type on the flow behaviour are examined and found to be significant. The flow response to Reynolds number in the presence of nanoparticles is also examined. The presence of nanoparticles decreases the Reynolds number at which the flow bifurcates. The size and the reattachment length of the bottom-wall recirculation increase with increasing volume fraction and particle density. The effect of nanoparticle volume fraction and density on the friction factor is reported. The bottom-wall recirculation responds strongly to variations in volume fraction and particle type; however, only a weak response is observed for the top-wall recirculation.
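For illustration only, here is a minimal interior update for a stream function-vorticity solver on a uniform grid, assuming the boundary values that encode the sudden-expansion geometry are set elsewhere. The nanofluid would enter through the effective kinematic viscosity nu (e.g., a mixture density with a Brinkman-type viscosity); the abstract does not state the property models used, so that is an assumption here:

```python
import numpy as np

def vorticity_step(psi, omega, nu, h, dt, n_jacobi=50):
    """One explicit step of the 2D stream function-vorticity equations on a
    uniform grid (spacing h, axis 0 = x, axis 1 = y). Boundary values of psi
    and omega are assumed to be imposed elsewhere; a real sudden-expansion
    solver adds inlet/outlet and wall conditions.
    nu = mu_nf / rho_nf would carry the nanoparticle volume-fraction effect."""
    # 1) Poisson solve laplacian(psi) = -omega, by a few Jacobi sweeps
    for _ in range(n_jacobi):
        psi[1:-1, 1:-1] = 0.25 * (psi[2:, 1:-1] + psi[:-2, 1:-1]
                                  + psi[1:-1, 2:] + psi[1:-1, :-2]
                                  + h**2 * omega[1:-1, 1:-1])
    # 2) velocities from the stream function: u = dpsi/dy, v = -dpsi/dx
    u = (psi[1:-1, 2:] - psi[1:-1, :-2]) / (2 * h)
    v = -(psi[2:, 1:-1] - psi[:-2, 1:-1]) / (2 * h)
    # 3) explicit vorticity transport: w_t = -u*w_x - v*w_y + nu*lap(w)
    wx = (omega[2:, 1:-1] - omega[:-2, 1:-1]) / (2 * h)
    wy = (omega[1:-1, 2:] - omega[1:-1, :-2]) / (2 * h)
    lap = (omega[2:, 1:-1] + omega[:-2, 1:-1] + omega[1:-1, 2:]
           + omega[1:-1, :-2] - 4 * omega[1:-1, 1:-1]) / h**2
    omega[1:-1, 1:-1] += dt * (-u * wx - v * wy + nu * lap)
    return psi, omega
```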
Abstract:
We address the classical problem of delta feature computation, and interpret the operation involved in terms of Savitzky-Golay (SG) filtering. Features such as the mel-frequency cepstral coefficients (MFCCs), computed from short-time spectra of the speech signal, are commonly used in speech recognition tasks. In order to incorporate the dynamics of speech, auxiliary delta and delta-delta features, computed as temporal derivatives of the original features, are used. Typically, the delta features are computed in a smooth fashion using local least-squares (LS) polynomial fitting on each feature vector component trajectory. In the light of the original work of Savitzky and Golay, and a recent article by Schafer in IEEE Signal Processing Magazine, we interpret the dynamic feature vector computation for arbitrary derivative orders as SG filtering with a fixed impulse response. This filtering equivalence brings significantly lower latency with no loss in accuracy, as validated by results on a TIMIT phoneme recognition task. The SG filters involved in dynamic parameter computation can also be viewed as the modulation filters proposed by Hermansky.
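The filtering equivalence is easy to make concrete: the local least-squares (linear-fit) delta rule d_t = sum_n n*(c_{t+n} - c_{t-n}) / (2*sum_n n^2) is a fixed FIR impulse response, and delta-delta is that filter applied twice. A small numpy sketch (the window length and edge padding are assumptions, not the paper's settings):

```python
import numpy as np

def delta_filter(order=1, half_window=2):
    """Impulse response of the least-squares delta filter.

    The regression formula d_t = sum_n n*(c_{t+n}-c_{t-n}) / (2*sum_n n^2)
    is exactly a first-derivative Savitzky-Golay (linear-fit) kernel; higher
    derivative orders convolve that kernel with itself."""
    n = np.arange(-half_window, half_window + 1)
    h = n / (n @ n)                      # n @ n == 2 * sum_{n>0} n^2
    for _ in range(order - 1):
        h = np.convolve(h, n / (n @ n))
    return h

def deltas(features, order=1, half_window=2):
    """Apply the delta filter along time (axis 0) of a (T, D) feature matrix."""
    h = delta_filter(order, half_window)[::-1]   # flip: convolution vs correlation
    pad = len(h) // 2
    padded = np.pad(features, ((pad, pad), (0, 0)), mode="edge")
    return np.apply_along_axis(np.convolve, 0, padded, h, mode="valid")
```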
Operator-splitting finite element algorithms for computations of high-dimensional parabolic problems
Abstract:
An operator-splitting finite element method for solving high-dimensional parabolic equations is presented. The stability and the error estimates are derived for the proposed numerical scheme. Furthermore, two variants of fully practical operator-splitting finite element algorithms, based on the quadrature points and the nodal points, respectively, are presented. Both the quadrature and the nodal point based operator-splitting algorithms are validated using a three-dimensional (3D) test problem. The numerical results obtained with the full 3D computations and the operator-split 2D + 1D computations are found to be in good agreement with the analytical solution. Further, the optimal order of convergence is obtained in both variants of the operator-splitting algorithms.
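As a rough illustration of the splitting idea (a finite-difference analogue, not the paper's finite element scheme): one Lie-splitting step for the 3D heat equation replaces a single 3D implicit solve with three batched 1D tridiagonal solves, one per coordinate direction. A sketch assuming a cubic grid and homogeneous Dirichlet boundaries:

```python
import numpy as np
from scipy.linalg import solve_banded

def heat_step_split(u, dt, h):
    """One Lie-splitting step for u_t = laplace(u) on a uniform cubic 3D grid:
    three 1D backward-Euler solves, one per coordinate direction.
    Homogeneous Dirichlet boundaries (u = 0 just outside the grid) assumed."""
    r = dt / h**2
    n = u.shape[0]                     # cubic grid: all axes of length n
    # banded storage for the tridiagonal matrix (1+2r on diag, -r off-diag)
    ab = np.zeros((3, n))
    ab[0, 1:] = -r                     # superdiagonal
    ab[1, :] = 1 + 2 * r               # diagonal
    ab[2, :-1] = -r                    # subdiagonal
    for axis in range(3):
        u = np.moveaxis(u, axis, 0)
        flat = u.reshape(n, -1)                    # batch all 1D lines
        flat = solve_banded((1, 1), ab, flat)      # implicit 1D solve
        u = np.moveaxis(flat.reshape(u.shape), 0, axis)
    return u
```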
Abstract:
Acoustic modeling using mixtures of multivariate Gaussians is the prevalent approach for many speech processing problems. Computing likelihoods against a large set of Gaussians is required in many speech processing systems, and it is the computationally dominant phase for LVCSR systems. We express the likelihood computation as a multiplication of matrices representing augmented feature vectors and Gaussian parameters. The computational gain of this approach over traditional methods comes from exploiting the structure of these matrices and from efficient implementation of their multiplication. In particular, we explore direct low-rank approximation of the Gaussian parameter matrix and indirect derivation of low-rank factors of the Gaussian parameter matrix by optimum approximation of the likelihood matrix. We show that both methods lead to similar speedups, but the latter has a far smaller impact on recognition accuracy. Experiments on a 1,138-word vocabulary RM1 task using the Sphinx 3.7 system show that, for a typical case, the matrix multiplication approach leads to an overall speedup of 46%. Both low-rank approximation methods increase the speedup to around 60%, with the former method increasing the word error rate (WER) from 3.2% to 6.6%, while the latter increases the WER from 3.2% to 3.5%.
Abstract:
In this paper, we consider a distributed function computation setting, where there are m distributed but correlated sources X_1, ..., X_m and a receiver interested in computing an s-dimensional subspace generated by [X_1, ..., X_m]Γ for some (m × s) matrix Γ of rank s. We construct a scheme based on nested linear codes and characterize the achievable rates obtained using the scheme. The proposed nested-linear-code approach performs at least as well as the Slepian-Wolf scheme in terms of sum-rate performance for all subspaces and source distributions. In addition, for a large class of distributions and subspaces, the scheme improves upon the Slepian-Wolf approach. The nested-linear-code scheme may be viewed as uniting under a common framework both the Korner-Marton approach of using a common linear encoder and the Slepian-Wolf approach of employing different encoders at each source. Along the way, we prove an interesting and fundamental structural result on the nature of subspaces of an m-dimensional vector space V with respect to a normalized measure of entropy. Here, each element in V corresponds to a distinct linear combination of a set {X_i}_{i=1}^m of m random variables with a given joint probability distribution.
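The linearity that the common-encoder (Korner-Marton style) ingredient of the scheme exploits can be shown in a couple of lines: a single matrix A encodes every source, and the receiver can combine the encodings to obtain the encoding of the desired linear combination. A toy check over F_2 (the rate advantage comes from choosing A as a good code for the combined source and then decoding, which is omitted here; A is a random stand-in):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 2, 10, 6                       # sources, block length, encoding length
A = rng.integers(0, 2, size=(k, n))      # common linear encoder (hypothetical)
x, y = rng.integers(0, 2, size=(2, n))   # correlated source blocks would go here

# encoders act separately on x and y; the receiver XORs the two encodings
# and obtains exactly the encoding of the modulo-2 sum x + y:
assert np.array_equal((A @ x + A @ y) % 2, (A @ ((x + y) % 2)) % 2)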
Abstract:
A combined 3D finite element simulation and experimental study of the interaction between a notch and cylindrical voids ahead of it in single edge notch (tension) aluminum single crystal specimens is undertaken in this work. Two lattice orientations are considered in which the notch front is parallel to the crystallographic [101̄] direction. The flat surface of the notch coincides with the (010) plane in one orientation and with the (11̄1) plane in the other. Three equally spaced cylindrical voids are placed directly ahead of the notch tip. The predicted load-displacement curves, slip traces, lattice rotation and void growth from the finite element analysis are found to be in good agreement with the experimental observations for both orientations. Finite element results show considerable through-thickness variation in both hydrostatic stress and equivalent plastic slip, which, however, depends additionally on the lattice orientation. The through-thickness variation in the above quantities affects the void growth rate and causes it to differ from the center plane to the free surface of the specimen.
Abstract:
Let X_1, ..., X_m be a set of m statistically dependent sources over the common alphabet F_q that are linearly independent when considered as functions over the sample space. We consider a distributed function computation setting in which the receiver is interested in the lossless computation of the elements of an s-dimensional subspace W spanned by the elements of the row vector [X_1, ..., X_m]Γ, in which the (m × s) matrix Γ has rank s. A sequence of three increasingly refined approaches is presented, all based on linear encoders. The first approach uses a common matrix to encode all the sources and a Korner-Marton-like receiver to directly compute W. The second improves upon the first by showing that it is often more efficient to compute a carefully chosen superspace U of W. The superspace is identified by showing that the joint distribution of the {X_i} induces a unique decomposition of the set of all linear combinations of the {X_i} into a chain of subspaces identified by a normalized measure of entropy. This subspace chain also suggests a third approach, one that employs nested codes. For any joint distribution of the {X_i} and any W, the sum-rate of the nested code approach is no larger than that under the Slepian-Wolf (SW) approach, in which W is computed by first recovering each of the {X_i}. For a large class of joint distributions and subspaces W, the nested code approach is shown to improve upon SW. Additionally, a class of source distributions and subspaces is identified for which the nested-code approach is sum-rate optimal.
Abstract:
Terahertz (THz) propagation in real tissues causes heating, as does any other propagating electromagnetic radiation. A finite-element (FE) model that provides numerical solutions to the heat conduction equation, coupled with realistic models of tissues, is employed in this study to compute the temperature rise due to THz propagation. The results indicate that the temperature rise is dependent on the tissue type and is highly localized. The developed FE model was validated by obtaining solutions for the steady-state case and showing that they are in good agreement with the analytical solutions. These types of models can also enable computation of specific absorption rates, which are very critical in planning and setting up experiments involving biological tissues.
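A one-dimensional analogue of the validation step described: solve steady-state conduction -k*u'' = q with fixed-temperature boundaries and compare against the closed-form solution. Parameter values below are placeholders, not tissue data from the study:

```python
import numpy as np

k, q, L, n = 0.5, 1e4, 0.01, 201          # conductivity, source, length, nodes
x = np.linspace(0.0, L, n)
h = x[1] - x[0]
# tridiagonal stiffness matrix for the interior nodes (u = 0 at both ends)
A = (np.diag(np.full(n - 2, 2.0)) - np.diag(np.ones(n - 3), 1)
     - np.diag(np.ones(n - 3), -1)) * k / h**2
u = np.zeros(n)
u[1:-1] = np.linalg.solve(A, np.full(n - 2, q))
# analytical solution of -k*u'' = q with u(0) = u(L) = 0
exact = q * x * (L - x) / (2 * k)
print("max error:", np.abs(u - exact).max())   # ~machine precision here
```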
Abstract:
Outlier detection in high-dimensional categorical data has been a problem of much interest due to the extensive use of qualitative features for describing data across various application areas. Though there exist various established methods for dealing with the dimensionality aspect through feature selection on numerical data, the categorical domain is still being actively explored. As outlier detection is generally considered an unsupervised learning problem, owing to the lack of knowledge about the nature of the various types of outliers, the related feature selection task also needs to be handled in a similar manner. This motivates the need to develop an unsupervised feature selection algorithm for efficient detection of outliers in categorical data. Addressing this aspect, we propose a novel feature selection algorithm based on the mutual information measure and entropy computation. The redundancy among the features is characterized using the mutual information measure, in order to identify a suitable feature subset with low redundancy. The performance of the proposed algorithm in comparison with information gain based feature selection shows its effectiveness for outlier detection. The efficacy of the proposed algorithm is demonstrated on various high-dimensional benchmark data sets employing two existing outlier detection methods.
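A minimal sketch of the entropy and mutual-information computations for categorical features, with a greedy low-redundancy selection rule. The abstract does not specify the exact criterion, so the rule below is one plausible instantiation, not the paper's algorithm:

```python
import numpy as np
from collections import Counter

def entropy(col):
    """Empirical Shannon entropy (bits) of a categorical column."""
    p = np.array(list(Counter(col).values()), dtype=float)
    p /= p.sum()
    return -(p * np.log2(p)).sum()

def mutual_info(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for two categorical columns."""
    joint = [f"{u}|{v}" for u, v in zip(a, b)]
    return entropy(a) + entropy(b) - entropy(joint)

def select_features(X, n_keep):
    """Greedy selection: start from the highest-entropy feature, then keep
    adding the feature with the lowest maximum MI against those chosen."""
    d = X.shape[1]
    chosen = [max(range(d), key=lambda j: entropy(X[:, j]))]
    while len(chosen) < n_keep:
        rest = [j for j in range(d) if j not in chosen]
        chosen.append(min(rest, key=lambda j: max(
            mutual_info(X[:, j], X[:, c]) for c in chosen)))
    return chosen
```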
Abstract:
The governing differential equation of the rotating beam reduces to that of a stiff string when the centrifugal force is assumed to be constant. The solution of the static homogeneous part of this equation is enhanced with a polynomial term and used in Rayleigh's method. Numerical experiments show better agreement with converged finite element solutions than polynomials provide. Using this as an estimate for the first mode shape, higher mode shape approximations are obtained using Gram-Schmidt orthogonalization. Estimates for the first five natural frequencies of uniform and tapered beams are obtained accurately using a very low-order Rayleigh-Ritz approximation.
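As a worked illustration of Rayleigh's method for a rotating beam, here is a quotient estimate with a simple clamped-free polynomial trial shape rather than the enhanced stiff-string shape the paper constructs; all parameter values are placeholders:

```python
import numpy as np
from scipy.integrate import trapezoid

# Rayleigh quotient for a uniform rotating cantilever, with centrifugal
# tension T(x) = 0.5*m*Omega**2*(L**2 - x**2):
#   omega1^2 ~ (int EI*phi''^2 dx + int T*phi'^2 dx) / (int m*phi^2 dx)
EI, m, L, Omega = 1.0, 1.0, 1.0, 2.0
x = np.linspace(0.0, L, 2001)
phi = 6 * x**2 - 4 * x**3 + x**4           # static-deflection shape, phi(0)=phi'(0)=0
dphi = 12 * x - 12 * x**2 + 4 * x**3
ddphi = 12 - 24 * x + 12 * x**2
T = 0.5 * m * Omega**2 * (L**2 - x**2)
num = trapezoid(EI * ddphi**2 + T * dphi**2, x)
den = trapezoid(m * phi**2, x)
print("first natural frequency estimate:", np.sqrt(num / den))
# Omega = 0 recovers ~3.53*sqrt(EI/(m*L**4)), close to the exact 3.516.
```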
Abstract:
GPUs have been used for parallel execution of DOALL loops. However, loops with indirect array references can potentially have cross-iteration dependences, which are hard to detect using existing compilation techniques. Applications with such loops cannot easily use the GPU and hence do not benefit from its tremendous compute capabilities. In this paper, we present an algorithm to compute at runtime the cross-iteration dependences in such loops. The algorithm uses both the CPU and the GPU to compute the dependences. Specifically, it effectively uses the compute capabilities of the GPU to quickly collect the memory accesses performed by the iterations, by executing slice functions generated for the indirect array accesses. Using the dependence information, the loop iterations are levelized such that each level contains independent iterations which can be executed in parallel. Another interesting aspect of the proposed solution is that it pipelines the dependence computation of the future level with the actual computation of the current level, to effectively utilize the resources available on the GPU. We use an NVIDIA Tesla C2070 to evaluate our implementation, using benchmarks from the Polybench suite and some synthetic benchmarks. Our experiments show that the proposed technique can achieve an average speedup of 6.4x on loops with a reasonable number of cross-iteration dependences.
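The levelization step can be sketched independently of the GPU details: given per-iteration read and write sets (collected at runtime via the generated slice functions in the paper's scheme), each iteration's level is one more than the deepest earlier iteration it conflicts with, and iterations within a level are mutually independent. A minimal sequential sketch:

```python
from collections import defaultdict

def levelize(reads, writes):
    """Group loop iterations into levels of independent iterations.

    reads[i]/writes[i] are sets of array locations touched by iteration i
    through indirect references. An iteration's level is one more than the
    deepest earlier iteration it conflicts with (RAW, WAR or WAW)."""
    n = len(reads)
    level = [0] * n
    last_write = defaultdict(lambda: -1)   # location -> last writing iteration
    last_touch = defaultdict(lambda: -1)   # location -> last read or write
    for i in range(n):
        deps = [last_write[loc] for loc in reads[i]]     # RAW
        deps += [last_touch[loc] for loc in writes[i]]   # WAR / WAW
        level[i] = 1 + max((level[j] for j in deps if j >= 0), default=-1)
        for loc in writes[i]:
            last_write[loc] = i
        for loc in reads[i] | writes[i]:
            last_touch[loc] = i
    return level

# iterations in the same level can run in parallel on the GPU
print(levelize([{0}, {1}, {0}], [{1}, {2}, {3}]))   # -> [0, 1, 0]
```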
Abstract:
We propose a novel numerical method based on a generalized eigenvalue decomposition for solving the diffusion equation governing the correlation diffusion of photons in turbid media. Medical imaging modalities such as diffuse correlation tomography and ultrasound-modulated optical tomography have the (elliptic) diffusion equation, parameterized by a time variable, as their forward model. Hitherto, for the computation of the correlation function, the diffusion equation has been solved repeatedly over the time parameter. We show that the use of a certain time-independent generalized eigenfunction basis decouples the spatial and time dependence of the correlation function, thus allowing greater computational efficiency in arriving at the forward solution. Besides presenting the mathematical analysis of the generalized eigenvalue problem on the basis of spectral theory, we present numerical results comparing the proposed method with the standard technique for solving the diffusion equation.
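The decoupling idea can be demonstrated on a dense stand-in for the assembled operators: for a forward model of the form (A + tau*B)u = s, one generalized eigendecomposition B V = A V diag(lam) with V.T @ A @ V = I serves every value of the time parameter tau, since u(tau) = V @ ((V.T @ s) / (1 + tau*lam)). A sketch with random SPD matrices in place of the discretized diffusion operators:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n = 50
Q = rng.normal(size=(n, n))
A = Q @ Q.T + n * np.eye(n)       # SPD stand-in for the elliptic part
R = rng.normal(size=(n, n))
B = R @ R.T + np.eye(n)           # SPD stand-in for the tau-dependent part
s = rng.normal(size=n)

lam, V = eigh(B, A)               # solved once, reused for every tau
c = V.T @ s
for tau in (0.0, 0.1, 1.0):
    u = V @ (c / (1 + tau * lam))           # forward solution at this tau
    assert np.allclose((A + tau * B) @ u, s)
```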
Abstract:
Charnockite is considered to be generated either through the dehydration of granitic magma by CO2 purging or by solid-state dehydration through CO2 metasomatism during granulite facies metamorphism. To understand the extent of dehydration, CO2 migration is quantitatively modeled in silicate melt and metasomatic fluid as a function of temperature, H2O wt%, pressure, basal CO2 flux and dynamic viscosity. Numerical simulations show that CO2 advection through porous and permeable high-grade metamorphic rocks can generate dehydrated patches close to the CO2 flow path, as illustrated by the occurrences of "incipient charnockites." The CO2 reaction-front velocity constrained by field observations is 0.69 km/m.y., a reasonable value which matches well with other studies. On the other hand, temperature, rate of cooling, and basal CO2 flux are the critical parameters affecting CO2 diffusion through a silicate melt. CO2 diffusion through silicate melt can only occur at temperatures greater than 840 °C and during slow cooling (≤3.7 × 10⁻⁵ °C/yr), features that are typical of magma emplacement in the lower crust. Stalling of CO2 fluxing at ~840 °C explains why some deep-level plutons contain both hydrous and anhydrous (charnockitic) mineral assemblages. CO2 diffusion through silicate melt is virtually insensitive to pressure. Addition of a CO2 basal flux facilitates episodic dehydrated melt migration by generating fracture pathways.