145 results for motion computation
Abstract:
Acoustic modeling using mixtures of multivariate Gaussians is the prevalent approach for many speech processing problems. Computing likelihoods against a large set of Gaussians is required as part of many speech processing systems, and it is the computationally dominant phase for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We express the likelihood computation as a multiplication of matrices representing augmented feature vectors and Gaussian parameters. The computational gain of this approach over traditional methods comes from exploiting the structure of these matrices and from an efficient implementation of their multiplication. In particular, we explore direct low-rank approximation of the Gaussian parameter matrix and indirect derivation of low-rank factors of the Gaussian parameter matrix by optimum approximation of the likelihood matrix. We show that both methods lead to similar speedups, but the latter has far less impact on recognition accuracy. Experiments on the 1,138 word vocabulary RM1 task and the 6,224 word vocabulary TIMIT task using the Sphinx 3.7 system show that, for a typical case, the matrix multiplication based approach leads to an overall speedup of 46% on the RM1 task and 115% on the TIMIT task. Our low-rank approximation methods provide a way of trading off recognition accuracy for a further increase in computational performance, extending the overall speedups up to 61% for RM1 and 119% for TIMIT, for an increase in word error rate (WER) from 3.2% to 3.5% for RM1 and no increase in WER for TIMIT. We also express the pairwise Euclidean distance computation phase in Dynamic Time Warping (DTW) in terms of matrix multiplication, leading to a saving in the number of computational operations. In our experiments, using an efficient implementation of matrix multiplication, this leads to a speedup of 5.6 in computing the pairwise Euclidean distances and an overall speedup of up to 3.25 for DTW.
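As an illustrative NumPy sketch of the two matrix formulations described above (all function names, shapes, and parameter conventions here are my own assumptions, not the paper's code): diagonal-covariance Gaussian log-likelihoods are obtained as a single product of an augmented feature matrix with a Gaussian parameter matrix, and the pairwise squared Euclidean distances needed by DTW are obtained from per-row norms plus one cross matrix product.

```python
import numpy as np

def loglik_by_matmul(X, means, variances):
    """Log-likelihoods of T frames against G diagonal-covariance Gaussians as one
    matrix product: augmented features [x^2 | x | 1] (T x 2D+1) times a
    parameter matrix (2D+1 x G).  Hypothetical names; illustrative only."""
    T = X.shape[0]
    A = np.hstack([X ** 2, X, np.ones((T, 1))])              # augmented feature matrix
    const = -0.5 * np.sum(np.log(2 * np.pi * variances)
                          + means ** 2 / variances, axis=1)  # per-Gaussian constant
    P = np.vstack([-0.5 / variances.T,                       # quadratic-term rows
                   (means / variances).T,                    # linear-term rows
                   const[None, :]])                          # constant row
    return A @ P                                             # (T x G) likelihood matrix

def pairwise_sqdist_by_matmul(X, Y):
    """Squared Euclidean distances between all rows of X (n x D) and Y (m x D):
    ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, so the dominant cost is one matmul."""
    return (np.sum(X ** 2, axis=1)[:, None]
            + np.sum(Y ** 2, axis=1)[None, :]
            - 2.0 * (X @ Y.T))
```

The design point in both cases is the same: the expensive inner loops collapse into one large matrix multiplication, which optimized BLAS routines execute far faster than the equivalent element-wise loops.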
Abstract:
We investigate the effect of a prescribed tangential velocity on the drag force on a circular cylinder in a spanwise uniform cross flow. Using a combination of theoretical and numerical techniques, we attempt to determine the optimal tangential velocity profiles that reduce the drag force acting on the cylindrical body while minimizing the net power consumption, characterized through a non-dimensional power loss coefficient (C-PL). A striking conclusion of our analysis is that the tangential velocity associated with the potential flow, which completely suppresses the drag force, is not optimal at either small or large, but finite, Reynolds numbers. When inertial effects are negligible (Re << 1), theoretical analysis based on the two-dimensional Oseen equations gives the optimal tangential velocity profile that leads to energetically efficient drag reduction. Furthermore, in the limit of zero Reynolds number (Re -> 0), minimum power loss is achieved for a tangential velocity profile corresponding to a shear-free perfect slip boundary. At finite Re, results from numerical simulations indicate that perfect slip is not optimal and that a further reduction in drag can be achieved for reduced power consumption. A gradual increase in the strength of a tangential velocity that involves only the first reflectionally symmetric mode leads to a monotonic reduction in drag and eventual thrust production. Simulations reveal the existence of an optimal strength for which the power consumption attains a minimum. At a Reynolds number of 100, the minimum value of the power loss coefficient (C-PL = 0.37) is obtained when the maximum tangential surface velocity is about one and a half times the free stream uniform velocity, corresponding to a drag reduction of approximately 77%; C-PL = 0.42 and 0.50 for the perfect slip and potential flow cases, respectively. Our results suggest that the potential flow tangential velocity enables energetically efficient propulsion at all Reynolds numbers but optimal drag reduction only for Re -> infinity. The two-dimensional strategy of reducing drag while minimizing net power consumption is shown to be effective in three dimensions via numerical simulation of flow past an infinite circular cylinder at a Reynolds number of 300. Finally, a strategy for reducing drag that is suitable for practical implementation and amenable to experimental testing, through piecewise constant tangential velocities distributed along the cylinder periphery, is proposed and analysed.
Abstract:
We address the classical problem of delta feature computation and interpret the operation involved in terms of Savitzky-Golay (SG) filtering. Features such as the mel-frequency cepstral coefficients (MFCCs), obtained from short-time spectra of the speech signal, are commonly used in speech recognition tasks. In order to incorporate the dynamics of speech, auxiliary delta and delta-delta features, computed as temporal derivatives of the original features, are used. Typically, the delta features are computed in a smooth fashion using local least-squares (LS) polynomial fitting on each feature vector component trajectory. In the light of the original work of Savitzky and Golay, and a recent article by Schafer in IEEE Signal Processing Magazine, we interpret the dynamic feature vector computation for arbitrary derivative orders as SG filtering with a fixed impulse response. This filtering equivalence brings significantly lower latency with no loss in accuracy, as validated by results on a TIMIT phoneme recognition task. The SG filters involved in dynamic parameter computation can be viewed as modulation filters, as proposed by Hermansky.
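A minimal sketch of this filtering view (window size, padding, and names are assumptions rather than the paper's exact configuration): the usual local least-squares delta formula over a window of ±N frames is a fixed antisymmetric FIR filter applied to each coefficient trajectory.

```python
import numpy as np

def delta_features(C, N=2):
    """Delta features via a fixed FIR (Savitzky-Golay slope) filter.
    C: (T, D) matrix of static features (e.g. MFCCs); returns (T, D) deltas.
    The weights n / sum(n^2), n = -N..N, are the local least-squares line slope,
    so this filtering is equivalent to the usual polynomial-fit delta formula."""
    n = np.arange(-N, N + 1)
    w = n / np.sum(n ** 2)                            # least-squares slope weights
    Cp = np.pad(C, ((N, N), (0, 0)), mode='edge')     # replicate edge frames
    return np.stack([np.correlate(Cp[:, d], w, mode='valid')
                     for d in range(C.shape[1])], axis=1)
```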
Abstract:
Acoustic modeling using mixtures of multivariate Gaussians is the prevalent approach for many speech processing problems. Computing likelihoods against a large set of Gaussians is required as part of many speech processing systems, and it is the computationally dominant phase for LVCSR systems. We express the likelihood computation as a multiplication of matrices representing augmented feature vectors and Gaussian parameters. The computational gain of this approach over traditional methods comes from exploiting the structure of these matrices and from an efficient implementation of their multiplication. In particular, we explore direct low-rank approximation of the Gaussian parameter matrix and indirect derivation of low-rank factors of the Gaussian parameter matrix by optimum approximation of the likelihood matrix. We show that both methods lead to similar speedups, but the latter has far less impact on recognition accuracy. Experiments on a 1,138 word vocabulary RM1 task using the Sphinx 3.7 system show that, for a typical case, the matrix multiplication approach leads to an overall speedup of 46%. Both low-rank approximation methods increase the speedup to around 60%, with the former increasing the word error rate (WER) from 3.2% to 6.6%, while the latter increases it from 3.2% to 3.5%.
Abstract:
In this paper, we consider a distributed function computation setting in which there are m distributed but correlated sources X1, ..., Xm and a receiver interested in computing an s-dimensional subspace generated by [X1, ..., Xm]Γ for some (m × s) matrix Γ of rank s. We construct a scheme based on nested linear codes and characterize the achievable rates obtained using the scheme. The proposed nested-linear-code approach performs at least as well as the Slepian-Wolf scheme in terms of sum-rate performance for all subspaces and source distributions. In addition, for a large class of distributions and subspaces, the scheme improves upon the Slepian-Wolf approach. The nested-linear-code scheme may be viewed as uniting, under a common framework, both the Korner-Marton approach of using a common linear encoder and the Slepian-Wolf approach of employing different encoders at each source. Along the way, we prove an interesting and fundamental structural result on the nature of subspaces of an m-dimensional vector space V with respect to a normalized measure of entropy. Here, each element in V corresponds to a distinct linear combination of a set {Xi} (i = 1, ..., m) of m random variables whose joint probability distribution function is given.
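The Korner-Marton ingredient mentioned above can be illustrated with a small toy over GF(2) (this sketch is my own hypothetical construction using a Hamming(7,4) parity-check matrix; it is not the nested-code scheme of the paper): both encoders apply the same linear map, and the receiver decodes only the desired linear combination of the sources, never the sources themselves.

```python
import numpy as np

# Toy Korner-Marton setup: x and y are length-7 binary sources that differ in at
# most one position, so z = x XOR y has Hamming weight <= 1.  Both encoders apply
# the SAME parity-check matrix H of a Hamming(7,4) code and send only the 3-bit
# syndromes; the receiver adds them and decodes z from H z, without ever
# reconstructing x or y individually.
H = np.array([[int(b) for b in format(i, '03b')] for i in range(1, 8)]).T  # 3 x 7

def syndrome(v):
    return H.dot(v) % 2

def decode_weight_le1(s):
    """Recover z with Hamming weight <= 1 from its syndrome s = H z (mod 2):
    the syndrome of a single 1 in position j equals column j of H."""
    z = np.zeros(7, dtype=int)
    if s.any():
        j = next(k for k in range(7) if np.array_equal(H[:, k], s))
        z[j] = 1
    return z

x = np.array([1, 0, 1, 1, 0, 0, 1])
y = x.copy()
y[4] ^= 1                                    # the sources differ in one position
s = (syndrome(x) + syndrome(y)) % 2          # receiver adds the two 3-bit syndromes
print(decode_weight_le1(s))                  # -> x XOR y, at a rate of 3 bits per source
```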
Abstract:
Let X1, ..., Xm be a set of m statistically dependent sources over the common alphabet F_q that are linearly independent when considered as functions over the sample space. We consider a distributed function computation setting in which the receiver is interested in the lossless computation of the elements of an s-dimensional subspace W spanned by the elements of the row vector [X1, ..., Xm]Γ, in which the (m × s) matrix Γ has rank s. A sequence of three increasingly refined approaches is presented, all based on linear encoders. The first approach uses a common matrix to encode all the sources and a Korner-Marton-like receiver to directly compute W. The second improves upon the first by showing that it is often more efficient to compute a carefully chosen superspace U of W. The superspace is identified by showing that the joint distribution of the {Xi} induces a unique decomposition of the set of all linear combinations of the {Xi} into a chain of subspaces identified by a normalized measure of entropy. This subspace chain also suggests a third approach, one that employs nested codes. For any joint distribution of the {Xi} and any W, the sum-rate of the nested-code approach is no larger than that under the Slepian-Wolf (SW) approach, in which W is computed by first recovering each of the {Xi}. For a large class of joint distributions and subspaces W, the nested-code approach is shown to improve upon SW. Additionally, a class of source distributions and subspaces is identified for which the nested-code approach is sum-rate optimal.
Abstract:
In this paper we present a segmentation algorithm to extract foreground object motion in a moving-camera scenario without any preprocessing step such as tracking selected features, video alignment, or foreground segmentation. By viewing camera motion estimation as a curve-fitting problem on advected particle trajectories, we use RANSAC to find the polynomial that best fits the camera motion and identify all trajectories that correspond to it. The remaining trajectories are those due to foreground motion. Using the superposition principle, we subtract the camera-induced motion from the foreground trajectories and obtain the true object-induced trajectories. We show that our method performs on par with the state-of-the-art technique, with an execution time speedup of 10x-40x. We compare results on real-world datasets such as UCF-ARG, UCF Sports and Liris-HARL. We further show that the method can be used to perform video alignment.
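A minimal one-dimensional sketch of the idea (polynomial degree, thresholds, and names are assumptions, not the paper's implementation): RANSAC fits a polynomial in time to the dominant motion of the advected particles, trajectories consistent with that polynomial are labeled as camera-induced, and the fitted camera motion is subtracted from the remaining trajectories by superposition.

```python
import numpy as np

def split_trajectories(trajs, degree=2, n_iters=300, tol=1.0, seed=0):
    """trajs: (P, T) array of one-dimensional displacements of P advected
    particles over T frames.  RANSAC: sample degree+1 (frame, particle) points,
    fit a polynomial in time, keep the model with the most supporting points
    (the camera motion).  Trajectories mostly consistent with it are background;
    the camera model is subtracted from the rest to recover object motion."""
    P, T = trajs.shape
    t = np.arange(T, dtype=float)
    rng = np.random.default_rng(seed)
    best_coeffs, best_support = None, -1
    for _ in range(n_iters):
        frames = rng.choice(T, size=degree + 1, replace=False)   # distinct times
        rows = rng.integers(0, P, size=degree + 1)               # random particles
        coeffs = np.polyfit(t[frames], trajs[rows, frames], degree)
        support = np.sum(np.abs(trajs - np.polyval(coeffs, t)) < tol)
        if support > best_support:
            best_coeffs, best_support = coeffs, support
    camera = np.polyval(best_coeffs, t)
    is_background = np.mean(np.abs(trajs - camera) < tol, axis=1) > 0.5
    foreground = trajs[~is_background] - camera      # superposition principle
    return is_background, foreground
```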
Abstract:
GPUs have been used for parallel execution of DOALL loops. However, loops with indirect array references can potentially cause cross-iteration dependences which are hard to detect using existing compilation techniques. Applications with such loops cannot easily use the GPU and hence do not benefit from its tremendous compute capabilities. In this paper, we present an algorithm to compute at runtime the cross-iteration dependences in such loops. The algorithm uses both the CPU and the GPU to compute the dependences. Specifically, it effectively uses the compute capabilities of the GPU to quickly collect the memory accesses performed by the iterations by executing the slice functions generated for the indirect array accesses. Using the dependence information, the loop iterations are levelized such that each level contains independent iterations which can be executed in parallel. Another interesting aspect of the proposed solution is that it pipelines the dependence computation of the future level with the actual computation of the current level to effectively utilize the resources available in the GPU. We use an NVIDIA Tesla C2070 to evaluate our implementation using benchmarks from the Polybench suite and some synthetic benchmarks. Our experiments show that the proposed technique can achieve an average speedup of 6.4x on loops with a reasonable number of cross-iteration dependences.
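A small CPU-side sketch of the levelization step (the data layout and names are my own; the paper collects the accesses on the GPU by executing slice functions): given the locations read and written by each iteration, every iteration is assigned the smallest level consistent with its cross-iteration dependences, so all iterations within a level can run in parallel.

```python
from collections import defaultdict

def conflicts(acc_i, acc_j):
    """True if two iterations touch a common location with at least one write
    (flow, anti, or output dependence)."""
    reads_i, writes_i = acc_i
    reads_j, writes_j = acc_j
    return bool(writes_i & (reads_j | writes_j)) or bool(reads_i & writes_j)

def levelize(accesses):
    """accesses[i] = (reads, writes): sets of array locations touched by
    iteration i.  Returns a list of levels, each a list of mutually independent
    iterations that may execute in parallel; levels must run in order."""
    n = len(accesses)
    level = [0] * n
    for i in range(n):
        for j in range(i):
            if conflicts(accesses[j], accesses[i]):
                level[i] = max(level[i], level[j] + 1)
    groups = defaultdict(list)
    for i, l in enumerate(level):
        groups[l].append(i)
    return [groups[l] for l in sorted(groups)]

# Example: a[idx[i]] += b[i] with idx = [0, 1, 0, 2]; iterations 0 and 2 both
# write a[0], so iteration 2 lands in a later level than iteration 0.
print(levelize([({0}, {0}), ({1}, {1}), ({0}, {0}), ({2}, {2})]))   # [[0, 1, 3], [2]]
```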
Abstract:
For one-dimensional flexible objects such as ropes, chains, and hair, the assumption of constant length is realistic for large-scale 3D motion. Moreover, when the motion or disturbance at one end gradually dies down along the curve defining the one-dimensional flexible object, the motion appears "natural". This paper presents a purely geometric and kinematic approach for deriving more natural and length-preserving transformations of planar and spatial curves. Techniques from variational calculus are used to determine analytical conditions, and it is shown that the velocity at any point on the curve must be along the tangent at that point in order to preserve length and yield the feature of diminishing motion. It is shown that for the special case of a straight line, the analytical conditions lead to the classical tractrix curve solution. Since analytical solutions exist for a tractrix curve, the motion of a piecewise linear curve can be solved in closed form and thus can be applied to the resolution of redundancy in hyper-redundant robots. Simulation results for several planar and spatial curves and various input motions of one end are used to illustrate the features of motion damping and eventual alignment with the perturbation vector.
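As a rough numerical illustration of the tangent-velocity idea (a discrete "follow the leader" approximation under my own simplifying assumptions, not the paper's closed-form tractrix solution): when the head of a piecewise linear curve is displaced, each subsequent point is pulled along the line joining it to its updated predecessor, so every segment keeps its length and the disturbance dies down along the curve.

```python
import numpy as np

def drag_curve(points, head_displacement):
    """Length-preserving update of a piecewise linear curve when its head moves.
    points: (n, d) array of joint positions, points[0] is the head.
    Each point moves towards its already-updated predecessor so that segment
    lengths stay fixed and the disturbance decays along the curve.  A discrete
    tractrix-like approximation, not the analytical solution."""
    new = points.copy().astype(float)
    new[0] = points[0] + head_displacement
    for i in range(1, len(points)):
        length = np.linalg.norm(points[i] - points[i - 1])   # original segment length
        direction = new[i] - new[i - 1]        # from updated predecessor to this point's old position
        direction /= np.linalg.norm(direction)
        new[i] = new[i - 1] + length * direction             # keep the segment length fixed
    return new

# Example: a straight chain of 5 unit links; move the head slightly sideways.
chain = np.stack([np.arange(5.0), np.zeros(5)], axis=1)
print(drag_curve(chain, np.array([0.0, 0.2])))
```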
Abstract:
In the present investigation, efforts were made to study the different frictional responses of materials with varying crystal structure and hardness during sliding against a relatively harder material of different surface textures and roughness. In the experiments, pins were made of pure metals and alloys with significantly different hardness values. The pure metals were selected from different classes of crystal structures, such as face centered cubic (FCC), body centered cubic (BCC), body centered tetragonal (BCT) and hexagonal close packed (HCP) structures. Surface textures with varying roughness were generated on the counterpart plate, which was made of H-11 die steel. The experiments were conducted under dry and lubricated conditions using an inclined pin-on-plate sliding tester for various normal loads in an ambient environment. It was found that the coefficient of friction is controlled by the surface texture of the harder mating surface. Further, two kinds of frictional response, namely steady-state and stick-slip, were observed during sliding. More specifically, a steady-state frictional response was observed for the FCC metals, alloys and materials with higher hardness. A stick-slip frictional response was observed for the metals which have a limited number of slip systems, such as BCT and HCP. In addition, the stick-slip frictional response was dependent on the normal load, lubrication, hardness and surface texture of the counterpart material. However, for a given kind of surface texture, the roughness of the surface significantly affects neither the average coefficient of friction nor the amplitude of stick-slip oscillation.
Abstract:
Nearly pollution-free solutions of the Helmholtz equation for k-values corresponding to visible light are demonstrated and verified through experimentally measured forward scattered intensity from an optical fiber. Numerically accurate solutions are, in particular, obtained through a novel reformulation of the H-1 optimal Petrov-Galerkin weak form of the Helmholtz equation. Specifically, within a globally smooth polynomial reproducing framework, the compact and smooth test functions are so designed that their normal derivatives are zero everywhere on the local boundaries of their compact supports. This circumvents the need for a priori knowledge of the true solution on the support boundary and relieves the weak form of any jump boundary terms. For numerical demonstration of the above formulation, we used a multimode optical fiber in an index matching liquid as the object. The scattered intensity and its normal derivative are computed from the scattered field obtained by solving the Helmholtz equation, using the new formulation and the conventional finite element method. By comparing the results with the experimentally measured scattered intensity, the stability of the solution through the new formulation is demonstrated and its closeness to the experimental measurements verified.
Abstract:
The Himalayan region is one of the most seismically active regions in the world, and many researchers have highlighted the possibility of a great seismic event in the near future due to the seismic gap. Seismic hazard analysis and microzonation of highly populated places in the region are mandatory at a regional scale. A region-specific Ground Motion Predictive Equation (GMPE) is an important input for seismic hazard analysis in macro- and micro-zonation studies. The few GMPEs developed for India are based on recorded data and are applicable only to particular ranges of magnitudes and distances. This paper focuses on the development of a new GMPE for the Himalayan region considering both recorded and simulated earthquakes of moment magnitude 5.3-8.7. The Finite Fault simulation model has been used for the ground motion simulation, considering region-specific seismotectonic parameters from past earthquakes and source models. Simulated acceleration time histories and response spectra are compared with the available records. In the absence of a large number of recorded data, simulations have been performed at unavailable locations by adopting the Apparent Stations concept. Earthquakes recorded up to 2007 have been used for the development of the new GMPE, and earthquake records after 2007 are used to validate it. The proposed GMPE matches the recorded data very well, and also agrees with other highly ranked GMPEs developed elsewhere and applicable to the region. Comparison of response spectra also shows good agreement with recorded earthquake data. Quantitative analysis of residuals for the proposed GMPE and region-specific GMPEs against records of the Nepal-India 2011 earthquake of Mw 5.7 shows that the proposed GMPE predicts peak ground acceleration and spectral acceleration over the entire distance and period range with lower percentage residuals than existing region-specific GMPEs. Crown Copyright (C) 2013 Published by Elsevier Ltd. All rights reserved.
Abstract:
Measurement of in-plane motion with high resolution and large bandwidth enables model identification and real-time control of motion stages. This paper presents an optical beam deflection based system for measurement of in-plane motion of both macro- and micro-scale motion stages. A curved reflector is integrated with the motion stage to achieve sensitivity to in-plane translational motion along two axes. Under optimal settings, the measurement system is shown to theoretically achieve sub-angstrom measurement resolution over a bandwidth in excess of 1 kHz and negligible cross-sensitivity to linear motion. Subsequently, the proposed technique is experimentally demonstrated by measuring the in-plane motion of a piezo flexure stage and a scanning probe microcantilever. For the former, reflective spherical balls of different radii are employed to measure the in-plane motion, and the measured sensitivities are shown to agree with theoretical values, on average, to within 8.3%. For the latter, a prototype polydimethylsiloxane micro-reflector is integrated with the microcantilever. The measured in-plane motion of the microcantilever probe is used to identify nonlinearities and the transient dynamics of the piezo stage upon which the probe is mounted. These are subsequently compensated by means of feedback control. (C) 2013 AIP Publishing LLC.
Abstract:
In this paper, we propose a simple and effective approach to classify H.264 compressed videos by capturing orientation information from the motion vectors. Our major contribution is the computation of Histograms of Oriented Motion Vectors (HOMV) for overlapping hierarchical space-time cubes; the selected space-time cubes are partially overlapped. HOMV is found to be very effective in describing the motion characteristics of these cubes. We then use a Bag of Features (BOF) approach to represent the video as a histogram of HOMV keywords, obtained using k-means clustering. The video feature thus computed is found to be very effective in classifying videos. We demonstrate our results with experiments on two large publicly available video databases.
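A sketch of the per-cube descriptor (bin count, magnitude weighting, and normalization are assumptions, not necessarily the paper's exact choices): motion-vector orientations inside one space-time cube are quantized into a magnitude-weighted histogram; a video is then described, bag-of-features style, as a histogram of k-means "keywords" built from these cube descriptors.

```python
import numpy as np

def homv(motion_vectors, n_bins=8):
    """Histogram of Oriented Motion Vectors for one space-time cube.
    motion_vectors: (N, 2) array of (dx, dy) vectors decoded from the compressed
    stream; returns an L1-normalized, magnitude-weighted orientation histogram."""
    dx, dy = motion_vectors[:, 0], motion_vectors[:, 1]
    angles = np.arctan2(dy, dx) % (2 * np.pi)            # orientation in [0, 2*pi)
    magnitudes = np.hypot(dx, dy)
    bins = (angles / (2 * np.pi / n_bins)).astype(int) % n_bins
    hist = np.bincount(bins, weights=magnitudes, minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist

# Example usage with three motion vectors from one cube.
mv = np.array([[4.0, 0.1], [3.5, -0.2], [0.0, 2.0]])
print(homv(mv))
```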