90 results for Speech articulation tests


Abstract:

We introduce a novel temporal feature of a signal, the extrema-based signal track length (ESTL), for the problem of speech segmentation. We show that the ESTL measure is sensitive to both the amplitude and the frequency of the signal. The short-time ESTL (ST_ESTL) offers a promising way to capture the significant segments of a speech signal, where the segments correspond to acoustic units of speech with distinct temporal waveforms. We compare ESTL-based segmentation with the ML and STM methods and find that it performs as well as spectral-feature-based segmentation at lower computational complexity.
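The abstract does not define ESTL, so the following is only a sketch under one assumed reading: the track length is taken as the sum of absolute amplitude differences between consecutive local extrema. The function names (`local_extrema`, `estl`, `short_time_estl`, `tone`) and the window parameters are hypothetical and are not taken from the paper. Under this reading, a sinusoid of amplitude A and frequency f accumulates roughly 4·A·f per second, which is consistent with the stated sensitivity to both amplitude and frequency.

```python
import math

def local_extrema(x):
    """Indices where the slope changes sign (strict local maxima or minima)."""
    idx = []
    for i in range(1, len(x) - 1):
        if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0:
            idx.append(i)
    return idx

def estl(x):
    """ASSUMED definition: sum of |difference| between consecutive extrema values."""
    ext = [x[i] for i in local_extrema(x)]
    return sum(abs(b - a) for a, b in zip(ext, ext[1:]))

def short_time_estl(x, win, hop):
    """ESTL evaluated on sliding windows of `win` samples, advanced by `hop`."""
    return [estl(x[s:s + win]) for s in range(0, len(x) - win + 1, hop)]

def tone(amp, freq, fs=8000, seconds=1.0):
    """Hypothetical test signal: a pure sinusoid sampled at `fs` Hz."""
    n = int(fs * seconds)
    return [amp * math.sin(2.0 * math.pi * freq * k / fs) for k in range(n)]

if __name__ == "__main__":
    print(estl(tone(1.0, 100)), estl(tone(2.0, 100)), estl(tone(1.0, 200)))
```

In this sketch, doubling either the amplitude or the frequency of the tone roughly doubles the measure, so a change in the short-time value can flag a boundary between segments with different waveform character.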

Abstract:

This paper considers the high-rate performance of source coding for noisy discrete symmetric channels with random index assignment (IA). Accurate analytical models are developed to characterize the expected distortion performance of vector quantization (VQ) for a large class of distortion measures. It is shown that when the point density is continuous, the distortion can be approximated as the sum of the source quantization distortion and the channel-error-induced distortion. Expressions are also derived for the continuous point density that minimizes the expected distortion. Next, for the case of mean squared error distortion, a more accurate analytical model for the distortion is derived by allowing the point density to have a singular component. The extent of the singularity is also characterized. These results provide analytical models for the expected distortion performance of both conventional VQ and channel-optimized VQ. As a practical example, compression of the linear predictive coding parameters in the wideband speech spectrum is considered, with the log spectral distortion as the performance metric. The theory correctly predicts the channel error rate that is permissible for operation at a particular level of distortion.

Abstract:

The design and operation of the minimum cost classifier, where the total cost is the sum of the measurement cost and the classification cost, is computationally complex. Noting the difficulties associated with this approach, decision tree design directly from a set of labelled samples is proposed in this paper. The feature space is first partitioned to transform the problem to one of discrete features. The resulting problem is solved by a dynamic programming algorithm over an explicitly ordered state space of all outcomes of all feature subsets. The solution procedure is very general and is applicable to any minimum cost pattern classification problem in which each feature has a finite number of outcomes. These techniques are applied to (i) voiced, unvoiced, and silence classification of speech, and (ii) spoken vowel recognition. The resulting decision trees are operationally very efficient and yield attractive classification accuracies.

Abstract:

A novel method is proposed for determining the fracture toughness of graded, microstructurally complex (Pt,Ni)Al bond coats using edge-notched doubly clamped beams subjected to bending. Micron-scale beams are machined using a focused ion beam and loaded in bending under a nanoindenter. Failure loads gathered from the pop-ins in the load-displacement curves are combined with XFEM analysis to calculate K_c at individual zones, free from substrate effects. The testing technique and the sources of measurement error are described, and possible micromechanisms of fracture in such heterogeneous coatings are discussed.

Abstract:

Ampcalculator (AMPC) is a Mathematica-based program that was made publicly available some time ago by Unterdorfer and Ecker. It enables the user to compute several processes at one loop (up to O(p^4)) in SU(3) chiral perturbation theory, including matrix elements and form factors for strong and non-leptonic weak processes with at most six external states. The original authors used it to compute some novel processes and tested it against well-known results. Here we present the results of several thorough checks of the package; the exhaustive checks performed by the original authors are not publicly available, hence the present effort. Some new results are obtained from the software, especially in the kaon odd-intrinsic-parity non-leptonic decay sector involving the coupling G_27. Another illustrative set of tree-level amplitudes we provide concerns tau decays into several mesons, including quark mass effects, which is of use to the BELLE experiment. All eight meson-meson scattering amplitudes have been checked. The kaon Compton amplitude has been checked, and a minor error in the published results is pointed out. This exercise is tutorial-based: several input and output notebooks are made available as ancillary files on the arXiv. Some of the additional notebooks contain the explicit expressions that we used for comparison with established results. The purpose is to encourage users to apply the software to their specific needs. An automatic amplitude generator of this type can provide error-free outputs that can serve as inputs for further simplification, and in varied scenarios such as applications of chiral perturbation theory at finite temperature, density, and volume. It can also be used by students as a learning aid in low-energy hadron dynamics.

Abstract:

Acoustic modeling using mixtures of multivariate Gaussians is the prevalent approach for many speech processing problems. Computing likelihoods against a large set of Gaussians is required in many speech processing systems, and it is the computationally dominant phase of Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We express the likelihood computation as a multiplication of matrices representing augmented feature vectors and Gaussian parameters. The computational gain of this approach over traditional methods comes from exploiting the structure of these matrices and from an efficient implementation of their multiplication. In particular, we explore direct low-rank approximation of the Gaussian parameter matrix, and indirect derivation of low-rank factors of the Gaussian parameter matrix by optimum approximation of the likelihood matrix. We show that both methods lead to similar speedups, but the latter has a far smaller impact on recognition accuracy. Experiments on the 1,138-word vocabulary RM1 task and the 6,224-word vocabulary TIMIT task using the Sphinx 3.7 system show that, in a typical case, the matrix-multiplication-based approach leads to an overall speedup of 46% on the RM1 task and 115% on the TIMIT task. Our low-rank approximation methods provide a way of trading recognition accuracy for a further increase in computational performance, extending the overall speedups to 61% for RM1 and 119% for TIMIT, with an increase in word error rate (WER) from 3.2% to 3.5% for RM1 and no increase in WER for TIMIT. We also express the pairwise Euclidean distance computation phase of Dynamic Time Warping (DTW) in terms of matrix multiplication, leading to a saving in computational operations. In our experiments using an efficient implementation of matrix multiplication, this leads to a speedup of 5.6 in computing the pairwise Euclidean distances and an overall speedup of up to 3.25 for DTW.
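The idea of writing both computations as matrix products can be illustrated with a small sketch. It assumes diagonal-covariance Gaussians and a particular augmentation, [1, x, x²], which the abstract does not specify; all function names are hypothetical, and the plain-Python `matmul_t` only stands in for the optimized matrix multiplication on which the reported speedups depend.

```python
import math

def matmul_t(a, b):
    """Multiply a (n x k) by the transpose of b (m x k), giving an n x m matrix."""
    return [[sum(p * q for p, q in zip(ra, rb)) for rb in b] for ra in a]

def augment(x):
    """Augmented feature vector [1, x_1..x_d, x_1^2..x_d^2] (assumed layout)."""
    return [1.0] + list(x) + [v * v for v in x]

def gaussian_row(mean, var):
    """Parameter row g such that dot(g, augment(x)) = log N(x; mean, diag(var))."""
    const = -0.5 * sum(math.log(2.0 * math.pi * v) + m * m / v
                       for m, v in zip(mean, var))
    linear = [m / v for m, v in zip(mean, var)]
    quad = [-0.5 / v for v in var]
    return [const] + linear + quad

def log_likelihoods(frames, gaussians):
    """All frame-by-Gaussian log-likelihoods as one matrix product."""
    feats = [augment(x) for x in frames]
    params = [gaussian_row(m, v) for m, v in gaussians]
    return matmul_t(feats, params)

def direct_log_likelihood(x, mean, var):
    """Reference: per-dimension evaluation of the same diagonal Gaussian."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def pairwise_sq_dists(xs, ys):
    """Squared Euclidean distances via ||x||^2 + ||y||^2 - 2 x.y as a product."""
    left = [[sum(v * v for v in x), 1.0] + list(x) for x in xs]
    right = [[1.0, sum(v * v for v in y)] + [-2.0 * v for v in y] for y in ys]
    return matmul_t(left, right)
```

Once the parameters are packed into a single matrix, that matrix is the natural target for a low-rank approximation of the kind the abstract describes, and the same packing trick carries over to the DTW distance step.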

Abstract:

This paper studies the effect of the frequency of base shaking on the dynamic response of unreinforced and reinforced soil slopes through a series of shaking table tests. The slopes were constructed from clayey sand and reinforced with geogrids. Two slope angles, 45 degrees and 60 degrees, were used, and the quantity and location of the reinforcement were varied between tests. The shaking acceleration was held constant at 0.3 g in all tests to maximize the response, and the shaking frequency was 2 Hz, 5 Hz, or 7 Hz in different tests. Each slope was instrumented with ultrasonic displacement sensors and accelerometers at different elevations. The responses of the different slopes are compared in terms of slope deformation and the acceleration amplifications measured at different elevations. Displacements at all elevations increased with frequency for all slopes, whereas the effect of frequency on acceleration amplification was not significant for reinforced slopes. The acceleration and displacement responses did not increase in proportion to the increase in frequency, suggesting that the role of frequency in the seismic response is very important. Reinforced slopes showed smaller displacements than unreinforced slopes at all frequency levels. (C) 2012 Elsevier Ltd. All rights reserved.

Abstract:

We consider the speech production mechanism and the associated linear source-filter model. For voiced speech sounds in particular, the source/glottal excitation is modeled as a stream of impulses and the filter as a cascade of second-order resonators. We show that the process of sampling speech signals can be modeled as filtering a stream of Dirac impulses (a model for the excitation) with a kernel function (the vocal tract response), and then sampling uniformly. We show that the problem of estimating the excitation is equivalent to the problem of recovering a stream of Dirac impulses from samples of a filtered version. We present associated algorithms based on the annihilating filter and also make a comparison with the classical linear prediction technique, which is well known in speech analysis. Results on synthesized as well as natural speech data are presented.

Abstract:

We address the problem of speech enhancement in real-world noisy scenarios. We propose to solve the problem in two stages: a generalized spectral subtraction technique, followed by a sequence of perceptually motivated post-processing algorithms. The role of the post-processing algorithms is to compensate for the effects of noise and to suppress any artifacts created by the first-stage processing. The key post-processing mechanisms are aimed at suppressing musical noise, enhancing the formant structure of voiced speech, and denoising the linear-prediction residual. The parameter values of the techniques are chosen to optimize the enhancement performance, which is evaluated experimentally as a function of the parameters. We used the Carnegie Mellon University Arctic database for our experiments and considered three real-world noise types: fan noise, car noise, and motorbike noise. The enhancement performance was evaluated through listening experiments with 12 subjects. For positive signal-to-noise ratios (SNRs), the listeners reported a clear improvement in perceived quality over the noisy signal, with an average increase of 0.5 in the mean opinion score (MOS). For negative SNRs, however, the improvement was found to be marginal.
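The abstract does not give the form of its generalized spectral subtraction, so the sketch below uses one common formulation as an assumption: subtraction in a power-law domain with an over-subtraction factor and a spectral floor. The function name and the parameters `alpha`, `beta`, and `gamma` are hypothetical and their defaults are illustrative, not the values tuned in the paper. The function operates on per-bin magnitude spectra of a single frame and leaves the transform and the phase handling to the caller.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=2.0, beta=0.01, gamma=2.0):
    """Per-bin generalized spectral subtraction (ASSUMED formulation).

    noisy_mag : magnitude spectrum of the noisy frame
    noise_mag : estimated noise magnitude spectrum (same length)
    alpha     : over-subtraction factor
    beta      : spectral floor, as a fraction of the noise term
    gamma     : exponent of the domain in which subtraction happens
                (2.0 corresponds to power-spectrum subtraction)
    """
    enhanced = []
    for y, n in zip(noisy_mag, noise_mag):
        noise_term = n ** gamma
        diff = y ** gamma - alpha * noise_term
        # The floor keeps bins from going negative or collapsing to zero.
        floored = max(diff, beta * noise_term)
        enhanced.append(floored ** (1.0 / gamma))
    return enhanced

if __name__ == "__main__":
    print(spectral_subtract([3.0, 1.0], [1.0, 1.0]))
```

Bins that survive the subtraction in isolation are one source of the musical noise that the second stage is described as suppressing, and the floor is one simple way to limit it in the first stage.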