77 results for Robust feasibility
Abstract:
We develop a new dictionary learning algorithm, called l(1)-K-SVD, that minimizes the l(1) distortion on the data term. The proposed formulation corresponds to maximum a posteriori estimation assuming a Laplacian prior on the coefficient matrix and additive noise, and is, in general, robust to non-Gaussian noise. The l(1) distortion is minimized using the iteratively reweighted least-squares algorithm. The dictionary atoms and the corresponding sparse coefficients are estimated simultaneously in the dictionary update step. Experimental results show that l(1)-K-SVD achieves greater noise robustness, faster convergence, and a higher atom recovery rate than the method of optimal directions, K-SVD, and the robust dictionary learning algorithm (RDL), in Gaussian as well as non-Gaussian noise. For a fixed value of sparsity, number of dictionary atoms, and data dimension, l(1)-K-SVD outperforms K-SVD and RDL on small training sets. We also consider the generalized l(p), 0 < p < 1, data metric to tackle heavy-tailed/impulsive noise. In an image denoising application, l(1)-K-SVD yields a higher peak signal-to-noise ratio (PSNR) than K-SVD for Laplacian noise. The structural similarity index increases by 0.1 for low input PSNR, which is significant and demonstrates the efficacy of the proposed method. (C) 2015 Elsevier B.V. All rights reserved.
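The iteratively reweighted least-squares idea mentioned in the abstract can be illustrated on a deliberately small problem. The sketch below is not the l(1)-K-SVD algorithm itself: it only fits a single scalar coefficient `a` so that `a * d` approximates `y` in the l(1) sense, and compares the result with ordinary least squares when one sample is corrupted. The function names, the toy data, and the damping constant `eps` are assumptions made for this illustration.

```python
def least_squares_scale(d, y):
    """Scalar a minimizing sum_i (y_i - a*d_i)^2 (closed form)."""
    return sum(di * yi for di, yi in zip(d, y)) / sum(di * di for di in d)


def irls_l1_scale(d, y, n_iter=50, eps=1e-8):
    """Scalar a approximately minimizing sum_i |y_i - a*d_i| via IRLS.

    Each pass solves a weighted least-squares problem whose weights are
    the inverse absolute residuals of the previous estimate. eps is an
    illustrative safeguard against division by zero.
    """
    a = least_squares_scale(d, y)  # start from the l2 solution
    for _ in range(n_iter):
        w = [1.0 / max(abs(yi - a * di), eps) for di, yi in zip(d, y)]
        num = sum(wi * di * yi for wi, di, yi in zip(w, d, y))
        den = sum(wi * di * di for wi, di in zip(w, d))
        a = num / den
    return a


# Toy data: y = 2*d except for one impulsive outlier in the last sample.
d = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 60.0]

a_ls = least_squares_scale(d, y)  # pulled towards the outlier
a_l1 = irls_l1_scale(d, y)        # stays close to the true slope of 2
print(a_ls, a_l1)
```

On this toy data the least-squares estimate is dragged far from 2 by the single outlier, while the reweighted estimate settles near 2, which is the kind of robustness to impulsive noise that motivates an l(1) data term.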
Abstract:
Acoustic-feature-based speech (syllable) rate estimation and syllable nuclei detection are important problems in automatic speech recognition (ASR), computer-assisted language learning (CALL), and fluency analysis. A typical solution to both problems consists of two stages. The first stage computes a short-time feature contour such that most of the peaks of the contour correspond to the syllabic nuclei. In the second stage, the peaks corresponding to the syllable nuclei are detected. In this work, instead of peak detection, we perform mode-shape classification, which is formulated as a supervised binary classification problem: mode-shapes representing the syllabic nuclei form one class and the remaining mode-shapes form the other. We use the temporal correlation and selected sub-band correlation (TCSSBC) feature contour, and the mode-shapes in the TCSSBC feature contour are converted into a set of feature vectors using an interpolation technique. A support vector machine classifier is used for the classification. Experiments are performed separately on the Switchboard, TIMIT, and CTIMIT corpora in a five-fold cross-validation setup. The average correlation coefficients for syllable rate estimation are 0.6761, 0.6928, and 0.3604 for the three corpora, respectively, which outperform those obtained by the best of the existing peak detection techniques. Similarly, the average F-scores (syllable level) for syllable nuclei detection are 0.8917, 0.8200, and 0.7637 for the three corpora, respectively. (C) 2016 Elsevier B.V. All rights reserved.
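The step of turning mode-shapes into fixed-length feature vectors might look roughly like the sketch below. The abstract does not specify how modes are delimited or which interpolation is used, so two assumptions are made here: a mode is taken to be the stretch of contour between consecutive local minima, and each mode is resampled by linear interpolation. The function names and the toy contour are hypothetical, and the classifier stage is omitted.

```python
def split_into_modes(contour):
    """Split a contour at interior local minima (valley to valley).

    Assumption for this sketch: a mode-shape is the piece of contour
    between two consecutive valleys, including both end points.
    """
    valleys = [0]
    for i in range(1, len(contour) - 1):
        if contour[i] < contour[i - 1] and contour[i] <= contour[i + 1]:
            valleys.append(i)
    valleys.append(len(contour) - 1)
    # Keep only pieces with at least three samples (rise and fall).
    return [contour[a:b + 1] for a, b in zip(valleys, valleys[1:]) if b - a >= 2]


def resample(segment, n_points):
    """Linearly interpolate a segment onto n_points evenly spaced positions."""
    last = len(segment) - 1
    vector = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)  # fractional index into segment
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        vector.append((1 - frac) * segment[lo] + frac * segment[hi])
    return vector


# Toy contour with two bumps separated by a valley at index 4.
contour = [0.0, 1.0, 3.0, 1.0, 0.5, 2.0, 4.0, 2.0, 0.0]
modes = split_into_modes(contour)
vectors = [resample(m, 3) for m in modes]
print(vectors)
```

Each resulting vector has the same length regardless of how long the underlying mode is, which is what allows a standard classifier such as a support vector machine to be trained on them.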