6 results for Perceptual Speech Evaluation
at Indian Institute of Science - Bangalore - India
Abstract:
We address the novel problem of jointly evaluating multiple speech patterns for automatic speech recognition and training. We propose solutions based on both the non-parametric dynamic time warping (DTW) algorithm and the parametric hidden Markov model (HMM). We show that a hybrid approach is quite effective for noisy speech recognition. We extend the concept to HMM training in which some patterns may be noisy or distorted. Using the concept of a "virtual pattern" developed for joint evaluation, we propose selective iterative training of HMMs. When evaluated on isolated word recognition with burst/transient noisy speech, the new algorithms yield a significant improvement in recognition accuracy over those that do not use the joint evaluation strategy.
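As a point of reference for the DTW component mentioned above, here is a minimal sketch of classic pairwise DTW between two one-dimensional sequences. It is not the joint multi-pattern algorithm the abstract proposes; the function name `dtw_distance` and the absolute-difference local cost are assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Accumulated DTW cost between two 1-D sequences (illustrative sketch).

    Uses an absolute-difference local cost and the standard three-way
    recursion (insertion, deletion, match).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    # A time-stretched copy aligns at zero cost.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

In a speech setting the scalar samples would typically be replaced by feature vectors with a vector distance as the local cost.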
Abstract:
We address the problem of speech enhancement using a risk-estimation approach. In particular, we propose the use of Stein’s unbiased risk estimator (SURE) for solving the problem. The need for a suitable finite-sample risk estimator arises because the actual risks invariably depend on the unknown ground truth. We consider the popular mean-squared error (MSE) criterion first, and then compare it against the perceptually motivated Itakura-Saito (IS) distortion, by deriving unbiased estimators of the corresponding risks. We use a generalized SURE (GSURE) development, recently proposed by Eldar for MSE. We consider dependent observation models from the exponential family with an additive noise model, and derive an unbiased estimator for the risk corresponding to the IS distortion, which is non-quadratic. This serves to address the speech enhancement problem in a more general setting. Experimental results illustrate that the IS metric is effective in suppressing musical noise, which affects the MSE-enhanced speech. However, in terms of global signal-to-noise ratio (SNR), the minimum MSE solution gives better results.
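To make the contrast between the two criteria concrete, the sketch below computes the MSE and the Itakura-Saito distortion between two positive power spectra. It only evaluates the distortions on known inputs; it does not implement the unbiased risk estimators derived in the paper. The function names and the per-bin averaging are assumptions for illustration.

```python
import math


def mse(p, q):
    """Mean-squared error between two equal-length spectra."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)


def itakura_saito(p, q):
    """Average Itakura-Saito distortion of estimate q from reference p.

    Per bin: p/q - log(p/q) - 1, which is zero only when p == q.
    Both spectra must be strictly positive.
    """
    return sum(a / b - math.log(a / b) - 1.0 for a, b in zip(p, q)) / len(p)


if __name__ == "__main__":
    ref = [1.0, 2.0]
    est = [2.0, 4.0]
    # IS depends only on the per-bin ratio, so scaling both spectra leaves it unchanged,
    # whereas MSE grows with the scale.
    print(itakura_saito(ref, est), itakura_saito([10.0, 20.0], [20.0, 40.0]))
    print(mse(ref, est), mse([10.0, 20.0], [20.0, 40.0]))
```

The IS distortion is asymmetric in its arguments and non-quadratic, which is why a risk estimator built for MSE does not carry over directly.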
Abstract:
Speech enhancement in stationary noise is addressed using the ideal channel selection framework. In order to estimate the binary mask, we propose to classify each time-frequency (T-F) bin of the noisy signal as speech or noise using Discriminative Random Fields (DRF). The DRF function contains two terms: an enhancement function and a smoothing term. On each T-F bin, we propose to use an enhancement function based on a likelihood ratio test for speech presence, while an Ising model is used as the smoothing function for spectro-temporal continuity in the estimated binary mask. Over successive iterations, the smoothing function is found to reduce musical noise compared with using the enhancement function alone. The binary mask is inferred from the noisy signal using the Iterated Conditional Modes (ICM) algorithm. Sentences from the NOIZEUS corpus are evaluated at signal-to-noise ratios (SNR) from 0 dB to 15 dB in four additive noise settings: additive white Gaussian noise, car noise, street noise and pink noise. The reconstructed speech using the proposed technique is evaluated in terms of average segmental SNR, Perceptual Evaluation of Speech Quality (PESQ) and Mean Opinion Score (MOS).
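The interplay of a per-bin evidence term and an Ising smoothing term under ICM can be sketched as follows. This is a generic illustration, not the paper's DRF: the input `llr` is a hypothetical grid of per-bin log-likelihood ratios (positive favours speech), and `beta` is an assumed smoothing weight over the four nearest T-F neighbours.

```python
def icm_binary_mask(llr, beta=1.0, max_iters=10):
    """Estimate a binary T-F mask with ICM (illustrative sketch).

    llr  : 2-D list of per-bin evidence, positive values favour speech.
    beta : weight of the Ising smoothing term (hypothetical parameter).
    Returns a 2-D list with 1 for speech and 0 for noise.
    """
    rows, cols = len(llr), len(llr[0])
    # Initialise from the evidence term alone: +1 = speech, -1 = noise.
    labels = [[1 if llr[r][c] > 0 else -1 for c in range(cols)] for r in range(rows)]
    for _ in range(max_iters):
        changed = False
        for r in range(rows):
            for c in range(cols):
                nbr_sum = 0
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        nbr_sum += labels[rr][cc]
                # The score of label s is s * (llr + beta * nbr_sum);
                # ICM greedily picks the sign that maximises it.
                field = llr[r][c] + beta * nbr_sum
                new_label = 1 if field > 0 else -1
                if new_label != labels[r][c]:
                    labels[r][c] = new_label
                    changed = True
        if not changed:
            break  # converged to a local optimum
    return [[1 if s == 1 else 0 for s in row] for row in labels]


if __name__ == "__main__":
    # An isolated weak "noise" bin surrounded by speech bins gets smoothed away.
    grid = [[1.0, 1.0, 1.0], [1.0, -0.5, 1.0], [1.0, 1.0, 1.0]]
    print(icm_binary_mask(grid, beta=1.0))
    print(icm_binary_mask(grid, beta=0.0))
```

With `beta=0.0` the mask reduces to thresholding the evidence term, which leaves isolated bins in place; these isolated decisions are the kind of artefact associated with musical noise.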
Abstract:
Inventory Management (IM) plays a decisive role in enhancing the efficiency and competitiveness of manufacturing enterprises. Therefore, major manufacturing enterprises are following IM practices as a strategy to improve efficiency and achieve competitiveness. However, the spread of IM culture among Small and Medium Enterprises (SMEs) is limited by lack of initiative and expertise and by financial constraints in developed countries, let alone developing countries. Against this backdrop, this paper attempts to ascertain the role and importance of IM practices and the performance of SMEs in the machine tools industry of Bangalore, India. The relationship between inventory management practices and inventory cost is probed based on primary data gathered from 91 SMEs. The paper shows that formal IM practices have a positive impact on the inventory performance of SMEs.
Abstract:
Plasma sprayable powders were prepared from the ZrO2-CaO-CeO2 system using an organic binder and coated onto stainless steel substrates previously coated with a bond coat (Ni 22Cr 20Al 1.0Y) using plasma spraying. The coatings exhibited good thermal barrier characteristics and excellent resistance to thermal shock at 1000 degrees C under simulated laboratory conditions (90 half-hour cycles without failure) and at 1200 degrees C under accelerated burner rig test conditions (500 two-minute cycles without failure). No destabilization of the cubic/tetragonal ZrO2 phase fraction occurred either during the long hours (45 h cumulative) or during the large number of thermal shock tests. Growth of a distinct SiO2-rich region within the ceramic was observed in the specimens thermal shock cycled at 1000 degrees C, apart from mild oxidation of the bond coat. The specimens tested at 1200 degrees C had a glassy appearance on the top surface and exhibited severe oxidation of the bond coat at the ceramic-bond coat interface. The glassy appearance of the surface is due to the formation of a liquid silicate layer attributable to the impurity phase present in commercial grade ZrO2 powder. These observations are supported by SEM analysis and quantitative EDAX data.
Abstract:
Three new (dialkylamino)pyridine (DAAP)-based ligand amphiphiles 3-5 have been synthesized. All of the compounds possess a metal ion binding subunit in the form of a 2,6-disubstituted DAAP moiety. In addition, at least one ortho-CH2OH substituent is present in all the ligands. Complex formation by these ligands with various metal ions was examined under micellar conditions, but only complexes with Cu(II) ions showed kinetically potent esterolytic capacities. Complexes with Cu(II) were prepared in host comicellar cetyltrimethylammonium bromide (CTABr) media at pH 7.6. Individual complexes were characterized by UV-visible absorption spectroscopy and electron paramagnetic resonance spectroscopy. These metallomicelles speed the cleavage of the substrates p-nitrophenyl hexanoate or p-nitrophenyl diphenyl phosphate. To ascertain the nature of the active esterolytic species, the stoichiometries of the respective Cu(II) complexes were determined from the kinetic version of Job's plot. In all instances, the 2:1 ligand/Cu(II) ion complexes are the most kinetically competent species. The apparent pK(a) values of the Cu(II)-coordinated hydroxyl groups of the ligands 3, 4, and 5, in the comicellar aggregate, are 7.8, 8.0, and 8.0, respectively, as estimated from the rate constant vs pH profiles of the ester cleavage reactions. For the nucleophilic metallomicellar reagents, the second-order "catalytic" rate constants toward esterolysis of the substrate p-nitrophenyl hexanoate (at 25 degrees C, pH 7.6) are 37.5 for 3, 11.4 for 4, and 13.8 for 5. All catalytic systems comprising the coaggregates of 3, 4, or 5 and CTABr demonstrate turnover behavior in the presence of excess substrate.