2 results for voice
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
This paper proposes an improved voice activity detection (VAD) algorithm using wavelets and a support vector machine (SVM) for the European Telecommunications Standards Institute (ETSI) adaptive multi-rate (AMR) narrow-band (NB) and wide-band (WB) speech codecs. First, based on the wavelet transform, the original IIR filter bank and pitch/tone detector are implemented, respectively, via a wavelet filter bank and a wavelet-based pitch/tone detection algorithm. The wavelet filter bank divides the input speech signal into several frequency bands so that the signal power level in each sub-band can be calculated. In addition, the background noise level is estimated in each sub-band using the wavelet de-noising method. The wavelet filter bank is also used to detect correlated complex signals such as music. The proposed algorithm then applies an SVM to train an optimized non-linear VAD decision rule involving the sub-band power, noise level, pitch period, tone flag, and complex-signal warning flag of the input speech signal. With the trained SVM, the proposed VAD algorithm produces more accurate detection results. Experiments on the Aurora speech database under different noise conditions show that the proposed algorithm outperforms the AMR-NB VAD Options 1 and 2 and the AMR-WB VAD. (C) 2009 Elsevier Ltd. All rights reserved.
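The sub-band power step described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the wavelet, so the Haar wavelet is assumed here for simplicity, and the function names `haar_step` and `subband_powers` and the `levels` parameter are hypothetical. The noise estimation, pitch/tone detection, and SVM stages are omitted.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def subband_powers(frame, levels=3):
    """Mean power per sub-band, ordered from the highest-frequency detail band
    down to the final low-pass (approximation) band.

    Assumption: the frame length is a non-zero multiple of 2**levels.
    """
    if len(frame) == 0 or len(frame) % (2 ** levels) != 0:
        raise ValueError("frame length must be a non-zero multiple of 2**levels")
    powers = []
    current = list(frame)
    for _ in range(levels):
        current, detail = haar_step(current)
        powers.append(sum(d * d for d in detail) / len(detail))
    powers.append(sum(a * a for a in current) / len(current))
    return powers


if __name__ == "__main__":
    # A rapidly alternating frame puts all of its power in the highest band.
    print(subband_powers([1.0, -1.0] * 4))
```

In a full detector, a vector of such sub-band powers, together with the other features the abstract lists, would be the input to the trained classifier.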
Abstract:
Noninvasive diagnostic methods have become more common as patients demand fast, simple, and painless exams. These methods have become possible because advances in technology provide the necessary means of collecting and processing signals. New methods of analysis, such as nonlinear dynamics, have been developed to capture the complexity of voice signals by exploring their dynamic nature. The purpose of this paper is to characterize healthy and pathological voice signals with the aid of relative entropy measures. The phase space reconstruction technique is also used to select regions of interest in the signals. Three groups of samples were used: one from healthy individuals and two from people with vocal fold nodules and Reinke's edema, respectively. All of them are recordings of the sustained vowel /a/ in Brazilian Portuguese. The paper shows that nonlinear dynamical methods seem to be a suitable technique for voice signal analysis, due to the chaotic component of the human voice. Relative entropy is well suited because of its sensitivity to uncertainty, since the pathologies are characterized by an increase in signal complexity and unpredictability. The results showed that the pathological groups had higher entropy values, in accordance with the other vocal acoustic parameters presented. This suggests that these techniques may improve and complement the voice analysis methods currently available to clinicians. (C) 2008 Elsevier Inc. All rights reserved.
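The two techniques named in this abstract, phase space reconstruction and relative entropy, can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: the abstract does not say how the distributions are estimated, so a smoothed amplitude histogram is assumed, and the names `delay_embed`, `histogram`, and `relative_entropy` as well as the parameters `dim`, `tau`, `bins`, and `eps` are hypothetical.

```python
import math


def delay_embed(signal, dim=2, tau=1):
    """Time-delay embedding: point t is (x[t], x[t+tau], ..., x[t+(dim-1)*tau])."""
    n = len(signal) - (dim - 1) * tau
    return [tuple(signal[t + k * tau] for k in range(dim)) for t in range(n)]


def histogram(values, bins, lo, hi, eps=1e-9):
    """Normalized histogram over [lo, hi]; eps keeps every bin strictly positive
    (an assumed smoothing choice so the logarithm below is always defined)."""
    width = (hi - lo) / bins
    counts = [eps] * bins
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, bins - 1))  # clamp out-of-range samples to edge bins
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    # Synthetic stand-ins for two recordings (not real voice data).
    a = [math.sin(0.1 * t) for t in range(500)]
    b = [math.sin(0.1 * t) ** 3 for t in range(500)]
    pa = histogram(a, bins=16, lo=-1.0, hi=1.0)
    pb = histogram(b, bins=16, lo=-1.0, hi=1.0)
    print(relative_entropy(pa, pb))
    print(delay_embed(a, dim=3, tau=5)[:2])
```

Relative entropy is zero when the two distributions match and grows as they diverge, which is the property the abstract relies on when comparing healthy and pathological groups.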