387 resultados para speech signals
em Cambridge University Engineering Department Publications Database
Resumo:
In this paper, a Decimative Spectral estimation method based on Eigenanalysis and SVD (Singular Value Decomposition) is presented and applied to speech signals in order to estimate Formant/Bandwidth values. The underlying model decomposes a signal into complex damped sinusoids. The algorithm is applied not only on speech samples but on a small amount of the autocorrelation coefficients of a speech frame as well, for finer estimation. Correct estimation of Formant/Bandwidth values depend on the model order thus, the requested number of poles. Overall, experimentation results indicate that the proposed methodology successfully estimates formant trajectories and their respective bandwidths.
Resumo:
The objective of this paper is to propose a signal processing scheme that employs subspace-based spectral analysis for the purpose of formant estimation of speech signals. Specifically, the scheme is based on decimative spectral estimation that uses Eigenanalysis and SVD (Singular Value Decomposition). The underlying model assumes a decomposition of the processed signal into complex damped sinusoids. In the case of formant tracking, the algorithm is applied on a small amount of the autocorrelation coefficients of a speech frame. The proposed scheme is evaluated on both artificial and real speech utterances from the TIMIT database. For the first case, comparative results to standard methods are provided which indicate that the proposed methodology successfully estimates formant trajectories.
Resumo:
This work addresses the problem of deriving F0 from distanttalking speech signals acquired by a microphone network. The method here proposed exploits the redundancy across the channels by jointly processing the different signals. To this purpose, a multi-microphone periodicity function is derived from the magnitude spectrum of all the channels. This function allows to estimate F0 reliably, even under reverberant conditions, without the need of any post-processing or smoothing technique. Experiments, conducted on real data, showed that the proposed frequency-domain algorithm is more suitable than other time-domain based ones.
Resumo:
Accurate estimation of the instantaneous frequency of speech resonances is a hard problem mainly due to phase discontinuities in the speech signal associated with excitation instants. We review a variety of approaches for enhanced frequency and bandwidth estimation in the time-domain and propose a new cognitively motivated approach using filterbank arrays. We show that by filtering speech resonances using filters of different center frequency, bandwidth and shape, the ambiguity in instantaneous frequency estimation associated with amplitude envelope minima and phase discontinuities can be significantly reduced. The novel estimators are shown to perform well on synthetic speech signals with frequency and bandwidth micro-modulations (i.e., modulations within a pitch period), as well as on real speech signals. Filterbank arrays, when applied to frequency and bandwidth modulation index estimation, are shown to reduce the estimation error variance by 85% and 70% respectively. © 2013 IEEE.
Resumo:
Statistical model-based methods are presented for the reconstruction of autocorrelated signals in impulsive plus continuous noise environments. Signals are modelled as autoregressive and noise sources as discrete and continuous mixtures of Gaussians, allowing for robustness in highly impulsive and non-Gaussian environments. Markov Chain Monte Carlo methods are used for reconstruction of the corrupted waveforms within a Bayesian probabilistic framework and results are presented for contaminated voice and audio signals.
Resumo:
In this paper we derive the a posteriori probability for the location of bursts of noise additively superimposed on a Gaussian AR process. The theory is developed to give a sequentially based restoration algorithm suitable for real-time applications. The algorithm is particularly appropriate for digital audio restoration, where clicks and scratches may be modelled as additive bursts of noise. Experiments are carried out on both real audio data and synthetic AR processes and Significant improvements are demonstrated over existing restoration techniques. © 1995 IEEE