3 resultados para Speech Processing
em Instituto Politécnico do Porto, Portugal
Resumo:
In this paper, a linguistically rule-based grapheme-to-phone (G2P) transcription algorithm is described for European Portuguese. A complete set of phonological and phonetic transcription rules regarding the European Portuguese standard variety is presented. This algorithm was implemented and tested by using online newspaper articles. The obtained experimental results gave rise to 98.80% of accuracy rate. Future developments in order to increase this value are foreseen. Our purpose with this work is to develop a module/ tool that can improve synthetic speech naturalness in European Portuguese. Other applications of this system can be expected like language teaching/learning. These results, together with our perspectives of future improvements, have proved the dramatic importance of linguistic knowledge on the development of Text-to-Speech systems (TTS).
Resumo:
In this work an adaptive modeling and spectral estimation scheme based on a dual Discrete Kalman Filtering (DKF) is proposed for speech enhancement. Both speech and noise signals are modeled by an autoregressive structure which provides an underlying time frame dependency and improves time-frequency resolution. The model parameters are arranged to obtain a combined state-space model and are also used to calculate instantaneous power spectral density estimates. The speech enhancement is performed by a dual discrete Kalman filter that simultaneously gives estimates for the models and the signals. This approach is particularly useful as a pre-processing module for parametric based speech recognition systems that rely on spectral time dependent models. The system performance has been evaluated by a set of human listeners and by spectral distances. In both cases the use of this pre-processing module has led to improved results.
Resumo:
The recent developments on Hidden Markov Models (HMM) based speech synthesis showed that this is a promising technology fully capable of competing with other established techniques. However some issues still lack a solution. Several authors report an over-smoothing phenomenon on both time and frequencies which decreases naturalness and sometimes intelligibility. In this work we present a new vowel intelligibility enhancement algorithm that uses a discrete Kalman filter (DKF) for tracking frame based parameters. The inter-frame correlations are modelled by an autoregressive structure which provides an underlying time frame dependency and can improve time-frequency resolution. The system’s performance has been evaluated using objective and subjective tests and the proposed methodology has led to improved results.