Digital Library

74 results for Speech Enhancement

Speech enhancement in noisy environments for video retrieval

Relevance: 100.00%

A Corpus-Based Approach to Speech Enhancement from Nonstationary Noise

Relevance: 100.00%

Abstract:

Temporal dynamics and speaker characteristics are two important features of speech that distinguish speech from noise. In this paper, we propose a method to maximally extract these two features of speech for speech enhancement. We demonstrate that this can reduce the requirement for prior information about the noise, which can be difficult to estimate for fast-varying noise. Given noisy speech, the new approach estimates clean speech by recognizing long segments of the clean speech as whole units. In the recognition, clean speech sentences, taken from a speech corpus, are used as examples. Matching segments are identified between the noisy sentence and the corpus sentences. The estimate is formed by using the longest matching segments found in the corpus sentences. Longer speech segments as whole units contain more distinct dynamics and richer speaker characteristics, and can be identified more accurately from noise than shorter speech segments. Therefore, estimation based on the longest recognized segments increases the noise immunity and hence the estimation accuracy. The new approach consists of a statistical model to represent up to sentence-long temporal dynamics in the corpus speech, and an algorithm to identify the longest matching segments between the noisy sentence and the corpus sentences. The algorithm is made more robust to noise uncertainty by introducing missing-feature based noise compensation into the corpus sentences. Experiments have been conducted on the TIMIT database for speech enhancement from various types of nonstationary noise including song, music, and crosstalk speech. The new approach has shown improved performance over conventional enhancement algorithms in both objective and subjective evaluations.
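The longest-matching-segment idea in this abstract can be illustrated with a toy sketch. It is not the authors' method: the paper uses a statistical model and missing-feature noise compensation, whereas this sketch assumes each frame is a single number and treats two frames as matching when they differ by at most a tolerance `tol`. The function names and the greedy left-to-right scan are assumptions made only for illustration.

```python
from typing import List, Tuple


def longest_matching_segment(
    noisy: List[float], corpus: List[List[float]], start: int, tol: float
) -> Tuple[int, int, int]:
    """Find the longest run of consecutive corpus frames matching noisy[start:].

    Returns (length, sentence_index, corpus_start); length is 0 if nothing matches.
    """
    best = (0, -1, -1)
    for s_idx, sentence in enumerate(corpus):
        for c_start in range(len(sentence)):
            length = 0
            # Extend the match frame by frame while both sequences agree within tol.
            while (
                start + length < len(noisy)
                and c_start + length < len(sentence)
                and abs(noisy[start + length] - sentence[c_start + length]) <= tol
            ):
                length += 1
            if length > best[0]:
                best = (length, s_idx, c_start)
    return best


def enhance(noisy: List[float], corpus: List[List[float]], tol: float) -> List[float]:
    """Rebuild the noisy sequence from the longest matching clean corpus segments."""
    estimate: List[float] = []
    t = 0
    while t < len(noisy):
        length, s_idx, c_start = longest_matching_segment(noisy, corpus, t, tol)
        if length == 0:
            # No corpus frame is close enough: keep the noisy frame unchanged.
            estimate.append(noisy[t])
            t += 1
        else:
            # Copy the clean corpus frames of the longest match into the estimate.
            estimate.extend(corpus[s_idx][c_start:c_start + length])
            t += length
    return estimate
```

Preferring the longest run over frame-by-frame matching reflects the abstract's argument that longer segments carry more distinctive dynamics and are therefore harder for noise to imitate.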

Combining missing-feature theory, speech enhancement, and speaker-dependent/-independent modeling for speech separation

Relevance: 100.00%

Abstract:

This paper considers the separation and recognition of overlapped speech sentences assuming single-channel observation. A system based on a combination of several different techniques is proposed. The system uses a missing-feature approach for improving crosstalk/noise robustness, a Wiener filter for speech enhancement, hidden Markov models for speech reconstruction, and speaker-dependent/-independent modeling for speaker and speech recognition. We develop the system on the Speech Separation Challenge database, which involves the task of separating and recognizing two mixed sentences without assuming advance knowledge of either the speakers' identities or the signal-to-noise ratio. The paper is an extended version of a previous conference paper submitted for the challenge.

An iterative longest matching segment approach to speech enhancement with additive noise and channel distortion

Relevance: 100.00%

Abstract:

This paper presents a new approach to speech enhancement from single-channel measurements involving both noise and channel distortion (i.e., convolutional noise), and demonstrates its applications for robust speech recognition and for improving noisy speech quality. The approach is based on finding longest matching segments (LMS) from a corpus of clean, wideband speech. The approach adds three novel developments to our previous LMS research. First, we address the problem of channel distortion as well as additive noise. Second, we present an improved method for modeling noise for speech estimation. Third, we present an iterative algorithm which updates the noise and channel estimates of the corpus data model. In experiments using speech recognition as a test with the Aurora 4 database, the use of our enhancement approach as a preprocessor for feature extraction significantly improved the performance of a baseline recognition system. In another comparison against conventional enhancement algorithms, both the PESQ and the segmental SNR ratings of the LMS algorithm were superior to the other methods for noisy speech enhancement.
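For readers unfamiliar with the segmental SNR metric mentioned in this abstract, the following is a minimal sketch of one common formulation: the per-frame SNR in dB, averaged over non-overlapping frames. The frame length, the clamping range of -10 dB to 35 dB, and the skipping of silent frames are conventional choices assumed here; the abstract does not state which settings the authors used.

```python
import math
from typing import Sequence


def segmental_snr(
    clean: Sequence[float],
    enhanced: Sequence[float],
    frame_len: int,
    lo: float = -10.0,  # assumed lower clamp in dB
    hi: float = 35.0,   # assumed upper clamp in dB
) -> float:
    """Average per-frame SNR (dB) of `enhanced` measured against `clean`."""
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    # Walk over complete, non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in c)
        error_energy = sum((x - y) ** 2 for x, y in zip(c, e))
        if signal_energy == 0.0:
            continue  # skip silent frames, where the ratio is undefined
        if error_energy == 0.0:
            snr_db = hi  # perfect reconstruction: use the upper clamp
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        frame_snrs.append(min(hi, max(lo, snr_db)))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)
```

Averaging in the dB domain frame by frame, rather than computing a single global ratio, stops a few loud frames from dominating the score.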

Speech Enhancement from Additive Noise and Channel Distortion - a Corpus-Based Approach

Relevance: 100.00%

Abstract:

This paper presents a new approach to single-channel speech enhancement involving both noise and channel distortion (i.e., convolutional noise). The approach is based on finding longest matching segments (LMS) from a corpus of clean, wideband speech. The approach adds three novel developments to our previous LMS research. First, we address the problem of channel distortion as well as additive noise. Second, we present an improved method for modeling noise. Third, we present an iterative algorithm for improved speech estimates. In experiments using speech recognition as a test with the Aurora 4 database, the use of our enhancement approach as a preprocessor for feature extraction significantly improved the performance of a baseline recognition system. In another comparison against conventional enhancement algorithms, both the PESQ and the segmental SNR ratings of the LMS algorithm were superior to the other methods for noisy speech enhancement.

Index Terms: corpus-based speech model, longest matching segment, speech enhancement, speech recognition

Wide Matching - An Approach to Improving Noise Robustness for Speech Enhancement

Relevance: 100.00%

Abstract:

It is shown that under certain conditions it is possible to obtain a good speech estimate from noise without requiring noise estimation. We study an implementation of the theory, namely wide matching, for speech enhancement. The new approach performs sentence-wide joint speech segment estimation subject to maximum recognizability to gain noise robustness. Experiments have been conducted to evaluate the new approach with variable noises and SNRs from -5 dB to noise free. It is shown that the new approach, without any estimation of the noise, significantly outperformed conventional methods in the low SNR conditions while retaining comparable performance in the high SNR conditions. It is further suggested that the wide matching and deep learning approaches can be combined towards a highly robust and accurate speech estimator.

Irish English Speech Acquisition

Relevance: 20.00%

Transcribing speech: The segmental and prosodic layers

Relevance: 20.00%

Can patients with chronic schizophrenia express emotion? A speech analysis

Relevance: 20.00%

Cooperative enhancement of insulinotropic action of GLP-1 by acetylcholine uncovers paradoxical inhibitory effect of beta cell muscarinic receptor activation on adenylate cyclase activity.

Relevance: 20.00%

Direct detection of ion pairs by fluorescence enhancement

Relevance: 20.00%

Passive scalar mixing enhancement in a boundary layer using a perpendicular synthetic jet

Relevance: 20.00%

Behavioural and neurochemical mechanisms of mild stress in the enhancement of feeding

Relevance: 20.00%

An operant determination of the behavioural mechanism of benzodiazepine enhancement of food intake

Relevance: 20.00%

Abstract:

Rationale: A recent review paper by Cooper (Appetite 44:133–150, 2005) has pointed out that a role for benzodiazepines as appetite stimulants has been largely overlooked. Cooper’s review cited several studies that suggested the putative mechanism of enhancement of food intake after benzodiazepine administration might involve increasing the perceived pleasantness of food (palatability).

Objectives: The present study examined the behavioral mechanism of increased food intake after benzodiazepine administration.

Materials and methods: The cyclic-ratio operant schedule has been proposed as a useful behavioral assay for differentiating palatability from regulatory effects on food intake (Ettinger and Staddon, Physiol Behav 29:455–458, 1982 and Behav Neurosci 97:639–653, 1983). The current study employed the cyclic-ratio schedule to determine whether the effects on food intake of chlordiazepoxide (CDP) (5.0 mg/kg), sodium pentobarbital (5.0 mg/kg), and picrotoxin (1.0 mg/kg) were mediated through palatability or regulatory processes.

Results: The results of this study show that both the benzodiazepine CDP and the barbiturate sodium pentobarbital increased food intake in a manner similar to increasing the palatability of the ingestant, and picrotoxin decreased food intake in a manner similar to decreasing the palatability of the ingestant.

Conclusions: These results suggest that the food intake enhancement properties of benzodiazepines are mediated through a mechanism affecting perceived palatability.

Speech in the Process of Becoming Bored

Relevance: 20.00%
