4 resultados para Speaker
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Resumo:
This paper presents a study on wavelets and their characteristics for the specific purpose of serving as a feature extraction tool for speaker verification (SV), considering a Radial Basis Function (RBF) classifier, which is a particular type of Artificial Neural Network (ANN). Examining characteristics such as support-size, frequency and phase responses, amongst others, we show how Discrete Wavelet Transforms (DWTs), particularly the ones which derive from Finite Impulse Response (FIR) filters, can be used to extract important features from a speech signal which are useful for SV. Lastly, an SV algorithm based on the concepts presented is described.
Resumo:
Sound source localization (SSL) is an essential task in many applications involving speech capture and enhancement. As such, speaker localization with microphone arrays has received significant research attention. Nevertheless, existing SSL algorithms for small arrays still have two significant limitations: lack of range resolution, and accuracy degradation with increasing reverberation. The latter is natural and expected, given that strong reflections can have amplitudes similar to that of the direct signal, but different directions of arrival. Therefore, correctly modeling the room and compensating for the reflections should reduce the degradation due to reverberation. In this paper, we show a stronger result. If modeled correctly, early reflections can be used to provide more information about the source location than would have been available in an anechoic scenario. The modeling not only compensates for the reverberation, but also significantly increases resolution for range and elevation. Thus, we show that under certain conditions and limitations, reverberation can be used to improve SSL performance. Prior attempts to compensate for reverberation tried to model the room impulse response (RIR). However, RIRs change quickly with speaker position, and are nearly impossible to track accurately. Instead, we build a 3-D model of the room, which we use to predict early reflections, which are then incorporated into the SSL estimation. Simulation results with real and synthetic data show that even a simplistic room model is sufficient to produce significant improvements in range and elevation estimation, tasks which would be very difficult when relying only on direct path signal components.
Resumo:
This four-experiment series sought to evaluate the potential of children with neurosensory deafness and cochlear implants to exhibit auditory-visual and visual-visual stimulus equivalence relations within a matching-to-sample format. Twelve children who became deaf prior to acquiring language (prelingual) and four who became deaf afterwards (postlingual) were studied. All children learned auditory-visual conditional discriminations and nearly all showed emergent equivalence relations. Naming tests, conducted with a subset of the: children, showed no consistent relationship to the equivalence-test outcomes.. This study makes several contributions: to the literature on stimulus equivalence. First; it demonstrates that both pre- and postlingually deaf children-can: acquire auditory-visual equivalence-relations after cochlear implantation, thus demonstrating symbolic functioning. Second, it directs attention to a population that may be especially interesting for researchers seeking to analyze the relationship. between speaker and listener repertoires. Third, it demonstrates the feasibility of conducting experimental studies of stimulus control processes within the limitations of a hospital, which these children must visit routinely for the maintenance of their cochlear implants.
Resumo:
In this paper we present a new wavelet-based algorithm for low-cost computation of the cepstrum. It can be used for real time precise pitch determination in automatic speech and speaker recognition systems. Many wavelet families are examined to determine the one that works best. The results confirm the efficacy and accuracy of the proposed technique for pitch extraction. (C) 2008 Elsevier B.V. All rights reserved.