997 resultados para Speaker recognition


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The QUT-NOISE-SRE protocol is designed to mix the large QUT-NOISE database, consisting of over 10 hours of back- ground noise, collected across 10 unique locations covering 5 common noise scenarios, with commonly used speaker recognition datasets such as Switchboard, Mixer and the speaker recognition evaluation (SRE) datasets provided by NIST. By allowing common, clean, speech corpora to be mixed with a wide variety of noise conditions, environmental reverberant responses, and signal-to-noise ratios, this protocol provides a solid basis for the development, evaluation and benchmarking of robust speaker recognition algorithms, and is freely available to download alongside the QUT-NOISE database. In this work, we use the QUT-NOISE-SRE protocol to evaluate a state-of-the-art PLDA i-vector speaker recognition system, demonstrating the importance of designing voice-activity-detection front-ends specifically for speaker recognition, rather than aiming for perfect coherence with the true speech/non-speech boundaries.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents an automatic speaker recognition system for intelligence applications. The system has to provide functionalities for a speaker skimming application in which databases of recorded conversations belonging to an ongoing investigation can be annotated and quickly browsed by an operator. The paper discusses the criticalities introduced by the characteristics of the audio signals under consideration - in particular background noise and channel/coding distortions - as well as the requirements and functionalities of the system under development. It is shown that the performance of state-of-the-art approaches degrades significantly in presence of moderately high background noise. Finally, a novel speaker recognizer based on phonetic features and an ensemble classifier is presented. Results show that the proposed approach improves performance on clean audio, and suggest that it can be employed towards improved real-world robustness. © EURASIP, 2009.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We studied the application of Biomimetic Pattern Recognition to speaker recognition. A speaker recognition neural network using network matching degree as criterion is proposed. It has been used in the system of text-dependent speaker recognition. Experimental results show that good effect could be obtained even with lesser samples. Furthermore, the misrecognition caused by untrained speakers occurring in testing could be controlled effectively. In addition, the basic idea "cognition" of Biomimetic Pattern Recognition results in no requirement of retraining the old system for enrolling new speakers.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We studied the application of Biomimetic Pattern Recognition to speaker recognition. A speaker recognition neural network using network matching degree as criterion is proposed. It has been used in the system of text-dependent speaker recognition. Experimental results show that good effect could be obtained even with lesser samples. Furthermore, the misrecognition caused by untrained speakers occurring in testing could be controlled effectively. In addition, the basic idea "cognition" of Biomimetic Pattern Recognition results in no requirement of retraining the old system for enrolling new speakers.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper investigates the problem of speaker identi-fication and verification in noisy conditions, assuming that speechsignals are corrupted by environmental noise, but knowledgeabout the noise characteristics is not available. This research ismotivated in part by the potential application of speaker recog-nition technologies on handheld devices or the Internet. Whilethe technologies promise an additional biometric layer of securityto protect the user, the practical implementation of such systemsfaces many challenges. One of these is environmental noise. Due tothe mobile nature of such systems, the noise sources can be highlytime-varying and potentially unknown. This raises the require-ment for noise robustness in the absence of information about thenoise. This paper describes a method that combines multicondi-tion model training and missing-feature theory to model noisewith unknown temporal-spectral characteristics. Multiconditiontraining is conducted using simulated noisy data with limitednoise variation, providing a “coarse” compensation for the noise,and missing-feature theory is applied to refine the compensationby ignoring noise variation outside the given training conditions,thereby reducing the training and testing mismatch. This paperis focused on several issues relating to the implementation of thenew model for real-world applications. These include the gener-ation of multicondition training data to model noisy speech, thecombination of different training data to optimize the recognitionperformance, and the reduction of the model’s complexity. Thenew algorithm was tested using two databases with simulated andrealistic noisy speech data. The first database is a redevelopmentof the TIMIT database by rerecording the data in the presence ofvarious noise types, used to test the model for speaker identifica-tion with a focus on the varieties of noise. The second database isa handheld-device database collected in realistic noisy conditions,used to further validate the model for real-world speaker verifica-tion. The new model is compared to baseline systems and is foundto achieve lower error rates.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we present a novel method for performing speaker recognition with very limited training data and in the presence of background noise. Similarity-based speaker recognition is considered so that speaker models can be created with limited training speech data. The proposed similarity is a form of cosine similarity used as a distance measure between speech feature vectors. Each speech frame is modelled using subband features, and into this framework, multicondition training and optimal feature selection are introduced, making the system capable of performing speaker recognition in the presence of realistic, time-varying noise, which is unknown during training. Speaker identi?cation experiments were carried out using the SPIDRE database. The performance of the proposed new system for noise compensation is compared to that of an oracle model; the speaker identi?cation accuracy for clean speech by the new system trained with limited training data is compared to that of a GMM trained with several minutes of speech. Both comparisons have demonstrated the effectiveness of the new model. Finally, experiments were carried out to test the new model for speaker identi?cation given limited training data and with differing levels and types of realistic background noise. The results have demonstrated the robustness of the new system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Speaker recognition is the process of automatically recognizing the speaker by analyzing individual information contained in the speech waves. In this paper, we discuss the development of an intelligent system for text-dependent speaker recognition. The system comprises two main modules, a wavelet-based signal-processing module for feature extraction of speech waves, and an artificial-neural-network-based classifier module to identify and categorize the speakers. Wavelet is used in de-noising and in compressing the speech signals. The wavelet family that we used is the Daubechies Wavelets. After extracting the necessary features from the speech waves, the features were then fed to a neural-network-based classifier to identify the speakers. We have implemented the Fuzzy ARTMAP (FAM) network in the classifier module to categorize the de-noised and compressed signals. The proposed intelligent learning system has been applied to a case study of text-dependent speaker recognition problem.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An intelligent system for text-dependent speaker recognition is proposed in this paper. The system consists of a wavelet-based module as the feature extractor of speech signals and a neural-network-based module as the signal classifier. The Daubechies wavelet is employed to filter and compress the speech signals. The fuzzy ARTMAP (FAM) neural network is used to classify the processed signals. A series of experiments on text-dependent gender and speaker recognition are conducted to assess the effectiveness of the proposed system using a collection of vowel signals from 100 speakers. A variety of operating strategies for improving the FAM performance are examined and compared. The experimental results are analyzed and discussed.