67 resultados para noisy speaker verification

em QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast


Relevância:

80.00% 80.00%

Publicador:

Resumo:

For the first time in this paper the authors present results showing the effect of out of plane speaker head pose variation on a lip biometric based speaker verification system. Using appearance DCT based features, they adopt a Mutual Information analysis technique to highlight the class discriminant DCT components most robust to changes in out of plane pose. Experiments are conducted using the initial phase of a new multi view Audio-Visual database designed for research and development of pose-invariant speech and speaker recognition. They show that verification performance can be improved by substituting higher order horizontal DCT components for vertical, particularly in the case of a train/test pose angle mismatch.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

This paper investigates the problem of speaker identi-fication and verification in noisy conditions, assuming that speechsignals are corrupted by environmental noise, but knowledgeabout the noise characteristics is not available. This research ismotivated in part by the potential application of speaker recog-nition technologies on handheld devices or the Internet. Whilethe technologies promise an additional biometric layer of securityto protect the user, the practical implementation of such systemsfaces many challenges. One of these is environmental noise. Due tothe mobile nature of such systems, the noise sources can be highlytime-varying and potentially unknown. This raises the require-ment for noise robustness in the absence of information about thenoise. This paper describes a method that combines multicondi-tion model training and missing-feature theory to model noisewith unknown temporal-spectral characteristics. Multiconditiontraining is conducted using simulated noisy data with limitednoise variation, providing a “coarse” compensation for the noise,and missing-feature theory is applied to refine the compensationby ignoring noise variation outside the given training conditions,thereby reducing the training and testing mismatch. This paperis focused on several issues relating to the implementation of thenew model for real-world applications. These include the gener-ation of multicondition training data to model noisy speech, thecombination of different training data to optimize the recognitionperformance, and the reduction of the model’s complexity. Thenew algorithm was tested using two databases with simulated andrealistic noisy speech data. The first database is a redevelopmentof the TIMIT database by rerecording the data in the presence ofvarious noise types, used to test the model for speaker identifica-tion with a focus on the varieties of noise. The second database isa handheld-device database collected in realistic noisy conditions,used to further validate the model for real-world speaker verifica-tion. The new model is compared to baseline systems and is foundto achieve lower error rates.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper we present a novel method for performing speaker recognition with very limited training data and in the presence of background noise. Similarity-based speaker recognition is considered so that speaker models can be created with limited training speech data. The proposed similarity is a form of cosine similarity used as a distance measure between speech feature vectors. Each speech frame is modelled using subband features, and into this framework, multicondition training and optimal feature selection are introduced, making the system capable of performing speaker recognition in the presence of realistic, time-varying noise, which is unknown during training. Speaker identi?cation experiments were carried out using the SPIDRE database. The performance of the proposed new system for noise compensation is compared to that of an oracle model; the speaker identi?cation accuracy for clean speech by the new system trained with limited training data is compared to that of a GMM trained with several minutes of speech. Both comparisons have demonstrated the effectiveness of the new model. Finally, experiments were carried out to test the new model for speaker identi?cation given limited training data and with differing levels and types of realistic background noise. The results have demonstrated the robustness of the new system.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments, where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements and can be used alongside many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances with corruption added in either/both the video and audio streams using a variety of types (e.g., MPEG-4 video compression) and levels of noise. The experiments show that this approach gives excellent performance in comparison to another well-known dynamic stream weighting approach and also compared to any fixed-weighted integration approach in both clean conditions or when noise is added to either stream. Furthermore, our experiments show that the MWSP approach dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams and also according to the naturally fluctuating relative reliability of the modalities even in clean conditions. The MWSP approach is shown to maintain robust recognition performance in all tested conditions, while requiring no prior knowledge about the type or level of noise.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We provide an analysis of basic quantum-information processing protocols under the effect of intrinsic nonidealities in cluster states. These nonidealities are based on the introduction of randomness in the entangling steps that create the cluster state and are motivated by the unavoidable imperfections faced in creating entanglement using condensed-matter systems. Aided by the use of an alternative and very efficient method to construct cluster-state configurations, which relies on the concatenation of fundamental cluster structures, we address quantum-state transfer and various fundamental gate simulations through noisy cluster states. We find that a winning strategy to limit the effects of noise is the management of small clusters processed via just a few measurements. Our study also reinforces recent ideas related to the optical implementation of a one-way quantum computer.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In a typical shoeprint classification and retrieval system, the first step is to segment meaningful basic shapes and patterns in a noisy shoeprint image. This step has significant influence on shape descriptors and shoeprint indexing in the later stages. In this paper, we extend a recently developed denoising technique proposed by Buades, called non-local mean filtering, to give a more general model. In this model, the expected result of an operation on a pixel can be estimated by performing the same operation on all of its reference pixels in the same image. A working pixel’s reference pixels are those pixels whose neighbourhoods are similar to the working pixel’s neighbourhood. Similarity is based on the correlation between the local neighbourhoods of the working pixel and the reference pixel. We incorporate a special instance of this general case into thresholding a very noisy shoeprint image. Visual and quantitative comparisons with two benchmarking techniques, by Otsu and Kittler, are conducted in the last section, giving evidence of the effectiveness of our method for thresholding noisy shoeprint images.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

PURPOSE: MicroRNAs (miRNAs) play a global role in regulating gene expression and have important tissue-specific functions. Little is known about their role in the retina. The purpose of this study was to establish the retinal expression of those miRNAs predicted to target genes involved in vision. METHODS: miRNAs potentially targeting important "retinal" genes, as defined by expression pattern and implication in disease, were predicted using a published algorithm (TargetScan; Envisioneering Medical Technologies, St. Louis, MO). The presence of candidate miRNAs in human and rat retinal RNA was assessed by RT-PCR. cDNA levels for each miRNA were determined by quantitative PCR. The ability to discriminate between miRNAs varying by a single nucleotide was assessed. The activity of miR-124 and miR-29 against predicted target sites in Rdh10 and Impdh1 was tested by cotransfection of miRNA mimics and luciferase reporter plasmids. RESULTS: Sixty-seven miRNAs were predicted to target one or more of the 320 retinal genes listed herein. All 11 candidate miRNAs tested were expressed in the retina, including miR-7, miR-124, miR135a, and miR135b. Relative levels of individual miRNAs were similar between rats and humans. The Rdh10 3'UTR, which contains a predicted miR-124 target site, mediated the inhibition of luciferase activity by miR-124 mimics in cell culture. CONCLUSIONS: Many miRNAs likely to regulate genes important for retinal function are present in the retina. Conservation of miRNA retinal expression patterns from rats to humans supports evidence from other tissues that disruption of miRNAs is a likely cause of a range of visual abnormalities.