Biblioteca Digital

Speaker verification is the process of verifying a person’s identity by analysing their speech. Automatic speaker verification (ASV) technology has several important applications: in the surveillance domain, suspect identification, tracking terrorists and detecting a person’s presence at a remote location; in the private sector, person authentication for phone banking and credit card transactions. Telephones and telephony networks provide a natural medium for these applications. The aim of this work is to improve the usefulness of ASV technology for practical applications in the presence of adverse conditions. In a telephony environment, background noise, handset mismatch, channel distortions, room acoustics and restrictions on the available testing and training data are common sources of errors for ASV systems. Two research themes were pursued to overcome these adverse conditions: modelling mismatch and modelling uncertainty. To address the performance degradation incurred through mismatched conditions, it was proposed to model this mismatch directly. Feature mapping was evaluated for combating handset mismatch and was extended with a blind clustering algorithm to remove the need for accurate handset labels for the training data. Mismatch modelling was then generalised by explicitly modelling the session conditions as a constrained offset of the speaker model means. This session variability modelling approach enabled the modelling of arbitrary sources of mismatch, including handset type, and halved the error rates in many cases. Methods to model the uncertainty in speaker model estimates and verification scores were developed to address the difficulties of limited training and testing data. The Bayes factor was introduced to account for the uncertainty of the speaker model estimates in testing by applying Bayesian theory to the verification criterion, with improved performance in matched conditions.
Modelling the uncertainty in the verification score itself met with significant success. Estimating a confidence interval for the "true" verification score enabled an order of magnitude reduction in the average quantity of speech required to make a confident verification decision based on a threshold. The confidence measures developed in this work may also have significant applications for forensic speaker verification tasks.
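
The idea of stopping as soon as a confidence interval on the score clears the threshold can be sketched as follows. This is only an illustrative sketch, not the method of the thesis: it assumes per-frame scores that are independent and roughly normal, and the function name `early_decision` and the parameters `z` and `min_frames` are hypothetical.

```python
import math


def early_decision(frame_scores, threshold, z=1.96, min_frames=10):
    """Consume frame scores one by one and stop once the confidence
    interval of the running mean no longer contains the threshold.

    Assumption (not from the source): frames are independent, so the
    standard error of the mean is sqrt(sample_variance / n).
    Returns (decision, frames_used).
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's update)
    for score in frame_scores:
        n += 1
        delta = score - mean
        mean += delta / n
        m2 += delta * (score - mean)
        if n < min_frames:
            continue  # too few frames for a usable variance estimate
        std_err = math.sqrt(m2 / (n - 1) / n)
        lower = mean - z * std_err
        upper = mean + z * std_err
        if lower > threshold:
            return "accept", n
        if upper < threshold:
            return "reject", n
    return "undecided", n


# Hypothetical frame scores: clearly above a threshold of 0.0
target_scores = [1.0, 1.2, 0.8, 1.1, 0.9] * 20
decision, frames_used = early_decision(target_scores, threshold=0.0)
```

In this toy run the decision is reached after the minimum number of frames rather than after all 100, which is the kind of saving in required speech the abstract describes. Real frame scores are correlated, so the independence assumption here would make the interval too narrow.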


Implementation and comparison of time-frequency representations for speech recognition

Relevance: 20.00%


Morphological scale-space with application to three-dimensional object recognition

Relevance: 20.00%


The professionalisation of podiatry : recognition of the need for continuing professional education

Relevance: 20.00%


Emotional competence of Chinese and Australian children : the recognition of facial expressions of emotion and the understanding of display rules

Relevance: 20.00%


Use of a scale pattern recognition system to discriminate between stocks of fish and its implications for the management of inland recreational fisheries

Relevance: 20.00%


Innovative and entrepreneurial activity in the public sector : the changing face of public sector institutions

Relevance: 20.00%

Abstract:

This paper investigates the elements which support innovative and entrepreneurial activity in New Zealand’s state-owned enterprises (SOEs). An inductive case study design, involving interview data, textual analysis, and observation, was applied to three SOEs. Findings reveal that those aspects typically associated with entrepreneurship, such as innovation, risk acceptance, pro-activeness and growth, are often supported by a number of unexpected elements within the public sector. These elements include culture, branding, operational excellence, cost efficiency, and knowledge transfer. The implications are twofold. First, innovative and entrepreneurial activity in the public sector can go beyond policy-making, with SOEs representing an important policy decision and sector of the New Zealand Government. Second, the impact of several SOEs on international markets suggests that competition on the global stage will increasingly come from both public and private sector organizations.


Probabilistic visual recognition of artificial landmarks for simultaneous localization and mapping

Relevance: 20.00%

Abstract:

Probabilistic robotics, most often applied to the problem of simultaneous localisation and mapping (SLAM), requires measures of uncertainty to accompany observations of the environment. This paper describes how uncertainty can be characterised for a vision system that locates coloured landmarks in a typical laboratory environment. The paper describes a model of the uncertainty in segmentation, the internal camera model and the mounting of the camera on the robot. It explains the implementation of the system on a laboratory robot, and provides experimental results that show the coherence of the uncertainty model.


The use of phase in complex spectrum subtraction for robust speech recognition

Relevance: 20.00%

Abstract:

In this paper we propose Complex Spectrum Subtraction (CSS), a new method that uses phase information to complement traditional magnitude-only spectral subtraction speech enhancement. The proposed approach has the following advantages over traditional magnitude-only spectral subtraction: (a) it introduces complementary information to the enhancement algorithm; (b) it reduces the total number of algorithmic parameters; and (c) it is designed for improving clean speech magnitude spectra and is therefore suitable for both automatic speech recognition (ASR) and speech perception applications. Oracle-based ASR experiments verify this approach, showing an average relative word accuracy improvement of 20% when accurate estimates of the phase spectrum are available. Based on sinusoidal analysis and assuming stationarity between observations (which is shown to be better approximated as the frame rate is increased), this paper also proposes a novel method for acquiring the phase information called Phase Estimation via Delay Projection (PEDEP). Further oracle ASR experiments validate the potential of the proposed PEDEP technique in ideal conditions. A realistic implementation of CSS with PEDEP shows performance comparable to state-of-the-art spectral subtraction techniques in a range of 15-20 dB signal-to-noise ratio environments. These results clearly demonstrate the potential for using phase spectra in spectral subtractive enhancement applications, and at the same time highlight the need for deriving more accurate phase estimates in a wider range of noise conditions.
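
The difference between magnitude-only subtraction and subtraction in the complex domain can be illustrated on a single frame. The sketch below is a toy example under oracle assumptions (the noise spectrum, including its phase, is known exactly); it is not the paper's CSS or PEDEP implementation, and the spectra and function names are hypothetical.

```python
import cmath
import math


def magnitude_subtraction(noisy, noise):
    """Magnitude-only spectral subtraction: subtract the noise magnitude
    per bin, floor at zero, and reuse the phase of the noisy spectrum."""
    estimate = []
    for y, n in zip(noisy, noise):
        magnitude = max(abs(y) - abs(n), 0.0)
        estimate.append(cmath.rect(magnitude, cmath.phase(y)))
    return estimate


def complex_subtraction(noisy, noise):
    """Subtraction in the complex domain, which needs the noise phase."""
    return [y - n for y, n in zip(noisy, noise)]


def spectral_error(estimate, clean):
    """Euclidean distance between two complex spectra."""
    return math.sqrt(sum(abs(e - c) ** 2 for e, c in zip(estimate, clean)))


# Hypothetical single-frame spectra with three frequency bins,
# given as (magnitude, phase in radians).
clean = [cmath.rect(1.0, 0.3), cmath.rect(0.5, -1.2), cmath.rect(2.0, 2.0)]
noise = [cmath.rect(0.4, 1.5), cmath.rect(0.3, 0.2), cmath.rect(0.5, -2.5)]
noisy = [c + n for c, n in zip(clean, noise)]  # additive noise model

error_magnitude_only = spectral_error(magnitude_subtraction(noisy, noise), clean)
error_complex = spectral_error(complex_subtraction(noisy, noise), clean)
```

With an exact noise spectrum, the complex-domain estimate recovers the clean spectrum up to rounding, while the magnitude-only estimate retains an error because magnitudes of complex numbers do not add linearly. In practice the noise phase must be estimated, which is the gap the abstract says PEDEP is meant to address.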


888 results for Face recognition makeup riconoscimento volto immagini trucco alterazione
