173 resultados para Commencement Speaker


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes techniques to improve the performance of i-vector based speaker verification systems when only short utterances are available. Short-length utterance i-vectors vary with speaker, session variations, and the phonetic content of the utterance. Well established methods such as linear discriminant analysis (LDA), source-normalized LDA (SN-LDA) and within-class covariance normalisation (WCCN) exist for compensating the session variation but we have identified the variability introduced by phonetic content due to utterance variation as an additional source of degradation when short-duration utterances are used. To compensate for utterance variations in short i-vector speaker verification systems using cosine similarity scoring (CSS), we have introduced a short utterance variance normalization (SUVN) technique and a short utterance variance (SUV) modelling approach at the i-vector feature level. A combination of SUVN with LDA and SN-LDA is proposed to compensate the session and utterance variations and is shown to provide improvement in performance over the traditional approach of using LDA and/or SN-LDA followed by WCCN. An alternative approach is also introduced using probabilistic linear discriminant analysis (PLDA) approach to directly model the SUV. The combination of SUVN, LDA and SN-LDA followed by SUV PLDA modelling provides an improvement over the baseline PLDA approach. We also show that for this combination of techniques, the utterance variation information needs to be artificially added to full-length i-vectors for PLDA modelling.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper analyses the probabilistic linear discriminant analysis (PLDA) speaker verification approach with limited development data. This paper investigates the use of the median as the central tendency of a speaker’s i-vector representation, and the effectiveness of weighted discriminative techniques on the performance of state-of-the-art length-normalised Gaussian PLDA (GPLDA) speaker verification systems. The analysis within shows that the median (using a median fisher discriminator (MFD)) provides a better representation of a speaker when the number of representative i-vectors available during development is reduced, and that further, usage of the pair-wise weighting approach in weighted LDA and weighted MFD provides further improvement in limited development conditions. Best performance is obtained using a weighted MFD approach, which shows over 10% improvement in EER over the baseline GPLDA system on mismatched and interview-interview conditions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In Australian Associated Motor Insurers Ltd v McPaul; Council of the City of Gold Coast v McPaul [2005] QSC 278 the applicant insurer sought an order requiring a claimant who had been injured in a motor vehicle accident some years earlier when he was five years old to commence a proceeding to determine the question of the applicant's liability to him. The applicant's interest in seeking the order was to avoid the prejudice which could follow from further delay, particularly delay until the respondent became obliged to commence proceedings to avoid a limitations bar.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes a combination of source-normalized weighted linear discriminant analysis (SN-WLDA) and short utterance variance (SUV) PLDA modelling to improve the short utterance PLDA speaker verification. As short-length utterance i-vectors vary with the speaker, session variations and phonetic content of the utterance (utterance variation), a combined approach of SN-WLDA projection and SUV PLDA modelling is used to compensate the session and utterance variations. Experimental studies have found that a combination of SN-WLDA and SUV PLDA modelling approach shows an improvement over baseline system (WCCN[LDA]-projected Gaussian PLDA (GPLDA)) as this approach effectively compensates the session and utterance variations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we propose a novel scheme for carrying out speaker diarization in an iterative manner. We aim to show that the information obtained through the first pass of speaker diarization can be reused to refine and improve the original diarization results. We call this technique speaker rediarization and demonstrate the practical application of our rediarization algorithm using a large archive of two-speaker telephone conversation recordings. We use the NIST 2008 SRE summed telephone corpora for evaluating our speaker rediarization system. This corpus contains recurring speaker identities across independent recording sessions that need to be linked across the entire corpus. We show that our speaker rediarization scheme can take advantage of inter-session speaker information, linked in the initial diarization pass, to achieve a 30% relative improvement over the original diarization error rate (DER) after only two iterations of rediarization.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present a novel scheme for improving speaker diarization by making use of repeating speakers across multiple recordings within a large corpus. We call this technique speaker re-diarization and demonstrate that it is possible to reuse the initial speaker-linked diarization outputs to boost diarization accuracy within individual recordings. We first propose and evaluate two novel re-diarization techniques. We demonstrate their complementary characteristics and fuse the two techniques to successfully conduct speaker re-diarization across the SAIVT-BNEWS corpus of Australian broadcast data. This corpus contains recurring speakers in various independent recordings that need to be linked across the dataset. We show that our speaker re-diarization approach can provide a relative improvement of 23% in diarization error rate (DER), over the original diarization results, as well as improve the estimated number of speakers and the cluster purity and coverage metrics.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a novel method for improving hierarchical speaker clustering in the tasks of speaker diarization and speaker linking. In hierarchical clustering, a tree can be formed that demonstrates various levels of clustering. We propose a ratio that expresses the impact of each cluster on the formation of this tree and use this to rescale cluster scores. This provides score normalisation based on the impact of each cluster. We use a state-of-the-art speaker diarization and linking system across the SAIVT-BNEWS corpus to show that our proposed impact ratio can provide a relative improvement of 16% in diarization error rate (DER).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This PhD research has provided novel solutions to three major challenges which have prevented the wide spread deployment of speaker recognition technology: (1) combating enrolment/ verification mismatch, (2) reducing the large amount of development and training data that is required and (3) reducing the duration of speech required to verify a speaker. A range of applications of speaker recognition technology from forensics in criminal investigations to secure access in banking will benefit from the research outcomes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Today, small-medium sized enterprises (SMEs) collectively contribute to the largest percentage of job creation in OECD countries. SMEs have become increasingly international since the turn of the century despite being smaller in size in comparison to large multinational firms, and notably, exporting is the most favoured mode of international market entry utilised by SMEs in their internationalisation strategy. Governments around the world have acknowledged the importance of export promotion and have employed policies that are targeted at increasing the export activity of SMEs. However, in many countries, the involvement of SMEs in export operations remains rather low. Within Australia, for example, only about one-third of local SMEs are exporting and this raises an important question as to why there is such a huge percentage of non-exporters. Much scholarly research that focuses on this problem has concentrated on the broad concept of 'export barriers' that act as obstacles to a firm's export development. This paper takes a different approach to previous studies and proposes that a firm's resistance to commence exporting can be better understood through an analysis of the behavioural decision process during its pre-export state. Using a sample of Australian SMEs, the factors that are important in preventing a firm’s initial export commencement decision are categorised and discussed through the use of factor analysis.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Experimental studies have found that when the state-of-the-art probabilistic linear discriminant analysis (PLDA) speaker verification systems are trained using out-domain data, it significantly affects speaker verification performance due to the mismatch between development data and evaluation data. To overcome this problem we propose a novel unsupervised inter dataset variability (IDV) compensation approach to compensate the dataset mismatch. IDV-compensated PLDA system achieves over 10% relative improvement in EER values over out-domain PLDA system by effectively compensating the mismatch between in-domain and out-domain data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes the addition of a weighted median Fisher discriminator (WMFD) projection prior to length-normalised Gaussian probabilistic linear discriminant analysis (GPLDA) modelling in order to compensate the additional session variation. In limited microphone data conditions, a linear-weighted approach is introduced to increase the influence of microphone speech dataset. The linear-weighted WMFD-projected GPLDA system shows improvements in EER and DCF values over the pooled LDA- and WMFD-projected GPLDA systems in inter-view-interview condition as WMFD projection extracts more speaker discriminant information with limited number of sessions/ speaker data, and linear-weighted GPLDA approach estimates reliable model parameters with limited microphone data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we introduce a novel domain-invariant covariance normalization (DICN) technique to relocate both in-domain and out-domain i-vectors into a third dataset-invariant space, providing an improvement for out-domain PLDA speaker verification with a very small number of unlabelled in-domain adaptation i-vectors. By capturing the dataset variance from a global mean using both development out-domain i-vectors and limited unlabelled in-domain i-vectors, we could obtain domain- invariant representations of PLDA training data. The DICN- compensated out-domain PLDA system is shown to perform as well as in-domain PLDA training with as few as 500 unlabelled in-domain i-vectors for NIST-2010 SRE and 2000 unlabelled in-domain i-vectors for NIST-2008 SRE, and considerable relative improvement over both out-domain and in-domain PLDA development if more are available.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a clustering-only approach to the problem of speaker diarization to eliminate the need for the commonly employed and computationally expensive Viterbi segmentation and realignment stage. We use multiple linear segmentations of a recording and carry out complete-linkage clustering within each segmentation scenario to obtain a set of clustering decisions for each case. We then collect all clustering decisions, across all cases, to compute a pairwise vote between the segments and conduct complete-linkage clustering to cluster them at a resolution equal to the minimum segment length used in the linear segmentations. We use our proposed cluster-voting approach to carry out speaker diarization and linking across the SAIVT-BNEWS corpus of Australian broadcast news data. We compare our technique to an equivalent baseline system with Viterbi realignment and show that our approach can outperform the baseline technique with respect to the diarization error rate (DER) and attribution error rate (AER).