984 resultados para speaker attribution


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we extend the concept of speaker annotation within a single-recording, or speaker diarization, to a collection wide approach we call speaker attribution. Accordingly, speaker attribution is the task of clustering expectantly homogenous intersession clusters obtained using diarization according to common cross-recording identities. The result of attribution is a collection of spoken audio across multiple recordings attributed to speaker identities. In this paper, an attribution system is proposed using mean-only MAP adaptation of a combined-gender UBM to model clusters from a perfect diarization system, as well as a JFA-based system with session variability compensation. The normalized cross-likelihood ratio is calculated for each pair of clusters to construct an attribution matrix and the complete linkage algorithm is employed to conduct clustering of the inter-session clusters. A matched cluster purity and coverage of 87.1% was obtained on the NIST 2008 SRE corpus.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we propose and evaluate a speaker attribution system using a complete-linkage clustering method. Speaker attribution refers to the annotation of a collection of spoken audio based on speaker identities. This can be achieved using diarization and speaker linking. The main challenge associated with attribution is achieving computational efficiency when dealing with large audio archives. Traditional agglomerative clustering methods with model merging and retraining are not feasible for this purpose. This has motivated the use of linkage clustering methods without retraining. We first propose a diarization system using complete-linkage clustering and show that it outperforms traditional agglomerative and single-linkage clustering based diarization systems with a relative improvement of 40% and 68%, respectively. We then propose a complete-linkage speaker linking system to achieve attribution and demonstrate a 26% relative improvement in attribution error rate (AER) over the single-linkage speaker linking approach.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Speaker attribution is the task of annotating a spoken audio archive based on speaker identities. This can be achieved using speaker diarization and speaker linking. In our previous work, we proposed an efficient attribution system, using complete-linkage clustering, for conducting attribution of large sets of two-speaker telephone data. In this paper, we build on our proposed approach to achieve a robust system, applicable to multiple recording domains. To do this, we first extend the diarization module of our system to accommodate multi-speaker (>2) recordings. We achieve this through using a robust cross-likelihood ratio (CLR) threshold stopping criterion for clustering, as opposed to the original stopping criterion of two speakers used for telephone data. We evaluate this baseline diarization module across a dataset of Australian broadcast news recordings, showing a significant lack of diarization accuracy without previous knowledge of the true number of speakers within a recording. We thus propose applying an additional pass of complete-linkage clustering to the diarization module, demonstrating an absolute improvement of 20% in diarization error rate (DER). We then evaluate our proposed multi-domain attribution system across the broadcast news data, demonstrating achievable attribution error rates (AER) as low as 17%.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This research makes a major contribution which enables efficient searching and indexing of large archives of spoken audio based on speaker identity. It introduces a novel technique dubbed as “speaker attribution” which is the task of automatically determining ‘who spoke when?’ in recordings and then automatically linking the unique speaker identities within each recording across multiple recordings. The outcome of the research will also have significant impact in improving the performance of automatic speech recognition systems through the extracted speaker identities.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Speaker diarization determines instances of the same speaker within a recording. Extending this task to a collection of recordings for linking together segments spoken by a unique speaker requires speaker linking. In this paper we propose a speaker linking system using linkage clustering and state-of-the-art speaker recognition techniques. We evaluate our approach against two baseline linking systems using agglomerative cluster merging (AC) and agglomerative clustering with model retraining (ACR). We demonstrate that our linking method, using complete-linkage clustering, provides a relative improvement of 20% and 29% in attribution error rate (AER), over the AC and ACR systems, respectively.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a clustering-only approach to the problem of speaker diarization to eliminate the need for the commonly employed and computationally expensive Viterbi segmentation and realignment stage. We use multiple linear segmentations of a recording and carry out complete-linkage clustering within each segmentation scenario to obtain a set of clustering decisions for each case. We then collect all clustering decisions, across all cases, to compute a pairwise vote between the segments and conduct complete-linkage clustering to cluster them at a resolution equal to the minimum segment length used in the linear segmentations. We use our proposed cluster-voting approach to carry out speaker diarization and linking across the SAIVT-BNEWS corpus of Australian broadcast news data. We compare our technique to an equivalent baseline system with Viterbi realignment and show that our approach can outperform the baseline technique with respect to the diarization error rate (DER) and attribution error rate (AER).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Provides brief summaries of school improvement legislation passed by the 85th session of the 84th Illinois General Assembly.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

"June 1995."

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This dissertation goes into the new field from applied linguistics called forensic linguistics, which studies the language as an evidence for criminal cases. There are many subfields within forensic linguistics, however, this study belongs to authorship attribution analysis, where the authorship of a text is attributed to an author through an exhaustive linguistic analysis. Within this field, this study analyzes the morphosyntactic and discursive-pragmatic variables that remain constant in the intra-variation or personal style of a speaker in the oral and written discourse, and at the same time have a high difference rate in the interspeaker variation, or from one speaker to another. The theoretical base of this study is the term used by professor Maria Teresa Turell called “idiolectal style”. This term establishes that the idiosyncratic choices that the speaker makes from the language build a style for each speaker that is constant in the intravariation of the speaker’s discourse. This study comes as a consequence of the problem appeared in authorship attribution analysis, where the absence of some known texts impedes the analysis for the attribution of the authorship of an uknown text. Thus, through a methodology based on qualitative analysis, where the variables are studied exhaustively, and on quantitative analysis, where the findings from qualitative analysis are statistically studied, some conclusions on the evidence of such variables in both oral and written discourses will be drawn. The results of this analysis will lead to further implications on deeper analyses where larger amount of data can be used.