581 results for speaker diarization


Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper we propose inverting nonlinear distortions to improve the recognition rates of a speaker recognition system. We study the effect of saturation on the test signals, modeling real situations in which the training material was recorded under controlled conditions but the test signals show a mismatch in input signal level (saturation). The experimental results show that a combination of several strategies can improve the recognition rate for saturated test sentences from 80% to 89.39%, while the result for clean speech (without saturation) is 87.76% for one microphone.
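
As a rough illustration of the mismatch described above, the sketch below saturates a clean test tone. The abstract does not say how saturation was modeled, so hard clipping, the 440 Hz tone, the 8 kHz sample rate and the helper names `saturate` and `clipped_fraction` are all assumptions made for illustration. The paper's inversion strategies are not reproduced here.

```python
import math

def saturate(signal, limit):
    """Hard-clip every sample to the range [-limit, limit] (assumed saturation model)."""
    return [max(-limit, min(limit, x)) for x in signal]

def clipped_fraction(signal, limit):
    """Fraction of samples whose magnitude reaches or exceeds the clipping limit."""
    if not signal:
        return 0.0
    return sum(1 for x in signal if abs(x) >= limit) / len(signal)

# Hypothetical test tone: one second of a 440 Hz sine at 8 kHz, amplitude 1.0.
sample_rate = 8000
clean = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

# Simulate a test recording whose input level exceeds the recorder's range.
saturated = saturate(clean, 0.5)
print(f"samples at the clipping limit: {clipped_fraction(saturated, 0.5):.1%}")
```

A measure such as `clipped_fraction` could be used to flag test utterances that are likely to be saturated before they reach the recognizer.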

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The first part of the thesis presents the motivation for speaker recognition work, together with an exhaustive survey of past work in the field. A low-cost system that avoids complex computation was chosen for implementation, and a PC-based system was designed and developed to this end. A 12-bit front-end analog-to-digital converter (ADC) was built and interfaced to a PC. Software was developed to control the ADC and to perform various analytical functions, including feature vector evaluation. It is shown that a fixed set of phrases incorporating evenly balanced phonemes is well suited to the speaker recognition work at hand, and a set of phrases is chosen for recognition. Two new methods are adopted for feature evaluation. Some new measurements, involving a symmetry check method for pitch period detection and ACE‘, are used as features. Arguments are provided to show the need for a new model of speech production. Starting from heuristics, a knowledge-based (KB) speech production model is presented. In this model, a KB provides impulses to a voice-producing mechanism, and constant correction is applied via a feedback path. It is this correction that differs from speaker to speaker. Methods of defining measurable parameters for use as features are described. Algorithms for speaker recognition are developed and implemented, and two methods are presented. The first is based on the postulated model: the entropy of the utterance of a phoneme is evaluated, and the transitions of voiced regions are used as speaker-dependent features. The second method uses features found in other works but evaluates them differently. A knock-out scheme provides the weights for feature selection. Implementation results show an average recognition rate of 80%. It is also shown that performance deteriorates when there are long gaps between sessions, and that this deterioration is speaker dependent.
Cross-recognition percentages are also presented; these rise to 30% in the worst case and are 0% in the best case. Suggestions for further work are given in the concluding chapter.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

At present, various audio watermarking methods are available, most of them aimed at copyright protection and copy protection. This motivated the development of a speaker verification scheme that guarantees non-repudiation services, and this thesis is its outcome. The research presented in this thesis examines the field of audio watermarking, and the outcome is a speaker verification scheme that addresses issues related to non-repudiation to a great extent. This work aimed at developing novel audio watermarking schemes using the fundamental ideas of the Fast Fourier Transform (FFT) or the Fast Walsh-Hadamard Transform (FWHT). Mel-Frequency Cepstral Coefficients (MFCC), the best parametric representation of the acoustic signals, are employed along with a few other key acoustic characteristics in crafting the new schemes. The audio watermark created depends entirely on the acoustic features, hence the name FeatureMark, and is central to this work. In any watermarking scheme, the quality of the extracted watermark depends exclusively on the pre-processing, and in this work framing and windowing techniques are used. The theme of non-repudiation is of great significance in the audio watermarking schemes proposed in this work. Modification of the signal spectrum is achieved in a variety of ways by selecting appropriate FFT/FWHT coefficients, and the watermarking schemes were evaluated for imperceptibility, robustness and capacity. The proposed schemes are effective in terms of maintaining the sound quality, retrieving the embedded FeatureMark and the capacity to hold the mark bits. Robustness of these marking schemes is achieved with the help of synchronization codes: Barker code with the FFT-based FeatureMarking scheme and Walsh code with the FWHT-based FeatureMarking scheme.
Another important feature of this scheme is the use of an encryption scheme in preparing the FeatureMark, which scrambles the signal features and so keeps them unrevealed. A comparative study with existing watermarking schemes and the experiments evaluating imperceptibility, robustness and capacity support the claim that the proposed schemes can be baselined as efficient audio watermarking schemes. The four new digital audio watermarking algorithms perform remarkably well, opening further opportunities for research.
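
To show why a Barker code is useful as a synchronization marker, the sketch below locates a Barker sequence inside a bit stream by sliding correlation. The abstract only names Barker code as the synchronization code, so the 13-chip length, the bipolar (+1/-1) stream and the helper names `autocorrelation`, `correlate_at` and `find_sync` are assumptions for illustration, not the thesis's embedding or detection method.

```python
# Length-13 Barker sequence in bipolar form (assumed length; the abstract gives none).
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

def autocorrelation(code):
    """Aperiodic autocorrelation of a bipolar code for lags 0 .. len(code) - 1."""
    n = len(code)
    return [sum(code[i] * code[i + lag] for i in range(n - lag)) for lag in range(n)]

def correlate_at(stream, code, offset):
    """Correlation between the code and the stream window starting at offset."""
    window = stream[offset:offset + len(code)]
    return sum(s * c for s, c in zip(window, code))

def find_sync(stream, code=BARKER_13):
    """Return (offset, score) of the first window that correlates best with the code."""
    best_offset, best_score = -1, float("-inf")
    for offset in range(len(stream) - len(code) + 1):
        score = correlate_at(stream, code, offset)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score

# Hypothetical extracted bit stream: arbitrary bits, the sync code, then payload bits.
stream = [1, -1, -1, 1, -1, -1, 1] + BARKER_13 + [-1, 1, 1, -1]
offset, score = find_sync(stream)
print(f"sync code found at offset {offset} with correlation {score}")
```

The autocorrelation of a Barker code has a sharp peak at zero lag and sidelobes of magnitude at most 1, which is what makes the correlation peak easy to tell apart from partial overlaps.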

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this article, we examine the case of a system that cooperates with a “direct” user to plan an activity that some “indirect” user, not interacting with the system, should perform. The specific application we consider is the prescription of drugs. In this case, the direct user is the prescriber and the indirect user is the person who is responsible for carrying out the therapy. Relevant characteristics of the two users are represented in two user models. Explanation strategies are represented in planning operators whose preconditions encode the cognitive state of the indirect user; this allows the message to be tailored to the indirect user's characteristics. Expansion of optional subgoals and selection among candidate operators are made by applying decision criteria represented as metarules that negotiate between the direct and indirect users' views, also taking into account the context in which the explanation is provided. After the message has been generated, the direct user may ask to add or remove some items, or to change the message style. The system defends the indirect user's needs as far as possible by mentioning the rationale behind the generated message. If needed, the plan is repaired and the direct user model is revised accordingly, so that the system progressively learns to generate messages suited to the preferences of the people with whom it interacts.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This Forum challenges and problematizes the term incomplete acquisition, which has been widely used to describe the state of competence of heritage speaker (HS) bilinguals for well over a decade (see, e.g., Montrul, 2008). It is suggested and defended that HS competence, while often different from that of monolingual peers, is in fact not incomplete (given any reasonable definition of the word incomplete), but simply distinct for reasons related to the realities of their environment.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

It has been argued that colloquial dialects of Brazilian Portuguese (BP) have undergone significant linguistic change resulting in the loss of inflected infinitives (e.g., Pires, 2002, 2006). Since BP adults, at least educated ones, have complete knowledge of inflected infinitives, the implicit claim is that they are transmitted via formal education in the standard dialect. In the present article, I test one of the latent predictions of such claims, namely that heritage speakers of BP who lack formal education in the standard dialect should never develop native-like knowledge of inflected infinitives. In doing so, I highlight two significant implications: (a) that heritage speaker grammars are a good source for testing dialectal variation and language change proposals, and (b) that incomplete acquisition and/or attrition are not the only sources of heritage language competence differences. Employing the syntactic and semantic tests of Rothman and Iverson (2007), I compare heritage speakers' knowledge to that of Rothman and Iverson's advanced adult L2 learners and educated native controls. Unlike the latter groups, the heritage speakers do not show target knowledge of inflected infinitives, which lends support to Pires' claims and suggests that literacy plays a significant role in the acquisition of this grammatical property in BP.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This Forum challenges the generally accepted position in the linguistic sciences, whether conscious or not, that monolingualism and nativeness are essentially synonymous in an exclusive way. We discuss two consequences of our position that naturalistic bilinguals and multilinguals exposed to a language in early childhood are also native speakers: (i) that bi-/multilinguals have multiple native languages; and (ii) that nativeness can apply to a state of linguistic knowledge that is characterized by significant differences from the monolingual baseline.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Several studies of different bilingual groups, including L2 learners, child bilinguals, heritage speakers and L1 attriters, reveal similar performance on syntax-discourse interface properties such as anaphora resolution (Sorace, 2011 and references therein). Specifically, bilinguals seem to allow more optionality in the interpretation of overt subject pronouns in null subject languages such as Greek, Italian and Spanish, while their interpretation of null subject pronouns is indistinguishable from that of monolingual natives. Nevertheless, there is some evidence pointing to bilingualism effects on the interpretation of null subject pronouns too in heritage speakers’ grammars (Montrul, 2004), due to some form of ‘arrested’ development in this group of bilinguals. The present study investigates similarities and differences between two Greek–Swedish bilingual groups, heritage speakers and L1 attriters, in anaphora resolution of null and overt subject pronouns in Greek, using self-paced listening with a sentence-picture matching decision task at the end of each sentence. The two groups differ in crucial ways: the heritage speakers were simultaneous or early bilinguals, while the L1 attriters were adult learners of the second language, Swedish. Our findings reveal differences from monolingual preferences in the interpretation of the overt pronoun for both heritage and attrited speakers, while the differences attested between the two groups in the interpretation of null subject pronouns affect only response times, with heritage speakers being faster than attrited speakers. We argue that our results do not support age-of-onset or differential-input effects on bilingual performance in pronoun resolution.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Spanish captures the difference between eventive and stative passives via an obligatory choice between two copulas: verbal passives take the copula ser and adjectival passives take the copula estar. In this study, we compare and contrast US and Canadian heritage speakers (HSs) of Spanish on their knowledge of this difference in relation to copula choice in Spanish. The backgrounds of the target groups differ significantly from each other in that only one of them, the Canadian group, has grown up in a societally multilingual environment. We discuss the results as supporting two non-mutually exclusive explanatory factors: (a) French facilitates (bootstraps) the acquisition of eventive and stative passives and/or (b) the US/Canadian HS differences (e.g. status of bilingualism and the languages at stake) reflect the uniqueness of the language contact situations and the effects this has on the input HSs receive.