581 results for Speaker
Abstract:
General note: Title and date provided by Bettye Lane.
Abstract:
General note: Title and date provided by Bettye Lane.
Abstract:
General note: Title and date provided by Bettye Lane.
Abstract:
Inscriptions: Verso: [stamped] Credit must be given to Freda Leinwand from Monkmeyer Press Photo Service.
Abstract:
Situational awareness is achieved naturally by the human senses of sight and hearing in combination. Automatic scene understanding aims at replicating this human ability using microphones and cameras in cooperation. In this paper, audio and video signals are fused and integrated at different levels of semantic abstraction. We detect and track a speaker who is relatively unconstrained, i.e., free to move indoors within an area larger than in comparable reported work, which is usually limited to round table meetings. The system is relatively simple, consisting of just 4 microphone pairs and a single camera. Results show that the overall multimodal tracker is more reliable than single-modality systems, tolerating large occlusions and cross-talk. System evaluation is performed on both single and multi-modality tracking. The performance improvement given by the audio–video integration and fusion is quantified in terms of tracking precision and accuracy as well as speaker diarisation error rate and precision–recall (recognition). Improvements vs. the closest works are evaluated: 56% sound source localisation computational cost over an audio-only system, 8% speaker diarisation error rate over an audio-only speaker recognition unit and 36% on the precision–recall metric over an audio–video dominant speaker recognition method.
Abstract:
Phonation distortion leaves relevant marks in a speaker's biometric profile. Dysphonic voice production may be used for biometrical speaker characterization. In the present paper, phonation features derived from the glottal source (GS) parameterization, after vocal tract inversion, are proposed for dysphonic voice characterization in Speaker Verification tasks. The glottal source derived parameters are matched in a forensic evaluation framework defining a distance-based metric specification. The phonation segments used in the study are derived from fillers, long vowels, and other phonation segments produced in spontaneous telephone conversations. Phonated segments from a telephonic database of 100 male Spanish native speakers are combined in a 10-fold cross-validation task to produce the set of quality measurements outlined in the paper. Shimmer, mucosal wave correlate, vocal fold cover biomechanical parameter unbalance and a subset of the GS cepstral profile produce accuracy rates as high as 99.57 for a wide threshold interval (62.08-75.04%). An Equal Error Rate of 0.64% can be granted. The proposed metric framework is shown to behave more fairly than classical likelihood ratios in supporting the hypothesis of the defense vs. that of the prosecution, thus offering a more reliable evaluation scoring. Possible applications are Speaker Verification and Dysphonic Voice Grading.
Abstract:
This study presents reports and reflections on gender and its interfaces with work, power and women's political participation within the Bororo indigenous communities in Mato Grosso and the Guarani/Kaiowá and Kadiwéu communities in Mato Grosso do Sul. In the study with the Bororo community, the woman is valued because she represents the guardian of the culture and of traditional knowledge and, at the same time, is an important speaker for the Bororo and for non-indigenous society. In the case of the Guarani/Kaiowá community, the most important facts are, on one side, the departure of the men and their wish to become city men and, on the other, the women who wish or need to keep the Guarani identity and live in the reserve. In the Kadiwéu community, the most important fact is the women's political power and a division of roles between men and women, without more value being attributed to one role or the other.
Abstract:
Sound source localization (SSL) is an essential task in many applications involving speech capture and enhancement. As such, speaker localization with microphone arrays has received significant research attention. Nevertheless, existing SSL algorithms for small arrays still have two significant limitations: lack of range resolution, and accuracy degradation with increasing reverberation. The latter is natural and expected, given that strong reflections can have amplitudes similar to that of the direct signal, but different directions of arrival. Therefore, correctly modeling the room and compensating for the reflections should reduce the degradation due to reverberation. In this paper, we show a stronger result. If modeled correctly, early reflections can be used to provide more information about the source location than would have been available in an anechoic scenario. The modeling not only compensates for the reverberation, but also significantly increases resolution for range and elevation. Thus, we show that under certain conditions and limitations, reverberation can be used to improve SSL performance. Prior attempts to compensate for reverberation tried to model the room impulse response (RIR). However, RIRs change quickly with speaker position, and are nearly impossible to track accurately. Instead, we build a 3-D model of the room, which we use to predict early reflections, which are then incorporated into the SSL estimation. Simulation results with real and synthetic data show that even a simplistic room model is sufficient to produce significant improvements in range and elevation estimation, tasks which would be very difficult when relying only on direct path signal components.
Abstract:
The paper disputes two influential claims in the Romance Linguistics literature. The first is that the synthetic future tenses in spoken Western Romance are now rivalled, if not supplanted, as temporal functors by the more recently developed GO futures. The second is that these synthetic futures now have modal rather than temporal meanings in spoken Romance. These claims are seen as reflecting a universal cycle of diachronic change, in which verb forms originally expressing modal (or aspectual) values take on future temporal reference, becoming tenses. The new modal meanings supplant the temporal, which are then taken up by new forms. Challenges to this theory for French are raised on the basis of empirical evidence of two sorts. Positively, future tenses in spoken Romance continue to be used with temporal meaning. Negatively, evidence of modal meaning for these forms is lacking. The evidence comes from corpora of spoken French, native speaker judgements and verb data from a daily broadsheet. Cumulatively, it points to the reverse of the claims noted above: the synthetic future in spoken French has temporal but little modal meaning.
Abstract:
Contrary to the common pattern of spatial terms being metaphorically extended to location in time, the Australian language Jingulu shows an unusual extension of temporal markers to indicate location in space. Light verbs, which typically encode tense, aspect, mood and associated motion, are occasionally found on nouns to indicate the relative location of the referent with respect to the speaker. It is hypothesised that this pattern resulted from the reduction of verbal clauses used as relative modifiers to the nouns in question.
Abstract:
Using the framework of communication accommodation theory, the authors examined the effects of convergence and maintenance on evaluations of Chinese and Australian students. In Study 1, Australian students judged interactions between an Anglo-Australian and another interactant who either maintained his speech style or converged. Results indicated that participants were aware of convergence but that speaker ethnicity (Anglo-Australian, Chinese Australian or Chinese national) was a stronger influence on evaluations and future intentions to interact with the speaker. In Study 2, Australian students judged Chinese speakers who maintained communication style or converged on interpersonal speech markers, intergroup markers, or both types of markers. Results indicated that the more participants defined themselves in intergroup terms, the more positively they judged intergroup convergence relative to interpersonal convergence and maintenance. This points to the importance of distinguishing between convergence on interpersonal and intergroup speech markers, and underlines the role of individual differences in the evaluation of convergence.
Abstract:
Measures of vocal intensity, frequency and harshness were compared for 19 hearing-impaired and 21 normal-hearing people over 60 years of age. Significantly greater comfortable intensity levels were found in the hearing-impaired group, but the other measures of frequency and harshness were not significantly different. A large proportion of the subjects in both groups reported a history of gastro-oesophageal reflux (GER), a condition associated with vocal fold pathology and hoarseness. Comparison of the GER and non-GER subjects on the measures of vocal function showed that female GER speakers exhibited a lower frequency on the vowel /u/ than the non-GER subjects. Clinicians need to be aware of the effect of highly prevalent disorders such as hearing impairment and GER on the voices of elderly speakers.
Abstract:
This four-experiment series sought to evaluate the potential of children with neurosensory deafness and cochlear implants to exhibit auditory-visual and visual-visual stimulus equivalence relations within a matching-to-sample format. Twelve children who became deaf prior to acquiring language (prelingual) and four who became deaf afterwards (postlingual) were studied. All children learned auditory-visual conditional discriminations and nearly all showed emergent equivalence relations. Naming tests, conducted with a subset of the children, showed no consistent relationship to the equivalence-test outcomes. This study makes several contributions to the literature on stimulus equivalence. First, it demonstrates that both pre- and postlingually deaf children can acquire auditory-visual equivalence relations after cochlear implantation, thus demonstrating symbolic functioning. Second, it directs attention to a population that may be especially interesting for researchers seeking to analyze the relationship between speaker and listener repertoires. Third, it demonstrates the feasibility of conducting experimental studies of stimulus control processes within the limitations of a hospital, which these children must visit routinely for the maintenance of their cochlear implants.
Abstract:
Teachers who are new to the country often find themselves as 'the stranger' in their own classroom. Languages education is one area where such overseas-educated teachers are common. The study reported here investigated what cultural factors might influence the classroom performance of such teachers. The early classroom experience of beginning Japanese native speaker teachers and trainees was examined to this end.