936 results for Speech recognition


Relevance:

20.00%

Publisher:

Abstract:

Speech is often a multimodal process, presented audiovisually through a talking face. One area of speech perception influenced by visual speech is speech segmentation, or the process of breaking a stream of speech into individual words. Mitchel and Weiss (2013) demonstrated that a talking face contains specific cues to word boundaries and that subjects can correctly segment a speech stream when given a silent video of a speaker. The current study expanded upon these results, using an eye tracker to identify highly attended facial features of the audiovisual display used in Mitchel and Weiss (2013). In Experiment 1, subjects were found to spend the most time watching the eyes and mouth, with a trend suggesting that the mouth was viewed more than the eyes. Although subjects displayed significant learning of word boundaries, performance was not correlated with gaze duration on any individual feature, nor was performance correlated with a behavioral measure of autistic-like traits. However, trends suggested that as autistic-like traits increased, gaze duration on the mouth increased and gaze duration on the eyes decreased, similar to significant trends seen in autistic populations (Boraston & Blakemore, 2007). In Experiment 2, the same video was modified so that a black bar covered the eyes or mouth. Both videos elicited learning of word boundaries that was equivalent to that seen in the first experiment. Again, no correlations were found between segmentation performance and SRS (Social Responsiveness Scale) scores in either condition. These results, taken with those in Experiment 1, suggest that neither the eyes nor mouth are critical to speech segmentation and that perhaps more global head movements indicate word boundaries (see Graf, Cosatto, Strom, & Huang, 2002). Future work will elucidate the contribution of individual features relative to global head movements, as well as extend these results to additional types of speech tasks.

Relevance:

20.00%

Publisher:

Abstract:

Telephone communication is a challenge for many hearing-impaired individuals. One important technical reason for this difficulty is the restricted frequency range (0.3-3.4 kHz) of conventional landline telephones. Internet telephony (voice over Internet protocol [VoIP]) is transmitted with a larger frequency range (0.1-8 kHz) and therefore includes more frequencies relevant to speech perception. According to a recently published, laboratory-based study, the theoretical advantage of ideal VoIP conditions over conventional telephone quality has translated into improved speech perception by hearing-impaired individuals. However, the speech perception benefits of nonideal VoIP network conditions, which may occur in daily life, have not been explored. VoIP use cannot be recommended to hearing-impaired individuals before its potential under more realistic conditions has been examined.

Relevance:

20.00%

Publisher:

Abstract:

Speech is typically a multimodal phenomenon, yet few studies have focused on the exclusive contributions of visual cues to language acquisition. To address this gap, we investigated whether visual prosodic information can facilitate speech segmentation. Previous research has demonstrated that language learners can use lexical stress and pitch cues to segment speech and that learners can extract this information from talking faces. Thus, we created an artificial speech stream that contained minimal segmentation cues and paired it with two synchronous facial displays in which visual prosody was either informative or uninformative for identifying word boundaries. Across three familiarisation conditions (audio stream alone, facial streams alone, and paired audiovisual), learning occurred only when the facial displays were informative about word boundaries, suggesting that facial cues can help learners solve the early challenges of language acquisition.
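Segmentation studies like these typically rest on statistical learning over syllable transitional probabilities: transitions within a word are far more predictable than transitions across a word boundary, so dips in transitional probability mark likely boundaries. A minimal sketch of that idea, with an invented syllable inventory, invented words, and an arbitrary threshold (none of these are the actual stimuli of the studies above):

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) from a syllable stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}

def segment(syllables, tps, threshold=0.9):
    """Insert a word boundary wherever the transitional probability dips below threshold."""
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

# Hypothetical "language": three trisyllabic words repeated in varying order.
stream = ("pa bi ku  go la tu  da ro pi  go la tu  pa bi ku  da ro pi  pa bi ku  go la tu").split()
tps = transitional_probabilities(stream)
print(segment(stream, tps))
# → ['pabiku', 'golatu', 'daropi', 'golatu', 'pabiku', 'daropi', 'pabiku', 'golatu']
```

Within-word transitions (e.g. pa→bi) have probability 1.0 in this toy stream, while cross-boundary transitions fall below the threshold, so the learner recovers the three words without any acoustic cue.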

Relevance:

20.00%

Publisher:

Abstract:

The aim was to investigate the effect of different speech tasks, i.e. recitation of prose (PR), alliteration (AR) and hexameter (HR) verses and a control task (mental arithmetic (MA) with voicing of the result), on end-tidal CO2 (PETCO2), cerebral hemodynamics and oxygenation. CO2 levels in the blood are known to strongly affect cerebral blood flow. Speech changes the breathing pattern and may affect CO2 levels. Measurements were performed on 24 healthy adult volunteers during the performance of the 4 tasks. Tissue oxygen saturation (StO2) and absolute concentrations of oxyhemoglobin ([O2Hb]), deoxyhemoglobin ([HHb]) and total hemoglobin ([tHb]) were measured by functional near-infrared spectroscopy (fNIRS), and PETCO2 by a gas analyzer. Statistical analysis was applied to the differences between the baseline before the task, the 2 recitation periods, and the 5 baseline periods after the task. The 2 brain hemispheres and 4 tasks were tested separately. A significant decrease in PETCO2 was found during all 4 tasks, with the smallest decrease during the MA task. During the recitation tasks, a statistically significant (p < 0.05) decrease in StO2 occurred during PR and AR in the right prefrontal cortex (PFC) and during AR and HR in the left PFC. [O2Hb] decreased significantly during PR, AR and HR in both hemispheres. [HHb] increased significantly during the AR task in the right PFC. [tHb] decreased significantly during HR in the right PFC and during PR, AR and HR in the left PFC. During the MA task, StO2 increased and [HHb] decreased significantly. We conclude that changes in breathing (hyperventilation) during the tasks led to lower CO2 pressure in the blood (hypocapnia), which was predominantly responsible for the measured changes in cerebral hemodynamics and oxygenation.
In conclusion, our findings demonstrate that PETCO2 should be monitored during functional brain studies investigating speech with neuroimaging modalities such as fNIRS and fMRI, to ensure a correct interpretation of changes in hemodynamics and oxygenation.
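The baseline-versus-task comparison described above can be illustrated with a minimal sketch. The data here are simulated (the subject count matches the study, but the PETCO2 values and effect size are invented for illustration), and the paired t-statistic is computed directly from its definition rather than with a statistics library:

```python
import math
import random

def paired_t(before, after):
    """Paired t-statistic over per-subject (baseline, task) means."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance of the differences
    return mean / math.sqrt(var / n)

# Simulated data: 24 subjects whose PETCO2 (mmHg) drops by roughly 4 mmHg
# during recitation, i.e. task-induced hyperventilation causing hypocapnia.
random.seed(0)
baseline = [random.gauss(38.0, 1.5) for _ in range(24)]
task = [b - random.gauss(4.0, 1.0) for b in baseline]
t = paired_t(baseline, task)
print(round(t, 2))  # strongly negative: PETCO2 falls reliably during the task
```

A strongly negative t-statistic here corresponds to the significant PETCO2 decrease the study reports across all four tasks.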

Relevance:

20.00%

Publisher:

Abstract:

From the moment of their birth, a person's life is determined by their sex. Ms. Goroshko wants to know why this difference is so striking, why society is so concerned to sustain it, and how it is able to persist even when certain national or behavioural stereotypes are erased between people. She is convinced of the existence of not only social, but biological differences between men and women, and set herself the task, in a manuscript totalling 126 pages, written in Ukrainian and including extensive illustrations, of analysing these distinctions as they are manifested in language. She points out that, even before 1900, certain stylistic differences between the ways that men and women speak had been noted. Since then it has become possible, for instance in the case of Japanese, to point to examples of male and female sub-languages. In general, one can single out the following characteristics. Males tend to write with less fluency, to refer to events in a verb-phrase, to be time-oriented, to involve themselves more in their references to events, to locate events in their personal sphere of activity, and to refer less to others. Therefore, concludes Ms Goroshko, the male is shown to be more active, more ego-involved in what he does, and less concerned about others. Women, in contrast, were more fluent, referred to events in a noun-phrase, were less time-oriented, tended to be less involved in their event-references, locate events within their interactive community and refer more to others. They spent much more time discussing personal and domestic subjects, relationship problems, family, health and reproductive matters, weight, food and clothing, men, and other women. As regards discourse strategies, Ms Goroshko notes the following. 
Men more often begin a conversation, they make more utterances, these utterances are longer, they make more assertions, speak less carefully, generally determine the topic of conversation, speak more impersonally, use more vulgar expressions, and use fewer diminutives and more imperatives. Women's speech strategies, apart from being the opposite of those enumerated above, also contain more euphemisms, polite forms, apologies, laughter and crying. All of the above leads Ms. Goroshko to conclude that the differences between male and female speech forms are more striking than the similarities. Furthermore, she is convinced that the biological divergence between the sexes is what generates the verbal divergence, and that social factors can only intensify or diminish the differentiation in verbal behaviour established by the sex of a person. Bearing all this in mind, Ms Goroshko set out to construct a grammar of male and female styles of speaking within Russian. One of her most important research tools was a certain type of free association test. She took a list comprising twelve stimuli (to love, to have, to speak, to fuck, a man, a woman, a child, the sky, a prayer, green, beautiful) and gave it to a group of participants specially selected, according to preliminary psychological testing, for the high levels of masculinity or femininity they displayed. Preliminary responses revealed that the female reactions were more diverse than the male ones; there were more sentences and word combinations in the female reactions; men gave more negative responses to the stimuli and sometimes did not want to react at all; women reacted more to adjectives and men to nouns; and, surprisingly, women's reactions to the words a man, to love and a child were more negatively coloured (Ms. Goroshko is inclined to attribute this to the present economic situation in Russia). Another test performed by Ms. Goroshko was the so-called "defective text" developed by A.A. Brudny.
All participants were given packets of complete sentences, which had been taken from a text and then mixed at random. The task was to reconstruct the original text. There were three types of test, the first descriptive, the second narrative, and the third logical. Ms. Goroshko created computer programmes to analyse the results. She found that none of the reconstructed texts was coincident with the original, differing both from the original text and amongst themselves, and that there were many more disparities in the male than in the female texts. In the descriptive and logical texts the differences manifested themselves more clearly in the male texts, and in the narrative texts in the female texts. The widest dispersal of values was observed at the outset, while the female text ending was practically coincident with the original (in contrast to the male ending). The greatest differences in text reconstruction for both males and females were registered in the middle of the texts. Women, Ms. Goroshko claims, were more sensitive to the semantic structure of the texts, since they assembled the narrative text much more accurately than the other two, while the men assembled more accurately the logical text. Texts written by women were assembled more accurately by women, and texts by men by men. On the basis of computer analysis, Ms. Goroshko found that female speech was substantially more emotional. This was expressed by various means: hyperbole, metaphor, comparisons, epithets, enumeration, and the use of interjections, rhetorical questions and exclamations. The level of literacy was higher for female speech, and there were fewer mistakes in grammar and spelling in female texts. The last stage of Ms Goroshko's research concerned the social stereotypes of beliefs about men and women in Russian society today. A large number of respondents were asked questions such as "What merits must a woman possess?", "What are male vices and virtues?", etc.
After statistical manipulation, an image of modern man and woman, as it exists in the minds of modern Russian men and women, emerged. Ms. Goroshko believes that her findings are significant not only within the field of linguistics. She has already successfully worked on anonymous texts and been able to decide on the sex of the author and consequently believes that in the future her research may even be of benefit to forensic science.

Relevance:

20.00%

Publisher:

Abstract:

Nadeina set out to develop methods of speech development in Russian as a mother tongue, focusing on improving diction, training in voice quality control, intonation control, the removal of dialect, and speech etiquette. She began with training in the receptive skills of language, i.e. reading and listening, since the interpretation of someone else's language plays an important role in language production. Her studies of students' reading speed showed that it varies between 40 and 120 words per minute, which is normally considered very slow. She discovered a strong correlation between speed of reading and speaking skills: the slower a person reads, the worse their ability to speak; she has therefore designed exercises to improve reading skills. Nadeina also believes that listening to other people's speech is very important, both to analyse its content and in some cases as an example, so listening skills need to be developed. Many people have poor pronunciation habits acquired as children. On the basis of speech samples from young Russians (male and female, aged 17-22), Nadeina analysed the commonest speech faults - nasalisation, hesitation and hemming at the end of sense-groups, etc. Using a group of twenty listeners, she looked for a correlation between how voice quality is perceived and certain voice quality parameters, e.g. pitch range, tremulousness, fluency, whispering, harshness, sonority, tension and audible breath. She found that the fewer non-linguistic segment variations appeared in the speech, the more attractive the speech was rated. The results are included in a textbook aimed at helping people to improve their oral skills and to communicate ideas to an audience. She believes this will assist Russian officials in their attempts to communicate their ideas to different social spheres, and also foreigners learning Russian.

Relevance:

20.00%

Publisher:

Abstract:

BACKGROUND AND OBJECTIVE: In the Swiss version of the Freiburg speech intelligibility test, five test words from the original German recording that are rarely used in Switzerland have been exchanged. Furthermore, differences in the transfer functions between headphone and loudspeaker presentation are not taken into account during calibration. New settings for the levels of the individual test words in the recommended recording and small changes in calibration procedures led us to verify the currently used normative values. PATIENTS AND METHODS: Speech intelligibility was measured in 20 subjects with normal hearing using monosyllabic words and numbers via headphones and loudspeakers. RESULTS: On average, 50% speech intelligibility was reached at levels which were 7.5 dB lower under free-field conditions than for headphone presentation. The average difference between numbers and monosyllabic words was found to be 9.6 dB, which is considerably lower than the 14 dB of the current normative curves. CONCLUSIONS: There is good agreement between our measurements and the normative values for tests using monosyllabic words and headphones, but not for numbers or free-field measurements.
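The 50% intelligibility level reported above is typically read off a measured discrimination curve (presentation level versus percentage of words repeated correctly). A minimal sketch of that interpolation, using entirely hypothetical curves (the levels and percentages below are illustrative, not the study's data):

```python
def srt50(levels_db, intelligibility_pct):
    """Linearly interpolate the presentation level at which 50% intelligibility is reached."""
    points = list(zip(levels_db, intelligibility_pct))
    for (l0, p0), (l1, p1) in zip(points, points[1:]):
        if p0 <= 50 <= p1:
            return l0 + (50 - p0) * (l1 - l0) / (p1 - p0)
    raise ValueError("50% point not bracketed by the measurements")

# Hypothetical discrimination curves: presentation level (dB) -> % words correct.
headphone = srt50([10, 15, 20, 25, 30], [5, 20, 45, 75, 95])
free_field = srt50([5, 10, 15, 20, 25], [10, 30, 55, 80, 95])
print(round(headphone - free_field, 1))  # the free-field 50% point is lower
```

In the study, this kind of headphone-versus-free-field difference came out at 7.5 dB on average, which is why the two presentation modes need separate normative curves.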

Relevance:

20.00%

Publisher:

Abstract:

Speech melody or prosody subserves linguistic, emotional, and pragmatic functions in speech communication. Prosodic perception is based on the decoding of acoustic cues, with a predominant function of frequency-related information perceived as the speaker's pitch. Evaluation of prosodic meaning is a cognitive function implemented in cortical and subcortical networks that generate continuously updated affective or linguistic speaker impressions. Various brain-imaging methods allow delineation of the neural structures involved in prosody processing. In contrast to functional magnetic resonance imaging techniques, DC (direct current, slow) components of the EEG directly measure cortical activation without temporal delay. Activation patterns obtained with this method are highly task specific and intraindividually reproducible. The studies presented here investigated the topography of prosodic stimulus processing as a function of acoustic stimulus structure and of linguistic or affective task demands. Data obtained from measuring DC potentials demonstrated that the right hemisphere has a predominant role in processing emotions from the tone of voice, irrespective of emotional valence. However, right hemisphere involvement is modulated by diverse speech and language-related conditions that are associated with a left hemisphere participation in prosody processing. The degree of left hemisphere involvement depends on several factors, such as (i) articulatory demands on the perceiver of prosody (possibly, also the poser), (ii) a relative left hemisphere specialization in processing temporal cues mediating prosodic meaning, and (iii) the propensity of prosody to act on the segment level in order to modulate word or sentence meaning. The specific role of top-down effects in terms of either linguistically or affectively oriented attention on lateralization of stimulus processing is not clear and requires further investigation.

Relevance:

20.00%

Publisher:

Abstract:

CONCLUSIONS: Speech understanding is better with the Baha Divino than with the Baha Compact in competing noise from the rear. No difference was found for speech understanding in quiet. Subjectively, overall sound quality and speech understanding were rated better for the Baha Divino. OBJECTIVES: To compare speech understanding in quiet and in noise and subjective ratings for two different bone-anchored hearing aids: the recently developed Baha Divino and the Baha Compact. PATIENTS AND METHODS: Seven adults with bilateral conductive or mixed hearing losses who were users of a bone-anchored hearing aid were tested with the Baha Compact in quiet and in noise. Tests were repeated after 3 months of use with the Baha Divino. RESULTS: There was no significant difference between the two types of Baha for speech understanding in quiet when tested with German numbers and monosyllabic words at presentation levels between 50 and 80 dB. For speech understanding in noise, an advantage of 2.3 dB for the Baha Divino vs the Baha Compact was found, if noise was emitted from a loudspeaker to the rear of the listener and the directional microphone noise reduction system was activated. Subjectively, the Baha Divino was rated statistically significantly better in terms of overall sound quality.

Relevance:

20.00%

Publisher:

Abstract:

Speech coding might have an impact on the music perception of cochlear implant users. This questionnaire study compares the musical activities and music perception of postlingually deafened cochlear implant users with three different coding strategies (CIS, ACE, SPEAK), using the Munich Music Questionnaire. Overall, the self-reported perception of music by CIS, SPEAK, and ACE users did not differ substantially.

Relevance:

20.00%

Publisher:

Abstract:

Open-ended interviews of 90 min length with 38 patients were analyzed with respect to speech stylistics, shown by Schucker and Jacobs to differentiate individuals with type A personality features from those with type B. In our patients, type A/B had been assessed by the Bortner Personality Inventory. The stylistics studied were: repeated words, swallowed words, interruptions, simultaneous speech, silence latency between question and answer (SL), speed of speech, uneven speed of speech (USS), explosive words (PW), uneven speech volume (USV), and speech volume. Correlations between both raters were high for all speech categories. Positive correlations between extent of type A and SL (r = 0.33; p = 0.022), USS (r = 0.51; p = 0.002), PW (r = 0.46; p = 0.003) and USV (r = 0.39; p = 0.012) were found. Our results indicate that the speech of type A individuals in nonstress open-ended interviews tends to show higher emotional tension (positive correlations for USS, PW and USV) and is more controlled in conversation (positive correlation for SL).
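The r values reported above are Pearson product-moment correlation coefficients between type A scores and rated speech stylistics. A minimal sketch of that computation from its definition, with hypothetical scores (the numbers below are invented for illustration, not the study's data):

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical example: type A scores vs. silence-latency ratings for 8 patients.
type_a = [3, 5, 2, 8, 7, 4, 6, 9]
latency = [2, 4, 3, 7, 6, 3, 5, 8]
print(round(pearson_r(type_a, latency), 2))  # → 0.96
```

A positive r, as in the study's SL result, means higher type A scores go together with longer silence latencies; the study additionally reports p-values, which require a significance test on top of r.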