104 results for automatic speech recognition
Abstract:
Spectral peak resolution was investigated in normal-hearing (NH), hearing-impaired (HI), and cochlear implant (CI) listeners. The task involved discriminating between two rippled noise stimuli in which the frequency positions of the log-spaced peaks and valleys were interchanged. The ripple spacing was varied adaptively from 0.13 to 11.31 ripples/octave, and the minimum ripple spacing at which a reversal in peak and trough positions could be detected was determined as the spectral peak resolution threshold for each listener. Spectral peak resolution was best, on average, in NH listeners, poorest in CI listeners, and intermediate for HI listeners. There was a significant relationship between spectral peak resolution and both vowel and consonant recognition in quiet across the three listener groups. The results indicate that the degree of spectral peak resolution required for accurate vowel and consonant recognition in quiet backgrounds is around 4 ripples/octave, and that spectral peak resolution poorer than around 1–2 ripples/octave may result in highly degraded speech recognition. These results suggest that efforts to improve spectral peak resolution for HI and CI users may lead to improved speech recognition.
Abstract:
The purpose of this study was to explore the potential advantages, both theoretical and applied, of preserving low-frequency acoustic hearing in cochlear implant patients. Several hypotheses are presented that predict that residual low-frequency acoustic hearing along with electric stimulation for high frequencies will provide an advantage over traditional long-electrode cochlear implants for the recognition of speech in competing backgrounds. A simulation experiment in normal-hearing subjects demonstrated a clear advantage for preserving low-frequency residual acoustic hearing for speech recognition in a background of other talkers, but not in steady noise. Three subjects with an implanted "short-electrode" cochlear implant and preserved low-frequency acoustic hearing were also tested on speech recognition in the same competing backgrounds and compared to a larger group of traditional cochlear implant users. Each of the three short-electrode subjects performed better than any of the traditional long-electrode implant subjects for speech recognition in a background of other talkers, but not in steady noise, in general agreement with the simulation studies. When compared to a subgroup of traditional implant users matched according to speech recognition ability in quiet, the short-electrode patients showed a 9-dB advantage in the multitalker background. These experiments provide strong preliminary support for retaining residual low-frequency acoustic hearing in cochlear implant patients. The results are consistent with the idea that better perception of voice pitch, which can aid in separating voices in a background of other talkers, was responsible for this advantage.
Abstract:
The purpose of the present study was to examine the benefits of providing audible speech to listeners with sensorineural hearing loss when the speech is presented in a background noise. Previous studies have shown that when listeners have a severe hearing loss in the higher frequencies, providing audible speech (in a quiet background) to these higher frequencies usually results in no improvement in speech recognition. In the present experiments, speech was presented in a background of multitalker babble to listeners with various severities of hearing loss. The signal was low-pass filtered at numerous cutoff frequencies and speech recognition was measured as additional high-frequency speech information was provided to the hearing-impaired listeners. It was found in all cases, regardless of hearing loss or frequency range, that providing audible speech resulted in an increase in recognition score. The change in recognition as the cutoff frequency was increased, along with the amount of audible speech information in each condition (articulation index), was used to calculate the "efficiency" of providing audible speech. Efficiencies were positive for all degrees of hearing loss. However, the gains in recognition were small, and the maximum score obtained by any listener was low, due to the noise background. An analysis of error patterns showed that due to the limited speech audibility in a noise background, even severely impaired listeners used additional speech audibility in the high frequencies to improve their perception of the "easier" features of speech, including voicing.
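The "efficiency" described in this abstract relates the change in recognition score to the change in audible speech information (Articulation Index) between successive low-pass cutoff conditions. A minimal sketch of that ratio, using hypothetical scores and AI values rather than the study's actual data:

```python
def efficiency(scores, ai_values):
    """Efficiency of added audibility: change in recognition score per
    unit change in Articulation Index between successive conditions."""
    effs = []
    for i in range(1, len(scores)):
        d_score = scores[i] - scores[i - 1]
        d_ai = ai_values[i] - ai_values[i - 1]
        effs.append(d_score / d_ai if d_ai != 0 else float("nan"))
    return effs

# hypothetical recognition scores (proportion correct) and AI values
# as the low-pass cutoff frequency is raised
scores = [0.20, 0.28, 0.34, 0.37]
ai = [0.10, 0.20, 0.30, 0.40]
print(efficiency(scores, ai))
```

Positive values indicate that the added audible speech information improved recognition, the pattern the abstract reports for all degrees of hearing loss.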
Abstract:
The role of polarisation in late-time complex-resonance-based target identification is investigated numerically for the case of an L-shaped wire. While repeated extraction of the resonances for varying polarisation allows for better signal-to-noise immunity, it is also found that there are preferred polarisations for each complex resonance. The first few of these polarisations are extracted for the sample target.
Abstract:
This paper presents a corpus-based descriptive analysis of the most prevalent transfer effects and connected speech processes observed in a comparison of 11 Vietnamese English speakers (6 females, 5 males) and 12 Australian English speakers (6 males, 6 females) over 24 grammatical paraphrase items. The phonetic processes are segmentally labelled in terms of IPA diacritic features using the EMU speech database system, with the aim of labelling departures from native-speaker pronunciation. An analysis of prosodic features was made using the ToBI framework. The results show many phonetic and prosodic processes which make non-native speakers' speech distinct from that of native speakers. The corpus-based methodology of analysing foreign accent may have implications for the evaluation of non-native accent, accented speech recognition and computer-assisted pronunciation learning.
Abstract:
The differences in spectral shape resolution abilities among cochlear implant (CI) listeners, and between CI and normal-hearing (NH) listeners, when listening with the same number of channels (12), were investigated. In addition, the effect of the number of channels on spectral shape resolution was examined. The stimuli were rippled noise signals with various ripple frequency-spacings. An adaptive 4IFC procedure was used to determine the threshold for resolvable ripple spacing, which was the spacing at which an interchange in peak and valley positions could be discriminated. The results showed poorer spectral shape resolution in CI compared to NH listeners (average thresholds of approximately 3000 and 400 Hz, respectively), and wide variability among CI listeners (range of approximately 800 to 8000 Hz). There was a significant relationship between spectral shape resolution and vowel recognition. The spectral shape resolution thresholds of NH listeners increased as the number of channels increased from 1 to 16, while the CI listeners showed a performance plateau at 4–6 channels, which is consistent with previous results using speech recognition measures. These results indicate that this test may provide a measure of CI performance which is time efficient and non-linguistic, and therefore, if verified, may provide a useful contribution to the prediction of speech perception in adults and children who use CIs.
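The adaptive threshold tracking used in these ripple-discrimination studies can be illustrated with a simple staircase simulation. This is a minimal sketch under assumed parameters (a generic 2-down/1-up rule, a fixed multiplicative step, and a simulated listener that guesses at 50% above its resolution limit), not the actual 4IFC procedure or step sizes used in the study:

```python
import random

def staircase_threshold(true_threshold, start=0.13, stop=11.31,
                        step_factor=1.5, reversals_needed=8, seed=1):
    """Simulate a 2-down/1-up adaptive track on ripple spacing
    (ripples/octave): harder after two correct, easier after one error.
    The simulated listener responds correctly whenever the spacing is
    below its true resolution limit, and guesses (50%) otherwise."""
    rng = random.Random(seed)
    spacing = start
    correct_streak = 0
    direction = None
    reversal_values = []
    while len(reversal_values) < reversals_needed:
        correct = spacing < true_threshold or rng.random() < 0.5
        if correct:
            correct_streak += 1
            if correct_streak < 2:
                continue                      # need 2 correct to step
            correct_streak = 0
            new_dir = "up"                    # harder: wider spacing
            spacing = min(spacing * step_factor, stop)
        else:
            correct_streak = 0
            new_dir = "down"                  # easier: narrower spacing
            spacing = max(spacing / step_factor, start)
        if direction and new_dir != direction:
            reversal_values.append(spacing)   # track changed direction
        direction = new_dir
    # threshold estimate: mean of the reversal points
    return sum(reversal_values) / len(reversal_values)

print(staircase_threshold(true_threshold=4.0))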
Abstract:
A 77-year-old man with an 8-year progressive language deterioration in the face of grossly intact memory was followed. No acute or chronic physiological or psychological event was associated with symptom onset. CT revealed a small left basal ganglia infarct. Mild atrophy, no lacunar infarcts, and mild diffuse periventricular changes registered on MRI. Gait normal but slow. Speech hesitant and sparse. Affect euthymic; neurobehavioral disturbance absent. MMSE 26/30; clock incorrect, concrete. Neuropsychological testing revealed simple attention intact; complex attention, processing speed impaired. Visuospatial copying and delayed recall of copy average with some perseveration. Apraxia absent. Recall mildly impaired. Mild deficits in planning, organization apparent. Patient severely aphasic, dysarthric without paraphasias. Repetition of automatic speech, recitation moderately impaired; prosody intact. Understanding of written language, nonverbal communication abilities, intact. Frontal release signs developed over last 12 months. Repeated cognitive testing revealed mild deterioration across all domains with significant further decrease in expressive, receptive language. Neurobehavioral changes remain absent to date; he remains interested, engaged and independent in basic ADLs. Speech completely deteriorated; gait and movements appreciably slowed. Although signs of frontal/executive dysfunction are present, the lack of behavioral abnormalities, psychiatric disturbance, or personality change argues against focal or progressive frontal impairment or dementia. Relative intactness of memory and comprehension argues against Alzheimer's disease. Lack of findings on neuroimaging argues against CVA or tumor. It is possible that the small basal ganglia infarct has resulted in a mild lateral prefrontal syndrome. However, the absence of depression as well as the relatively circumscribed language problem suggests otherwise.
The progressive, severe nature of language impairments, with relatively minor impairments in attention and memory, argues for a possible diagnosis of primary progressive aphasia.
Abstract:
These are the full proceedings of the conference.
Abstract:
The nature of the semantic memory deficit in dementia of the Alzheimer's type (DAT) was investigated in a semantic priming task which was designed to assess both automatic and attention-induced priming effects. Ten DAT patients and 10 age-matched control subjects completed a word naming semantic priming task in which both relatedness proportion (RP) and stimulus-onset asynchrony (SOA) were varied. A clear dissociation between automatic and attentional priming effects in both groups was demonstrated; however, the DAT subjects' pattern of priming deviated significantly from that of the normal controls. The DAT patients failed to produce any priming under conditions which encouraged automatic semantic processing and produced facilitation only when the RP was high. In addition, the DAT group produced hyperpriming, with significantly larger facilitation effects than the control group. These results suggest an impairment of automatic spreading activation in DAT and have implications for theories of semantic memory impairment in DAT as well as models of normal priming. (C) 2001 Academic Press.
Abstract:
The impact of basal ganglia dysfunction on semantic processing was investigated by comparing the performance of individuals with nonthalamic subcortical (NS) vascular lesions, Parkinson's disease (PD), cortical lesions, and matched controls on a semantic priming task. Nonequibiased lexical ambiguity primes were used in auditory prime-target pairs comprising 4 critical conditions: dominant related (e.g., bank-money), subordinate related (e.g., bank-river), dominant unrelated (e.g., foot-money) and subordinate unrelated (e.g., bat-river). Participants made speeded lexical decisions (word/nonword) on targets using a go-no-go response. When a short prime-target interstimulus interval (ISI) of 200 ms was employed, all groups demonstrated priming for dominant and subordinate conditions, indicating nonselective meaning facilitation and intact automatic lexical processing. Differences emerged at the long ISI (1250 ms), where control and cortical lesion participants evidenced selective facilitation of the dominant meaning, whereas NS and PD groups demonstrated a protracted period of nonselective meaning facilitation. This finding suggests a circumscribed deficit in the selective attentional engagement of the semantic network on the basis of meaning frequency, possibly implicating a disturbance of frontal-subcortical systems influencing inhibitory semantic mechanisms.
Abstract:
To investigate the effects of dopamine on the dynamics of semantic activation, 39 healthy volunteers were randomly assigned to ingest either a placebo (n = 24) or a levodopa (n = 16) capsule. Participants then performed a lexical decision task that implemented a masked priming paradigm. Direct and indirect semantic priming was measured across stimulus onset asynchronies (SOAs) of 250, 500 and 1200 ms. The results revealed significant direct and indirect semantic priming effects for the placebo group at SOAs of 250 ms and 500 ms, but no significant direct or indirect priming effects at the 1200 ms SOA. In contrast, the levodopa group showed significant direct and indirect semantic priming effects at the 250 ms SOA, while no significant direct or indirect priming effects were evident at the SOAs of 500 ms or 1200 ms. These results suggest that dopamine has a role in modulating both automatic and attentional aspects of semantic activation according to a specific time course. The implications of these results for current theories of dopaminergic modulation of semantic activation are discussed.
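In these priming studies, the priming effect at each SOA is conventionally quantified as the mean RT for unrelated primes minus the mean RT for related primes, with positive values indicating facilitation. A small illustration with hypothetical RTs (the abstracts do not report the raw means):

```python
def priming_effect(rt_unrelated, rt_related):
    """Semantic priming effect in ms: positive when related primes
    speed lexical decisions relative to unrelated primes."""
    return rt_unrelated - rt_related

# hypothetical mean RTs (ms) per SOA, (unrelated, related), showing a
# placebo-like pattern: priming at short SOAs, none at the long SOA
rts = {250: (620, 585), 500: (610, 580), 1200: (600, 598)}
for soa in sorted(rts):
    unrel, rel = rts[soa]
    print(soa, priming_effect(unrel, rel))
```

A group showing priming only at the 250 ms SOA (like the levodopa group described above) would show effects near zero at the two longer SOAs instead.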
Abstract:
The McGurk effect, in which auditory [ba] dubbed onto [ga] lip movements is perceived as da or tha, was employed in a real-time task to investigate auditory-visual speech perception in prelingual infants. Experiments 1A and 1B established the validity of real-time dubbing for producing the effect. In Experiment 2, 4½-month-olds were tested in a habituation-test paradigm, in which an auditory-visual stimulus was presented contingent upon visual fixation of a live face. The experimental group was habituated to a McGurk stimulus (auditory [ba] visual [ga]), and the control group to matching auditory-visual [ba]. Each group was then presented with three auditory-only test trials, [ba], [da], and [ða] (as in then). Visual-fixation durations in test trials showed that the experimental group treated the emergent percept in the McGurk effect, [da] or [ða], as familiar (even though they had not heard these sounds previously) and [ba] as novel. For control group infants [da] and [ða] were no more familiar than [ba]. These results are consistent with infants' perception of the McGurk effect, and support the conclusion that prelinguistic infants integrate auditory and visual speech information. (C) 2004 Wiley Periodicals, Inc.
Abstract:
Children with autistic spectrum disorder (ASD) may have poor audio-visual integration, possibly reflecting dysfunctional 'mirror neuron' systems which have been hypothesised to be at the core of the condition. In the present study, a computer program, utilizing speech synthesizer software and a 'virtual' head (Baldi), delivered speech stimuli for identification in auditory, visual or bimodal conditions. Children with ASD were poorer than controls at recognizing stimuli in the unimodal conditions, but once performance on this measure was controlled for, no group difference was found in the bimodal condition. A group of participants with ASD were also trained to develop their speech-reading ability. Training improved visual accuracy and this also improved the children's ability to utilize visual information in their processing of speech. Overall results were compared to predictions from mathematical models based on integration and non-integration, and were most consistent with the integration model. We conclude that, whilst they are less accurate in recognizing stimuli in the unimodal condition, children with ASD show normal integration of visual and auditory speech stimuli. Given that training in recognition of visual speech was effective, children with ASD may benefit from multi-modal approaches in imitative therapy and language training. (C) 2004 Elsevier Ltd. All rights reserved.
Abstract:
Recognising the laterality of a pictured hand involves making an initial decision and confirming that choice by mentally moving one's own hand to match the picture. This depends on an intact body schema. Because patients with complex regional pain syndrome type 1 (CRPS1) take longer to recognise a hand's laterality when it corresponds to their affected hand, it has been proposed that nociceptive input disrupts the body schema. However, chronic pain is associated with physiological and psychosocial complexities that may also explain the results. In three studies, we investigated whether the effect is simply due to nociceptive input. Study one evaluated the temporal and perceptual characteristics of acute hand pain elicited by intramuscular injection of hypertonic saline into the thenar eminence. In studies two and three, subjects performed a hand laterality recognition task before, during, and after acute experimental hand pain, and experimental elbow pain, respectively. During hand pain and during elbow pain, when the laterality of the pictured hand corresponded to the painful side, there was no effect on response time (RT). That suggests that nociceptive input alone is not sufficient to disrupt the working body schema. In contrast to patients with CRPS1, when the laterality of the pictured hand corresponded to the non-painful hand, RT increased by approximately 380 ms (95% confidence interval 190–590 ms). The results highlight the differences between acute and chronic pain and may reflect a bias in information processing in acute pain toward the affected part.