73 results for Noise-vocoded Speech
in University of Queensland eSpace - Australia
Abstract:
The purpose of this study was to explore the potential advantages, both theoretical and applied, of preserving low-frequency acoustic hearing in cochlear implant patients. Several hypotheses are presented that predict that residual low-frequency acoustic hearing along with electric stimulation for high frequencies will provide an advantage over traditional long-electrode cochlear implants for the recognition of speech in competing backgrounds. A simulation experiment in normal-hearing subjects demonstrated a clear advantage for preserving low-frequency residual acoustic hearing for speech recognition in a background of other talkers, but not in steady noise. Three subjects with an implanted "short-electrode" cochlear implant and preserved low-frequency acoustic hearing were also tested on speech recognition in the same competing backgrounds and compared to a larger group of traditional cochlear implant users. Each of the three short-electrode subjects performed better than any of the traditional long-electrode implant subjects for speech recognition in a background of other talkers, but not in steady noise, in general agreement with the simulation studies. When compared to a subgroup of traditional implant users matched according to speech recognition ability in quiet, the short-electrode patients showed a 9-dB advantage in the multitalker background. These experiments provide strong preliminary support for retaining residual low-frequency acoustic hearing in cochlear implant patients. The results are consistent with the idea that better perception of voice pitch, which can aid in separating voices in a background of other talkers, was responsible for this advantage.
Abstract:
The purpose of the present study was to examine the benefits of providing audible speech to listeners with sensorineural hearing loss when the speech is presented in background noise. Previous studies have shown that when listeners have a severe hearing loss in the higher frequencies, providing audible speech (in a quiet background) to these higher frequencies usually results in no improvement in speech recognition. In the present experiments, speech was presented in a background of multitalker babble to listeners with various severities of hearing loss. The signal was low-pass filtered at numerous cutoff frequencies and speech recognition was measured as additional high-frequency speech information was provided to the hearing-impaired listeners. It was found in all cases, regardless of hearing loss or frequency range, that providing audible speech resulted in an increase in recognition score. The change in recognition as the cutoff frequency was increased, along with the amount of audible speech information in each condition (articulation index), was used to calculate the "efficiency" of providing audible speech. Efficiencies were positive for all degrees of hearing loss. However, the gains in recognition were small, and the maximum score obtained by any listener was low, due to the noise background. An analysis of error patterns showed that due to the limited speech audibility in a noise background, even severely impaired listeners used additional speech audibility in the high frequencies to improve their perception of the "easier" features of speech including voicing.
Abstract:
Spectral peak resolution was investigated in normal hearing (NH), hearing impaired (HI), and cochlear implant (CI) listeners. The task involved discriminating between two rippled noise stimuli in which the frequency positions of the log-spaced peaks and valleys were interchanged. The ripple spacing was varied adaptively from 0.13 to 11.31 ripples/octave, and the minimum ripple spacing at which a reversal in peak and trough positions could be detected was determined as the spectral peak resolution threshold for each listener. Spectral peak resolution was best, on average, in NH listeners, poorest in CI listeners, and intermediate for HI listeners. There was a significant relationship between spectral peak resolution and both vowel and consonant recognition in quiet across the three listener groups. The results indicate that the degree of spectral peak resolution required for accurate vowel and consonant recognition in quiet backgrounds is around 4 ripples/octave, and that spectral peak resolution poorer than around 1–2 ripples/octave may result in highly degraded speech recognition. These results suggest that efforts to improve spectral peak resolution for HI and CI users may lead to improved speech recognition.
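The ripple-inversion stimulus pair described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the sinusoidal envelope shape, the 30 dB peak-to-valley depth, the 100 Hz phase reference, the factor-of-sqrt(2) density grid, and the function name `ripple_envelope_db` are not given in the abstract.

```python
import math

def ripple_envelope_db(freq_hz, ripples_per_octave, inverted=False,
                       depth_db=30.0, ref_hz=100.0):
    """Spectral envelope level (dB) of a log-spaced rippled noise at freq_hz.

    Assumptions (not from the abstract): sinusoidal ripple on a log2
    frequency axis, 30 dB peak-to-valley depth, phase referenced to 100 Hz.
    """
    octaves = math.log2(freq_hz / ref_hz)
    phase = 2.0 * math.pi * ripples_per_octave * octaves
    if inverted:
        phase += math.pi  # half-cycle shift interchanges peaks and valleys
    return 0.5 * depth_db * math.sin(phase)

# The quoted endpoints are consistent with a grid in factor-of-sqrt(2) steps:
# 2**-3 = 0.125 (reported as 0.13) and 2**3.5 = 11.31 ripples/octave.
# The step size itself is an assumption.
densities = [2 ** (k / 2) for k in range(-6, 8)]
```

Under these assumptions the standard and inverted envelopes are mirror images at every frequency, which is why the task becomes impossible once a listener can no longer resolve individual peaks.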
Abstract:
Perceptual voice analysis is a subjective process. However, despite reports of varying degrees of intrajudge and interjudge reliability, it is widely used in clinical voice evaluation. One of the ways to improve the reliability of this procedure is to provide judges with signals as external standards so that comparison can be made in relation to these anchor signals. The present study used a Klatt speech synthesizer to create a set of speech signals with varying degrees of three different voice qualities based on a Cantonese sentence. The primary objective of the study was to determine, by means of a perceptual study, whether different abnormal voice qualities could be synthesized using the built-in synthesis parameters. The second objective was to determine the relationship between acoustic characteristics of the synthesized signals and perceptual judgment. Twenty Cantonese-speaking speech pathologists with at least three years of clinical experience in perceptual voice evaluation were asked to undertake two tasks. The first was to decide whether the voice quality of the synthesized signals was normal or not. The second was to decide whether the abnormal signals should be described as rough, breathy, or vocal fry. The results showed that signals generated with a small degree of aspiration noise were perceived as breathiness while signals with a small degree of flutter or double pulsing were perceived as roughness. When the flutter or double pulsing increased further, tremor and vocal fry, rather than roughness, were perceived. Furthermore, the amount of aspiration noise, flutter, or double pulsing required for male voice stimuli was different from that required for the female voice stimuli with a similar level of perceptual breathiness and roughness. These findings showed that changes in perceived vocal quality could be achieved by systematic modifications of synthesis parameters. This opens up the possibility of using synthesized voice signals as external standards or anchors to improve the reliability of clinical perceptual voice evaluation. (C) 2002 Acoustical Society of America.
Abstract:
The present study compared the ability of school-aged children with and without a history of otitis media (OM) to understand everyday speech in noise using the University of Queensland Understanding of Everyday Speech Test (UQUEST). Participants were 484 children (246 boys, 238 girls) attending Grade 3 (272, mean age = 8.25 yr., SD = 0.43) and Grade 4 (212, mean age = 9.28 yr., SD = 0.41) at 19 primary schools in the Brisbane metropolitan area and on the Sunshine Coast. Children selected for inclusion were native speakers of English with normal hearing on the day of testing and had no reported physical or behavioral impairments. The children were divided into three groups according to the number of episodes of OM since birth. The results showed no significant differences in speech scores across the participant groups. However, a significant difference in mean speech scores was found across the grades and the noise conditions. Although children with a history of OM performed equally well at a group level when compared to the controls, they exhibited a large range of abilities in speech comprehension within the same group.
Abstract:
The primary objective of this study was to assess the lingual kinematic strategies used by younger and older adults to increase rate of speech. It was hypothesised that the strategies used by the older adults would differ from the young adults either as a direct result of, or in response to a need to compensate for, age-related changes in the tongue. Electromagnetic articulography was used to examine the tongue movements of eight young (M = 26.7 years) and eight older (M = 67.1 years) females during repetitions of /ta/ and /ka/ at a controlled moderate rate and then as fast as possible. The younger and older adults were found to significantly reduce consonant durations and increase syllable repetition rate by similar proportions. To achieve these reduced durations both groups appeared to use the same strategy, that of reducing the distances travelled by the tongue. Further comparisons at each rate, however, suggested a speed-accuracy trade-off and increased speech monitoring in the older adults. The results may assist in differentiating articulatory changes associated with normal aging from pathological changes found in disorders that affect the older population.
Abstract:
The present study examined effects of ear asymmetry, handedness, and gender on distortion-product otoacoustic emissions (DPOAEs) obtained from schoolchildren. A total of 1003 children (528 boys and 475 girls), with a mean age of 6.2 years (SD = 0.4, range = 5.2-7.9 years), were tested in a quiet room at their schools using the GSI-60 DPOAE system. A distortion-product (DP)-gram was obtained for each ear, with f2 varying from 1.1 to 6.0 kHz and the ratio of f2/f1 at 1.21. The signal-to-noise ratios (SNRs) (DPOAE amplitude minus the mean noise floor) at the tested frequencies 1.1, 1.5, 1.9, 2.4, 3.0, 3.8, 4.8, and 6.0 kHz were measured. The results revealed a small but significant difference in SNR between ears, with right ears showing a higher mean SNR than left ears at 1.9, 3.0, 3.8, and 6.0 kHz. At these frequencies, the difference in mean SNR between ears was less than 1 dB. A significant gender effect was also found. Girls exhibited a higher SNR than boys at 3.8, 4.8, and 6.0 kHz. The difference in mean SNR, as a result of the gender effect, was about 1 to 2 dB at these frequencies. There was no significant difference in mean SNR between left-handed and right-handed children for all tested frequencies.
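The stimulus and SNR arithmetic in this abstract can be illustrated with a short sketch. The f2 values, the f2/f1 ratio of 1.21, and the SNR definition come from the abstract. The assumption that the emission is read at 2*f1 - f2 is not stated there, and the helper names `primaries_and_dp` and `snr_db` are hypothetical.

```python
F2_F1_RATIO = 1.21  # from the abstract
F2_KHZ = [1.1, 1.5, 1.9, 2.4, 3.0, 3.8, 4.8, 6.0]  # from the abstract

def primaries_and_dp(f2_khz, ratio=F2_F1_RATIO):
    """Return (f1, f2, f_dp) in kHz.

    Assumption: the distortion product is measured at 2*f1 - f2;
    the abstract does not name the distortion product used.
    """
    f1 = f2_khz / ratio
    return f1, f2_khz, 2.0 * f1 - f2_khz

def snr_db(dpoae_amplitude_db, mean_noise_floor_db):
    """SNR as defined in the abstract: DPOAE amplitude minus mean noise floor."""
    return dpoae_amplitude_db - mean_noise_floor_db

# List the primary and distortion-product frequencies for each tested f2.
for f2 in F2_KHZ:
    f1, _, f_dp = primaries_and_dp(f2)
    print(f"f2 = {f2:.1f} kHz  f1 = {f1:.2f} kHz  2f1-f2 = {f_dp:.2f} kHz")
```

Because the SNR is a difference of two levels in dB, the ear and gender effects reported here (under 1 dB and about 1 to 2 dB) are small shifts on that difference scale.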
Abstract:
This study examined the test performance of distortion product otoacoustic emissions (DPOAEs) when used as a screening tool in the school setting. A total of 1003 children (mean age 6.2 years, SD = 0.4) were tested with pure-tone screening, tympanometry, and DPOAE assessment. Optimal DPOAE test performance was determined in comparison with pure-tone screening results using clinical decision analysis. The results showed hit rates of 0.86, 0.89, and 0.90, and false alarm rates of 0.52, 0.19, and 0.22 for criterion signal-to-noise ratio (SNR) values of 4, 5, and 11 dB at 1.1, 1.9, and 3.8 kHz respectively. DPOAE test performance was compromised at 1.1 kHz. In view of the different test performance characteristics across the frequencies, the use of a fixed SNR as a pass criterion for all frequencies in DPOAE assessments is not recommended. When compared to pure tone plus tympanometry results, the DPOAEs showed deterioration in test performance, suggesting that the use of DPOAEs alone might miss children with subtle middle ear dysfunction. However, when the results of a test protocol, which incorporates both DPOAEs and tympanometry, were used in comparison with the gold standard of pure-tone screening plus tympanometry, test performance was enhanced. In view of its high performance, the use of a protocol that includes both DPOAEs and tympanometry holds promise as a useful tool in the hearing screening of schoolchildren, including difficult-to-test children.
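The hit and false-alarm rates reported in this abstract follow from a comparison of DPOAE outcomes against a reference test. A minimal sketch of that calculation is given below. It assumes that an ear fails the DPOAE screen when its SNR is below the criterion and that the ear is the unit of analysis. The function names and the example data used to exercise them are hypothetical and are not taken from the study.

```python
def dpoae_fails(snr_db, criterion_db):
    """Assumed pass rule: an ear fails when its SNR is below the criterion."""
    return snr_db < criterion_db

def screening_rates(dpoae_fail, reference_fail):
    """Hit and false-alarm rates of a screening test against a reference test.

    Both arguments are equal-length sequences of booleans, one per ear;
    True means the ear failed that test.
    Hit rate = screen fails among reference fails.
    False-alarm rate = screen fails among reference passes.
    """
    pairs = list(zip(dpoae_fail, reference_fail))
    hits = sum(1 for d, r in pairs if d and r)
    false_alarms = sum(1 for d, r in pairs if d and not r)
    n_ref_fail = sum(1 for _, r in pairs if r)
    n_ref_pass = len(pairs) - n_ref_fail
    if n_ref_fail == 0 or n_ref_pass == 0:
        raise ValueError("need at least one reference fail and one reference pass")
    return hits / n_ref_fail, false_alarms / n_ref_pass
```

Repeating this calculation over a range of criterion values at each frequency is one way to see why a single fixed SNR criterion can perform very differently at 1.1 kHz than at 3.8 kHz.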
Abstract:
The differences in spectral shape resolution abilities among cochlear implant (CI) listeners, and between CI and normal-hearing (NH) listeners, when listening with the same number of channels (12), were investigated. In addition, the effect of the number of channels on spectral shape resolution was examined. The stimuli were rippled noise signals with various ripple frequency-spacings. An adaptive 4IFC procedure was used to determine the threshold for resolvable ripple spacing, which was the spacing at which an interchange in peak and valley positions could be discriminated. The results showed poorer spectral shape resolution in CI compared to NH listeners (average thresholds of approximately 3000 and 400 Hz, respectively), and wide variability among CI listeners (range of approximately 800 to 8000 Hz). There was a significant relationship between spectral shape resolution and vowel recognition. The spectral shape resolution thresholds of NH listeners increased as the number of channels increased from 1 to 16, while the CI listeners showed a performance plateau at 4–6 channels, which is consistent with previous results using speech recognition measures. These results indicate that this test may provide a measure of CI performance which is time efficient and non-linguistic, and therefore, if verified, may provide a useful contribution to the prediction of speech perception in adults and children who use CIs.
Abstract:
We analyze the quantum dynamics of radiation propagating in a single-mode optical fiber with dispersion, nonlinearity, and Raman coupling to thermal phonons. We start from a fundamental Hamiltonian that includes the principal known nonlinear effects and quantum-noise sources, including linear gain and loss. Both Markovian and frequency-dependent, non-Markovian reservoirs are treated. This treatment allows quantum Langevin equations, which have a classical form except for additional quantum-noise terms, to be calculated. In practical calculations, it is more useful to transform to Wigner or +P quasi-probability operator representations. These transformations result in stochastic equations that can be analyzed by use of perturbation theory or exact numerical techniques. The results have applications to fiber-optics communications, networking, and sensor technology.
Abstract:
The purpose of this paper is to provide a cross-linguistic survey of the variation of coding strategies that are available for the grammatical distinction between direct and indirect speech representation with a particular focus on the expression of indirect reported speech. Cross-linguistic data from a sample of 42 languages will be provided to illustrate the range of available grammatical coding strategies.
Abstract:
We review the description of noise in electronic circuits in terms of electron transport. The Poisson process is used as a unifying principle. In recent years, much attention has been given to current noise in light-emitting diodes and laser diodes. In these devices, random events associated with electron transport are correlated with photon emission times, thus modifying both the current statistics and the statistics of the emitted light. We give a review of experiments in this area with special emphasis on the ability of such devices to produce subshot-noise currents and light beams. Finally we consider the noise properties of a class of mesoscopic devices based on the quantum tunnelling of an electron into and out of a bound state. We present a simple quantum model of this process which confirms that the current noise in such a device should be subshot-noise.