48 resultados para Non-thresholding speech noise reduction
em CentAUR: Central Archive University of Reading - UK
Resumo:
Interference by siren background-noise with speech transmitted from radio equipment (3) of an emergency-service vehicle, is reduced by apparatus (1) that subtracts (43) an estimate nk of the correlated siren-noise component from the contaminated signal yk supplied by the cab-microphone (2). The estimate nk is computed by FIR (finite impulse response) filtering of a siren-reference signal xk supplied by a unit (4) from one or more microphones located on or near the siren, or from the electric waveform driving the siren. The filter-coefficients wk are adjusted according to an LMS (least mean square) adaptive algorithm that is applied to the correlated-noise component ek of the fed-back noise-reduced signal, so as to bring about iterative cancellation with close frequency-tracking, of the siren noise.
Resumo:
Objective. Functional near-infrared spectroscopy (fNIRS) is an emerging technique for the in vivo assessment of functional activity of the cerebral cortex as well as in the field of brain–computer interface (BCI) research. A common challenge for the utilization of fNIRS in these areas is a stable and reliable investigation of the spatio-temporal hemodynamic patterns. However, the recorded patterns may be influenced and superimposed by signals generated from physiological processes, resulting in an inaccurate estimation of the cortical activity. Up to now only a few studies have investigated these influences, and still less has been attempted to remove/reduce these influences. The present study aims to gain insights into the reduction of physiological rhythms in hemodynamic signals (oxygenated hemoglobin (oxy-Hb), deoxygenated hemoglobin (deoxy-Hb)). Approach. We introduce the use of three different signal processing approaches (spatial filtering, a common average reference (CAR) method; independent component analysis (ICA); and transfer function (TF) models) to reduce the influence of respiratory and blood pressure (BP) rhythms on the hemodynamic responses. Main results. All approaches produce large reductions in BP and respiration influences on the oxy-Hb signals and, therefore, improve the contrast-to-noise ratio (CNR). In contrast, for deoxy-Hb signals CAR and ICA did not improve the CNR. However, for the TF approach, a CNR-improvement in deoxy-Hb can also be found. Significance. The present study investigates the application of different signal processing approaches to reduce the influences of physiological rhythms on the hemodynamic responses. In addition to the identification of the best signal processing method, we also show the importance of noise reduction in fNIRS data.
Resumo:
Alterations of existing neural networks during healthy aging, resulting in behavioral deficits and changes in brain activity, have been described for cognitive, motor, and sensory functions. To investigate age-related changes in the neural circuitry underlying overt non-lexical speech production, functional MRI was performed in 14 healthy younger (21–32 years) and 14 healthy older individuals (62–84 years). The experimental task involved the acoustically cued overt production of the vowel /a/ and the polysyllabic utterance /pataka/. In younger and older individuals, overt speech production was associated with the activation of a widespread articulo-phonological network, including the primary motor cortex, the supplementary motor area, the cingulate motor areas, and the posterior superior temporal cortex, similar in the /a/ and /pataka/ condition. An analysis of variance with the factors age and condition revealed a significant main effect of age. Irrespective of the experimental condition, significantly greater activation was found in the bilateral posterior superior temporal cortex, the posterior temporal plane, and the transverse temporal gyri in younger compared to older individuals. Significantly greater activation was found in the bilateral middle temporal gyri, medial frontal gyri, middle frontal gyri, and inferior frontal gyri in older vs. younger individuals. The analysis of variance did not reveal a significant main effect of condition and no significant interaction of age and condition. These results suggest a complex reorganization of neural networks dedicated to the production of speech during healthy aging.
Resumo:
The ever increasing demand for high image quality requires fast and efficient methods for noise reduction. The best-known order-statistics filter is the median filter. A method is presented to calculate the median on a set of N W-bit integers in W/B time steps. Blocks containing B-bit slices are used to find B-bits of the median; using a novel quantum-like representation allowing the median to be computed in an accelerated manner compared to the best-known method (W time steps). The general method allows a variety of designs to be synthesised systematically. A further novel architecture to calculate the median for a moving set of N integers is also discussed.
Resumo:
The core processing step of the noise reduction median filter technique is to find the median within a window of integers. A four-step procedure method to compute the running median of the last N W-bit stream of integers showing area and time benefits is proposed. The method slices integers into groups of B-bit using a pipeline of W/B blocks. From the method, an architecture is developed giving a designer the flexibility to exchange area gains for faster frequency of operation, or vice versa, by adjusting N, W and B parameter values. Gains in area of around 40%, or in frequency of operation of around 20%, are clearly observed by FPGA circuit implementations compared to latest methods in the literature.
Resumo:
For Wiener spaces conditional expectations and $L^{2}$-martingales w.r.t. the natural filtration have a natural representation in terms of chaos expansion. In this note an extension to larger classes of processes is discussed. In particular, it is pointed out that orthogonality of the chaos expansion is not required.
Resumo:
The authors examined whether background noise can be habituated to in the laboratory by using memory for prose tasks in 3 experiments. Experiment 1 showed that background speech can be habituated to after 20 min exposure and that meaning and repetition had no effect on the degree of habituation seen. Experiment 2 showed that office noise without speech can also be habituated to. Finally, Experiment 3 showed that a 5-min period of quiet, but not a change in voice, was sufficient to partially restore the disruptive effects of the background noise previously habituated to. These results are interpreted in light of current theories regarding the effects of background noise and habituation; practical implications for office planning are discussed.
Resumo:
Perceptual effects of room reverberation on a "sir" or "stir" test-word can be observed when the level of reverberation in the word is increased, while the reverberation in a surrounding 'context I utterance remains at a minimal level. The result is that listeners make more "sit" identifications. When the context's reverberation is also increased, to approach the level in the test word, extrinsic perceptual compensation is observed, so that the number of listeners' "sir" identifications reduces to a value similar to that found with minimal reverberation. Thus far, compensation effects have only been observed with speech or speech-like contexts in which the short-term spectrum changes as the speaker's articulators move. The results reported here show that some noise contexts with static short-term spectra can also give rise to compensation. From these experiments it would appear that compensation requires a context with a temporal envelope that fluctuates to some extent, so that parts of it resemble offsets. These findings are consistent with a rather general kind of perceptual compensation mechanism; one that is informed by the 'tails' that reverberation adds at offsets. Other results reported here show that narrow-band contexts do not bring about compensation, even when their temporal-envelopes are the same as those of the more effective wideband contexts. These results suggest that compensation is confined to the frequency range occupied by the context, and that in a wideband sound it might operate in a 'band by band' manner.
Resumo:
This paper presents a controller design scheme for a priori unknown non-linear dynamical processes that are identified via an operating point neurofuzzy system from process data. Based on a neurofuzzy design and model construction algorithm (NeuDec) for a non-linear dynamical process, a neurofuzzy state-space model of controllable form is initially constructed. The control scheme based on closed-loop pole assignment is then utilized to ensure the time invariance and linearization of the state equations so that the system stability can be guaranteed under some mild assumptions, even in the presence of modelling error. The proposed approach requires a known state vector for the application of pole assignment state feedback. For this purpose, a generalized Kalman filtering algorithm with coloured noise is developed on the basis of the neurofuzzy state-space model to obtain an optimal state vector estimation. The derived controller is applied in typical output tracking problems by minimizing the tracking error. Simulation examples are included to demonstrate the operation and effectiveness of the new approach.
Resumo:
Sirens used by police, fire and paramedic vehicles have been designed so that they can be heard over large distances, but unfortunately the siren noise enters the vehicle and corrupts intelligibility of voice communications from the emergency vehicle to the control room. Often the siren needs to be turned off to enable the control room to hear what is being said. This paper discusses a siren noise filter system that is capable of removing the siren noise picked up by the two-way radio microphone inside the vehicle. The removal of the siren noise improves the response time for emergency vehicles and thus save lives. To date, the system has been trialed within a fire tender in a non-emergency situation, with good results.
Resumo:
When speech is in competition with interfering sources in rooms, monaural indicators of intelligibility fail to take account of the listener’s abilities to separate target speech from interfering sounds using the binaural system. In order to incorporate these segregation abilities and their susceptibility to reverberation, Lavandier and Culling [J. Acoust. Soc. Am. 127, 387–399 (2010)] proposed a model which combines effects of better-ear listening and binaural unmasking. A computationally efficient version of this model is evaluated here under more realistic conditions that include head shadow, multiple stationary noise sources, and real-room acoustics. Three experiments are presented in which speech reception thresholds were measured in the presence of one to three interferers using real-room listening over headphones, simulated by convolving anechoic stimuli with binaural room impulse-responses measured with dummy-head transducers in five rooms. Without fitting any parameter of the model, there was close correspondence between measured and predicted differences in threshold across all tested conditions. The model’s components of better-ear listening and binaural unmasking were validated both in isolation and in combination. The computational efficiency of this prediction method allows the generation of complex “intelligibility maps” from room designs. © 2012 Acoustical Society of America
Resumo:
Background and aims: In addition to the well-known linguistic processing impairments in aphasia, oro-motor skills and articulatory implementation of speech segments are reported to be compromised to some degree in most types of aphasia. This study aimed to identify differences in the characteristics and coordination of lip movements in the production of a bilabial closure gesture between speech-like and nonspeech tasks in individuals with aphasia and healthy control subjects. Method and procedure: Upper and lower lip movement data were collected for a speech-like and a nonspeech task using an AG 100 EMMA system from five individuals with aphasia and five age and gender matched control subjects. Each task was produced at two rate conditions (normal and fast), and in a familiar and a less-familiar manner. Single articulator kinematic parameters (peak velocity, amplitude, duration, and cyclic spatio-temporal index) and multi-articulator coordination indices (average relative phase and variability of relative phase) were measured to characterize lip movements. Outcome and results: The results showed that when the two lips had similar task goals (bilabial closure) in speech-like versus nonspeech task, kinematic and coordination characteristics were not found to be different. However, when changes in rate were imposed on the bilabial gesture, only speech-like task showed functional adaptations, indicated by a greater decrease in amplitude and duration at fast rates. In terms of group differences, individuals with aphasia showed smaller amplitudes and longer movement durations for upper lip, higher spatio-temporal variability for both lips, and higher variability in lip coordination than the control speakers. Rate was an important factor in distinguishing the two groups, and individuals with aphasia were limited in implementing the rate changes. Conclusion and implications: The findings support the notion of subtle but robust differences in motor control characteristics between individuals with aphasia and the control participants, even in the context of producing bilabial closing gestures for a relatively simple speech-like task. The findings also highlight the functional differences between speech-like and nonspeech tasks, despite a common movement coordination goal for bilabial closure.