76 results for Clipping Noise
Abstract:
Temporal dynamics and speaker characteristics are two important features of speech that distinguish speech from noise. In this paper, we propose a method to maximally extract these two features of speech for speech enhancement. We demonstrate that this can reduce the requirement for prior information about the noise, which can be difficult to estimate for fast-varying noise. Given noisy speech, the new approach estimates clean speech by recognizing long segments of the clean speech as whole units. In the recognition, clean speech sentences, taken from a speech corpus, are used as examples. Matching segments are identified between the noisy sentence and the corpus sentences. The estimate is formed by using the longest matching segments found in the corpus sentences. Longer speech segments as whole units contain more distinct dynamics and richer speaker characteristics, and can be identified more accurately from noise than shorter speech segments. Therefore, estimation based on the longest recognized segments increases the noise immunity and hence the estimation accuracy. The new approach consists of a statistical model to represent up to sentence-long temporal dynamics in the corpus speech, and an algorithm to identify the longest matching segments between the noisy sentence and the corpus sentences. The algorithm is made more robust to noise uncertainty by introducing missing-feature based noise compensation into the corpus sentences. Experiments have been conducted on the TIMIT database for speech enhancement from various types of nonstationary noise including song, music, and crosstalk speech. The new approach has shown improved performance over conventional enhancement algorithms in both objective and subjective evaluations.
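The idea of locating the longest matching segment between a noisy sentence and a corpus sentence can be illustrated with a toy sketch. This is not the statistical model or the noise-compensated algorithm described in the abstract: it assumes each sentence is a list of feature frames, and it treats two frames as "matching" when their Euclidean distance is below a hypothetical `threshold`. The function names are illustrative only.

```python
from typing import List, Sequence, Tuple


def frame_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature frames of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def longest_matching_segment(
    noisy: List[Sequence[float]],
    corpus_sentence: List[Sequence[float]],
    threshold: float,
) -> Tuple[int, int, int]:
    """Return (length, start_in_noisy, start_in_corpus) of the longest run of
    consecutive frame pairs whose distance stays below `threshold`.

    Assumption: a fixed distance threshold stands in for the probabilistic
    matching used in the actual approach.
    """
    best = (0, 0, 0)
    # prev[j] holds the length of the matching run ending at
    # noisy[i - 2] and corpus_sentence[j - 1] (the previous row).
    prev = [0] * (len(corpus_sentence) + 1)
    for i in range(1, len(noisy) + 1):
        cur = [0] * (len(corpus_sentence) + 1)
        for j in range(1, len(corpus_sentence) + 1):
            if frame_distance(noisy[i - 1], corpus_sentence[j - 1]) < threshold:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[0]:
                    best = (cur[j], i - cur[j], j - cur[j])
        prev = cur
    return best


# Toy one-dimensional "features"; real systems would use spectral frames.
noisy_frames = [[0.0], [1.0], [2.0], [3.0], [9.0]]
corpus_frames = [[5.0], [1.1], [2.1], [2.9], [7.0]]
match = longest_matching_segment(noisy_frames, corpus_frames, threshold=0.5)
```

In this sketch the estimate of clean speech would be read off the corpus frames covered by the returned segment; searching several corpus sentences amounts to calling the function once per sentence and keeping the longest result.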
Abstract:
Environmental Psychology has typically considered noise as pollution and focused upon its negative impact. However, recent research in psychology and anthropology indicates that whether noise is experienced as aversive depends upon the meanings attributed to it. Moreover, such meanings seem to be dependent on the social context. Here we extend this research through studying the aural experience of a religious festival in North India which is characterised by loud, continuous and cacophonous noise. Reporting an experiment and semi-structured interviews, we show that loud noise is experienced as pleasant or unpleasant according to the meanings attributed to it. Specifically, the experiment shows the same noise is experienced more positively (and listened to longer) when attributed to the festival rather than to a non-festival source. In turn, the qualitative data show that within the Mela, noises judged as having a religious quality are reported as more positive than noises that are not. Moreover, the qualitative data suggest a key factor in the evaluation of noise is our participants’ social identities as pilgrims. This identity provides a framework for interpreting the auditory environment, and noises judged as intruding into their religious experience were judged negatively, whereas noises judged as contributing to their religious experience were judged more positively. Our findings therefore point to the ways in which our social identities are implicated in the process of attributing meaning to the auditory environment.
Abstract:
Before a natural sound can be recognized, an auditory signature of its source must be learned through experience. Here we used random waveforms to probe the formation of new memories for arbitrary complex sounds. A behavioral measure was designed, based on the detection of repetitions embedded in noises up to 4 s long. Unbeknownst to listeners, some noise samples reoccurred randomly throughout an experimental block. Results showed that repeated exposure induced learning for otherwise totally unpredictable and meaningless sounds. The learning was unsupervised and resilient to interference from other task-relevant noises. When memories were formed, they emerged rapidly, performance became abruptly near-perfect, and multiple noises were remembered for several weeks. The acoustic transformations to which recall was tolerant suggest that the learned features were local in time. We propose that rapid sensory plasticity could explain how the auditory brain creates useful memories from the ever-changing, but sometimes repeating, acoustical world. © 2010 Elsevier Inc.
Abstract:
Three experiments measured the effects of age on informational masking of speech by competing speech. The experiments were designed to minimize the energetic contributions of the competing speech so that informational masking could be measured with no large corrections for energetic masking. Experiment 1 used a "speech-in-speech-in-noise" design, in which the competing speech was presented in noise at a signal-to-noise ratio (SNR) of -4 dB. This ensured that the noise primarily contributed the energetic masking but the competing speech contributed the informational masking. Equal amounts of informational masking (3 dB) were observed for young and elderly listeners, although less was found for hearing-impaired listeners. Experiment 2 tested a range of SNRs in this design and showed that informational masking increased with SNR up to about an SNR of -4 dB, but decreased thereafter. Experiment 3 further reduced the energetic contribution of the competing speech by filtering it into different frequency bands from the target speech. The elderly listeners again showed approximately the same amount of informational masking (4-5 dB), although some elderly listeners had particular difficulty understanding these stimuli in any condition. On the whole, these results suggest that young and elderly listeners were equally susceptible to informational masking. © 2009 Acoustical Society of America.
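Presenting one sound "in noise at an SNR of -4 dB" comes down to scaling the noise so that the ratio of mean powers hits the target. The sketch below shows that arithmetic only; it assumes both inputs are equal-length lists of samples, and the helper names (`mean_power`, `mix_at_snr`) and the synthetic signals are illustrative, not taken from the study.

```python
import math
import random
from typing import List, Tuple


def mean_power(x: List[float]) -> float:
    """Mean squared amplitude of a sample sequence."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(
    signal: List[float], noise: List[float], snr_db: float
) -> Tuple[List[float], float]:
    """Scale `noise` so that 10*log10(P_signal / P_noise) equals `snr_db`,
    then add it to `signal`. Returns the mixture and the noise gain used."""
    target_ratio = 10 ** (snr_db / 10)
    gain = math.sqrt(mean_power(signal) / (mean_power(noise) * target_ratio))
    mixture = [s + gain * n for s, n in zip(signal, noise)]
    return mixture, gain


rng = random.Random(0)
# Stand-ins for the competing speech and the masking noise.
competing = [math.sin(0.05 * t) for t in range(2000)]
masker = [rng.gauss(0.0, 1.0) for _ in range(2000)]

mixture, gain = mix_at_snr(competing, masker, snr_db=-4.0)
achieved_snr = 10 * math.log10(
    mean_power(competing) / mean_power([gain * n for n in masker])
)
```

A negative SNR means the noise carries more power than the signal, which is why, at -4 dB, the noise rather than the competing speech dominates the energetic masking.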
Abstract:
In noise repetition-detection tasks, listeners have to distinguish trials of continuously running noise from trials in which noise tokens are repeated in a cyclic manner. Recently, it has been shown that using the exact same noise token across several trials (“reference noise”) facilitates the detection of repetitions for this token [Agus et al. (2010). Neuron 66, 610–618]. This was attributed to perceptual learning. Here, the nature of the learning was investigated. In experiment 1, reference noise tokens were embedded in trials with or without cyclic presentation. Naïve listeners reported repetitions in both cases, thus responding to the reference noise even in the absence of an actual repetition. Experiment 2, with the same listeners, showed a similar pattern of results even after the design of the experiment was made explicit, ruling out a misunderstanding of the task. Finally, in experiment 3, listeners reported repetitions in trials containing the reference noise, even before ever hearing it presented cyclically. The results show that listeners were able to learn and recognize noise tokens in the absence of an immediate repetition. Moreover, the learning mandatorily interfered with listeners' ability to detect repetitions. It is concluded that salient perceptual changes accompany the learning of noise.
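The two trial types in a noise repetition-detection task can be sketched as follows. This is a minimal illustration, assuming Gaussian noise samples and a hypothetical token length of 500 samples; the actual stimulus parameters are not given in the abstract.

```python
import random
from typing import List


def continuous_trial(n_samples: int, rng: random.Random) -> List[float]:
    """Continuously running noise: every sample is drawn afresh."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def cyclic_trial(token: List[float], n_cycles: int) -> List[float]:
    """Cyclic noise: the same token repeated back to back."""
    return token * n_cycles


def is_cyclic(trial: List[float], period: int) -> bool:
    """True if the trial repeats exactly with the given period (in samples)."""
    return all(trial[i] == trial[i - period] for i in range(period, len(trial)))


rng = random.Random(0)
# A "reference noise" is one fixed token reused across several trials.
reference_token = continuous_trial(500, rng)
repeated = cyclic_trial(reference_token, n_cycles=3)
running = continuous_trial(1500, rng)
```

Reusing `reference_token` across trials, with or without cyclic presentation, corresponds to the manipulation described for experiment 1.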
Abstract:
Performance at the International Computer Music Conference, University of Huddersfield (with Eric Lyon, Franziska Schroder & Steve Davis).
Abstract:
Within Ireland, interest in strategically supporting young people’s participation in the arts has increased. Additionally, awareness of the Internet’s potential for promoting engagement with the arts has grown. Addressing national directives and local needs assessments, South Dublin County Council’s Arts Office initiated NOISE South Dublin (http://www.noisesouthdublin.com), an interactive Web site based on Australia Council’s NOISE project (http://www.noise.net), to promote the creative development of young people in the county. This article presents the practical challenges and potential of youth arts Web-based programs for harnessing the creative engagement of youth. It concludes that the Internet is only useful if it expands online engagement offline.
Abstract:
To obtain cm/s precision, stellar surface magneto-convection must be disentangled from observed radial velocities (RVs). In order to understand and remove the convective signature, we create Sun-as-a-star model observations based on a 3D magnetohydrodynamic solar simulation. From these Sun-as-a-star model observations, we find several line characteristics are correlated with the induced RV shifts. The aim of this campaign is to feed directly into future high precision RV studies, such as the search for habitable, rocky worlds, with forthcoming spectrographs such as ESPRESSO.
Abstract:
The implementation of a dipole antenna co-designed and monolithically integrated with a low noise amplifier (LNA) on a low-resistivity Si substrate (20 Ω·cm), manufactured in a commercial 0.35 μm SiGe HBT process with f_T/f_max of 170 GHz and 250 GHz, is investigated theoretically and experimentally. An air gap is introduced between the chip and a reflective ground plane, leading to substantial improvements in efficiency and gain. Moreover, conjugate matching conditions between the antenna and the LNA are exploited, enhancing power transfer between them without any additional matching circuit. A prototype is fabricated and tested to validate the performance. The measured 10-dB gain of the standalone LNA is centered at 58 GHz with a die size of 0.7 mm x 0.6 mm including all pads. The simulated results showed antenna directivity of 5.1 dBi with efficiency higher than 70%. After optimization, the co-designed LNA-Antenna chip with a die size of 3 mm x 2.8 mm was characterized in an anechoic chamber. A maximum gain higher than 12 dB was obtained.
Abstract:
This paper presents a new approach to speech enhancement from single-channel measurements involving both noise and channel distortion (i.e., convolutional noise), and demonstrates its applications for robust speech recognition and for improving noisy speech quality. The approach is based on finding longest matching segments (LMS) from a corpus of clean, wideband speech. The approach adds three novel developments to our previous LMS research. First, we address the problem of channel distortion as well as additive noise. Second, we present an improved method for modeling noise for speech estimation. Third, we present an iterative algorithm which updates the noise and channel estimates of the corpus data model. In experiments using speech recognition as a test with the Aurora 4 database, the use of our enhancement approach as a preprocessor for feature extraction significantly improved the performance of a baseline recognition system. In another comparison against conventional enhancement algorithms, both the PESQ and the segmental SNR ratings of the LMS algorithm were superior to the other methods for noisy speech enhancement.
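Segmental SNR, one of the two objective ratings mentioned, averages a per-frame SNR between the clean reference and the enhanced output. Definitions vary between papers; the sketch below assumes non-overlapping frames of 256 samples and the commonly used clamping of each frame's SNR to the range [-10, 35] dB. These settings are assumptions, not values taken from the abstract.

```python
import math
from typing import List


def segmental_snr(
    clean: List[float],
    enhanced: List[float],
    frame_len: int = 256,    # assumed frame length in samples
    floor_db: float = -10.0,  # assumed lower clamp
    ceil_db: float = 35.0,    # assumed upper clamp
) -> float:
    """Mean of per-frame SNRs (in dB) between a clean reference and an estimate."""
    eps = 1e-12  # avoids log(0) and division by zero on silent or perfect frames
    frame_snrs = []
    for start in range(0, min(len(clean), len(enhanced)) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        est = enhanced[start:start + frame_len]
        signal_energy = sum(v * v for v in ref)
        error_energy = sum((r - e) ** 2 for r, e in zip(ref, est))
        snr_db = 10 * math.log10((signal_energy + eps) / (error_energy + eps))
        frame_snrs.append(min(max(snr_db, floor_db), ceil_db))
    if not frame_snrs:
        raise ValueError("signals are shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


clean = [math.sin(0.1 * t) for t in range(1024)]
# An estimate whose error is 10% of the clean amplitude gives 20 dB per frame.
estimate = [1.1 * v for v in clean]
score = segmental_snr(clean, estimate)
```

Clamping keeps silent frames and near-perfect frames from dominating the average, which is the usual motivation for preferring segmental SNR over a single global SNR.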
Abstract:
This paper presents a new approach to single-channel speech enhancement involving both noise and channel distortion (i.e., convolutional noise). The approach is based on finding longest matching segments (LMS) from a corpus of clean, wideband speech. The approach adds three novel developments to our previous LMS research. First, we address the problem of channel distortion as well as additive noise. Second, we present an improved method for modeling noise. Third, we present an iterative algorithm for improved speech estimates. In experiments using speech recognition as a test with the Aurora 4 database, the use of our enhancement approach as a preprocessor for feature extraction significantly improved the performance of a baseline recognition system. In another comparison against conventional enhancement algorithms, both the PESQ and the segmental SNR ratings of the LMS algorithm were superior to the other methods for noisy speech enhancement. Index Terms: corpus-based speech model, longest matching segment, speech enhancement, speech recognition
Abstract:
It is shown that under certain conditions it is possible to obtain a good speech estimate from noise without requiring noise estimation. We study an implementation of the theory, namely wide matching, for speech enhancement. The new approach performs sentence-wide joint speech segment estimation subject to maximum recognizability to gain noise robustness. Experiments have been conducted to evaluate the new approach with variable noises and SNRs from -5 dB to noise free. It is shown that the new approach, without any estimation of the noise, significantly outperformed conventional methods in the low SNR conditions while retaining comparable performance in the high SNR conditions. It is further suggested that the wide matching and deep learning approaches can be combined towards a highly robust and accurate speech estimator.