997 results for audio processing


Relevance:

30.00%

Publisher:

Abstract:

A new interpolation technique has been developed for replacing missing samples in a sampled waveform drawn from a stationary stochastic process, given the power spectrum for the process. The method works with a finite block of data and is based on the assumption that components of the block DFT are Gaussian zero-mean independent random variables with variance proportional to the power spectrum at each frequency value. These assumptions make the interpolator particularly suitable for signals with a sharply-defined harmonic structure, such as audio waveforms recorded from music or voiced speech. Some results are presented and comparisons are made with existing techniques.
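The Gaussian model described in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it only assumes that independent, zero-mean Gaussian DFT components with variances given by the power spectrum imply a circulant covariance for the block, and it fills the gaps with the conditional mean of the missing samples given the known ones. The function name `interpolate_missing` and its parameters are hypothetical.

```python
import numpy as np

def interpolate_missing(x, missing, power_spectrum):
    """Fill x[missing] with the conditional mean under a circulant Gaussian model.

    Assumption (not from the paper): power_spectrum has one value per DFT bin
    of the block, is symmetric (P[k] == P[n-k]) and strictly positive.
    """
    n = len(x)
    # Circular autocovariance of the block: inverse DFT of the power spectrum.
    r = np.real(np.fft.ifft(power_spectrum))
    idx = np.arange(n)
    # Circulant covariance matrix, cov[i, j] = r[(i - j) mod n].
    cov = r[(idx[:, None] - idx[None, :]) % n]
    known = np.setdiff1d(idx, missing)
    c_kk = cov[np.ix_(known, known)]      # covariance among known samples
    c_mk = cov[np.ix_(missing, known)]    # cross-covariance missing vs. known
    out = np.array(x, dtype=float)
    # Gaussian conditional mean: E[x_m | x_k] = C_mk C_kk^{-1} x_k.
    out[missing] = c_mk @ np.linalg.solve(c_kk, out[known])
    return out
```

With a spectrum concentrated on a few harmonics, the known samples largely determine the missing ones, which is consistent with the abstract's remark that the approach suits signals with a sharply defined harmonic structure.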

Relevance:

30.00%

Publisher:

Abstract:

Audio scrambling can be employed to ensure confidentiality in audio distribution. We first describe scrambling for raw audio using the discrete wavelet transform (DWT) and then focus on MP3 audio scrambling. We perform scrambling based on a set of keys, which allows for a set of audio outputs of different qualities. During descrambling, the number of keys provided and the number of descrambling rounds performed determine the audio output quality. We also perform scrambling by using multiple keys on the MP3 audio format. With a subset of the keys, we can descramble to obtain low-quality audio, whereas using all of the keys restores the original quality. Our experiments show that the proposed algorithms are effective, fast, and simple to implement, while providing flexible control over the progressive quality of the audio output. The security level provided by the scheme is sufficient for protecting MP3 music content.
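The idea of keyed, progressive descrambling in the wavelet domain can be sketched as follows. This is an illustrative toy, not the scheme from the paper: it assumes a one-level Haar DWT and one key per subband, where each key seeds a permutation of that subband's coefficients. All function names and the key layout are hypothetical.

```python
import numpy as np

def haar_forward(x):
    """One-level orthonormal Haar DWT of an even-length signal."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def _permutation(key, n):
    # The integer key seeds a reproducible permutation (illustration only,
    # not a cryptographically secure construction).
    return np.random.default_rng(key).permutation(n)

def scramble(x, keys):
    """Permute each subband with its own key; keys = (approx_key, detail_key)."""
    approx, detail = haar_forward(x)
    approx = approx[_permutation(keys[0], len(approx))]
    detail = detail[_permutation(keys[1], len(detail))]
    return haar_inverse(approx, detail)

def descramble(y, keys):
    """Undo the permutation of every subband whose key is given (None = unknown)."""
    bands = list(haar_forward(y))
    for i, key in enumerate(keys):
        if key is not None:
            perm = _permutation(key, len(bands[i]))
            restored = np.empty_like(bands[i])
            restored[perm] = bands[i]  # invert scrambled[j] = band[perm[j]]
            bands[i] = restored
    return haar_inverse(*bands)
```

In this toy, supplying only the approximation key restores the coarse shape of the signal while the detail band stays scrambled, and supplying both keys restores the input exactly, which mirrors the low-quality versus original-quality behaviour the abstract describes.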

Relevance:

30.00%

Publisher:

Abstract:

The Audio/Visual Emotion Challenge and Workshop (AVEC 2011) is the first competition event aimed at comparing multimedia processing and machine learning methods for automatic audio, visual, and audiovisual emotion analysis, with all participants competing under strictly the same conditions. This paper first describes the conditions for participating in the challenge. It then describes the data used, the SEMAINE corpus, and its partitioning into training, development, and test sets for the challenge, with labelling in four dimensions: activity, expectation, power, and valence. Finally, audio and video baseline features are introduced, along with baseline results that use these features for the three sub-challenges of audio, video, and audiovisual emotion recognition.

Relevance:

30.00%

Publisher:

Abstract:

Due to its efficiency and simplicity, the finite-difference time-domain method is becoming a popular choice for solving wideband, transient problems in various fields of acoustics. So far, the issue of extracting a binaural response from finite difference simulations has only been discussed in the context of embedding a listener geometry in the grid. In this paper, we propose and study a method for binaural response rendering based on a spatial decomposition of the sound field. The finite difference grid is locally sampled using a volumetric array of receivers, from which a plane wave density function is computed and integrated with free-field head-related transfer functions in the spherical harmonics domain. The volumetric array is studied in terms of numerical robustness and spatial aliasing. Analytic formulas that predict the performance of the array are developed, facilitating spatial resolution analysis and numerical binaural response analysis for a number of finite difference schemes. Particular emphasis is placed on the effects of numerical dispersion on array processing and on the resulting binaural responses. Our method is compared to a binaural simulation based on the image method. Results indicate good spatial and temporal agreement between the two methods.

Relevance:

30.00%

Publisher:

Abstract:

Music processing has been widely discussed over the years. On one hand, it is argued that the Central Nervous System has a single system for apprehending music or any of its attributes. On the other hand, it is claimed that multiple, diverse systems exist, each handling a different aspect of music. In general, regardless of the model adopted, studies focusing on the processing of sound components, specifically musical tones, can significantly clarify the basic functioning of the auditory system and of other higher brain functions. In this sense, one of the most prominent approaches to the study of the sensory and perceptual processes of hearing, whether impaired or intact, has been Neuroscience, which is interested in the interaction between the brain areas corresponding to different cognitive processes. Thus, the purpose of this study was to review studies that addressed models for processing the attributes of Western tonal music, based on the neuropsychological conception that neural structures are interdependent sensory pathways.

Relevance:

30.00%

Publisher:

Abstract:

Synesthesia entails a special kind of sensory perception, in which stimulation in one sensory modality leads to an internally generated perceptual experience in another, unstimulated sensory modality. This phenomenon can be viewed as an abnormal multisensory integration process, as the synesthetic percept is aberrantly fused with the stimulated modality. Indeed, recent synesthesia research has focused on multimodal processing even outside of the specific synesthesia-inducing context and has revealed changed multimodal integration, thus suggesting perceptual alterations at a global level. Here, we focused on audio-visual processing in synesthesia using a semantic classification task in which animate and inanimate objects were presented visually or audio-visually, in an audio-visually congruent or incongruent manner. Fourteen subjects with auditory-visual and/or grapheme-color synesthesia and 14 control subjects participated in the experiment. During presentation of the stimuli, event-related potentials were recorded from 32 electrodes. The analysis of reaction times and error rates revealed no group differences, with the best performance for audio-visually congruent stimulation, indicating the well-known multimodal facilitation effect. We found enhanced amplitude of the N1 component over occipital electrode sites for synesthetes compared to controls. The differences occurred irrespective of the experimental condition and therefore suggest a global influence on early sensory processing in synesthetes.

Relevance:

30.00%

Publisher:

Abstract:

In audio watermarking, robustness against pitch-scaling attacks is one of the most challenging problems. In this paper, we propose an algorithm based on traditional time-spread (TS) echo hiding audio watermarking to solve this problem. In TS echo hiding based watermarking, a pitch-scaling attack shifts the location of the pseudonoise (PN) sequence that appears in the cepstrum domain. Thus, the position of the peak that occurs after correlating with the PN sequence changes by an unknown amount, which causes decoding errors. In the proposed scheme, we replace the PN sequence with a unit-sample sequence and modify the decoding algorithm so that it does not depend on a particular point in the cepstrum domain to extract the watermark. Moreover, the proposed algorithm is applied to stereo audio signals to further improve robustness. Experimental results illustrate the effectiveness of the proposed algorithm against pitch-scaling attacks compared to existing methods. In addition, the proposed algorithm gives better robustness against other conventional signal processing attacks.
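For readers unfamiliar with echo hiding, the sketch below shows plain single-echo hiding with cepstral detection, which is a simpler relative of the time-spread scheme discussed in this abstract and not the proposed algorithm. One bit is encoded by the delay of an added echo, and the decoder compares the real cepstrum at the two candidate delays. The delays `d0` and `d1`, the echo gain `alpha`, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def embed_bit(x, bit, d0=50, d1=100, alpha=0.3):
    """Add one echo whose delay (d0 or d1 samples) encodes the bit."""
    x = np.asarray(x, dtype=float)
    d = d1 if bit else d0
    y = x.copy()
    y[d:] += alpha * x[:-d]  # delayed, attenuated copy of the host signal
    return y

def real_cepstrum(y):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    spectrum = np.abs(np.fft.fft(y))
    # Small constant guards against log(0) on exactly zero bins.
    return np.real(np.fft.ifft(np.log(spectrum + 1e-12)))

def detect_bit(y, d0=50, d1=100):
    """Decode by comparing the cepstrum at the two candidate echo delays."""
    c = real_cepstrum(y)
    return int(c[d1] > c[d0])
```

Because this decoder reads the cepstrum at fixed positions, any attack that moves the echo peak, such as the pitch scaling discussed in the abstract, would break it. That dependence on a particular cepstral point is what the abstract says the proposed decoder avoids.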