Digital Library

54 results for wearable audio

Online audio background determination for complex audio environments

Relevance: 20.00%

Abstract:

We present a method for foreground/background separation of audio using a background modelling technique. The technique models the background in an online, unsupervised, and adaptive fashion, and is designed for application to long term surveillance and monitoring problems. The background is determined using a statistical method to model the states of the audio over time. In addition, three methods are used to increase the accuracy of background modelling in complex audio environments. Such environments can cause the failure of the statistical model to accurately capture the background states. An entropy-based approach is used to unify background representations fragmented over multiple states of the statistical model. The approach successfully unifies such background states, resulting in a more robust background model. We adaptively adjust the number of states considered background according to background complexity, resulting in the more accurate classification of background models. Finally, we use an auxiliary model cache to retain potential background states in the system. This prevents the deletion of such states due to a rapid influx of observed states that can occur for highly dynamic sections of the audio signal. The separation algorithm was successfully applied to a number of audio environments representing monitoring applications.
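As a rough illustration of the online, adaptive background modelling described in this abstract, the sketch below keeps a small set of Gaussian states over a single scalar audio feature and treats the highest-weight states as background. It is a generic adaptive-mixture sketch, not the authors' algorithm: the class name `OnlineBackgroundModel`, all parameter values, and the one-dimensional feature are assumptions, and the entropy-based unification, adaptive state count, and auxiliary cache mentioned in the abstract are not implemented.

```python
from dataclasses import dataclass


@dataclass
class State:
    weight: float
    mean: float
    var: float


class OnlineBackgroundModel:
    """Hypothetical 1-D adaptive mixture; all defaults are illustrative."""

    def __init__(self, max_states=4, alpha=0.05, match_sigmas=2.5,
                 bg_mass=0.7, init_var=1.0):
        self.max_states = max_states
        self.alpha = alpha                # learning rate
        self.match_sigmas = match_sigmas  # match radius in standard deviations
        self.bg_mass = bg_mass            # weight mass treated as background
        self.init_var = init_var
        self.states = []

    def _background_states(self):
        # Highest-weight states whose cumulative weight reaches bg_mass.
        ranked = sorted(self.states, key=lambda s: s.weight, reverse=True)
        chosen, mass = [], 0.0
        for s in ranked:
            chosen.append(s)
            mass += s.weight
            if mass >= self.bg_mass:
                break
        return chosen

    def update(self, x):
        """Absorb one observation; return True if it is foreground."""
        match = None
        for s in self.states:
            if (x - s.mean) ** 2 <= (self.match_sigmas ** 2) * s.var:
                match = s
                break
        # Classify against the model as it stood before this observation.
        is_background = match is not None and any(
            match is b for b in self._background_states())

        if match is None:
            # No state explains x: drop the weakest state if full, add a new one.
            if len(self.states) >= self.max_states:
                weakest = min(range(len(self.states)),
                              key=lambda i: self.states[i].weight)
                del self.states[weakest]
            for s in self.states:
                s.weight *= 1.0 - self.alpha
            self.states.append(State(self.alpha, x, self.init_var))
        else:
            # Reinforce the matched state and decay the others.
            for s in self.states:
                s.weight = (1.0 - self.alpha) * s.weight + (
                    self.alpha if s is match else 0.0)
            d = x - match.mean
            match.mean += self.alpha * d
            match.var = max((1.0 - self.alpha) * match.var
                            + self.alpha * d * d, 1e-6)

        total = sum(s.weight for s in self.states)
        for s in self.states:
            s.weight /= total
        return not is_background
```

Feeding the model a long run of similar feature values and then an outlier should flag the outlier as foreground while the familiar values remain background.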

Distributed audio network for speech enhancement in challenging noise backgrounds

Relevance: 20.00%

Abstract:

This paper presents a new approach to enhance speech based on a distributed microphone network. Each microphone is used to simultaneously classify the input as either one of the noise types or as speech. To enhance the speech signal, a modified spectral subtraction approach is used that utilises the sound information of the entire network to update the noise model even during speech. This improves the reduction of ambient noise, especially for non-stationary noise types such as street or beach noise. Experiments demonstrate the effectiveness of the proposed system.
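The idea of updating a noise model from other microphones while one microphone captures speech can be sketched with plain magnitude-domain spectral subtraction. This is a textbook form of spectral subtraction under stated assumptions, not the paper's modified variant: the function names, the exponential smoothing rule, and the `over_subtraction` and `floor` parameters are all hypothetical.

```python
def update_noise_estimate(noise_mag, network_noise_frames, smoothing=0.9):
    """Blend the running per-bin noise magnitude with the average of frames
    that other nodes in the network classified as noise (assumed input)."""
    if not network_noise_frames:
        return list(noise_mag)
    n = len(network_noise_frames)
    avg = [sum(frame[k] for frame in network_noise_frames) / n
           for k in range(len(noise_mag))]
    return [smoothing * old + (1.0 - smoothing) * new
            for old, new in zip(noise_mag, avg)]


def spectral_subtract(speech_mag, noise_mag, over_subtraction=1.0, floor=0.05):
    """Subtract the noise magnitude per frequency bin, keeping a spectral
    floor so that no bin becomes negative."""
    return [max(s - over_subtraction * n, floor * s)
            for s, n in zip(speech_mag, noise_mag)]
```

Because the noise estimate can be refreshed from nodes that currently hear only noise, it does not have to wait for a pause in speech at the enhancing microphone, which is the property the abstract highlights for non-stationary noise.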

Calibration of audio-video sensors for multi-modal event indexing

Relevance: 20.00%

Abstract:

This paper addresses the coordinated use of video and audio cues to capture and index surveillance events with multimodal labels. The focus of this paper is the development of a joint-sensor calibration technique that uses audio-visual observations to improve the calibration process. One significant feature of this approach is the ability to continuously check and update the calibration status of the sensor suite, making it resilient to independent drift in the individual sensors. We present scenarios in which this system is used to enhance surveillance.

Unifying background models over complex audio using entropy

Relevance: 20.00%

Abstract:

In this paper we extend an existing audio background modelling technique, leading to a more robust application to complex audio environments. The determination of background audio is used as an initial stage in the analysis of audio for surveillance and monitoring applications. Knowledge of the background serves to highlight unusual or infrequent sounds. An existing modelling approach uses an online, adaptive Gaussian mixture model (GMM) technique that uses multiple distributions to model variations in the background. The method used to determine the background distributions of the GMM leads to a failure mode of the existing technique when applied to complex audio. We propose a method incorporating further information, the proximity of distributions determined using entropy, to determine a more complete background model. The method was successful in more robustly modelling the background for complex audio scenes.
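One way to picture an entropy-based proximity between mixture components is the symmetric relative entropy (Kullback-Leibler divergence) between two Gaussians, with nearby components merged by moment matching. The abstract does not say which entropy measure or merge rule the authors use, so the choice of symmetric KL, the one-dimensional components, and the function names below are assumptions made only for illustration.

```python
import math


def kl_gaussian(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two univariate Gaussians."""
    d = mean_p - mean_q
    return 0.5 * (math.log(var_q / var_p) + (var_p + d * d) / var_q - 1.0)


def symmetric_kl(p, q):
    """Symmetric proximity between components given as (mean, var)."""
    return kl_gaussian(p[0], p[1], q[0], q[1]) + kl_gaussian(q[0], q[1], p[0], p[1])


def merge_components(a, b):
    """Moment-matched merge of two components given as (weight, mean, var)."""
    w = a[0] + b[0]
    mean = (a[0] * a[1] + b[0] * b[1]) / w
    var = (a[0] * (a[2] + (a[1] - mean) ** 2)
           + b[0] * (b[2] + (b[1] - mean) ** 2)) / w
    return (w, mean, var)
```

Under this sketch, two components whose symmetric divergence falls below a chosen threshold would be treated as fragments of the same background state and merged, so their combined weight counts toward the background.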

Narrative structure detection through audio pace

Relevance: 20.00%

Abstract:

We use the concept of film pace, expressed through the audio, to analyse the broad-level narrative structure of film. The narrative structure is divided into visual narration (action sections) and audio narration (plot development sections). We hypothesise that changes in the narrative structure signal a change in audio content, which is reflected by a change in audio pace. We test this hypothesis using a number of audio feature functions that reflect the audio pace to detect changes in narrative structure for 8 films of varying genres. The properties of the energy were then used to determine the audio pace feature corresponding to the narrative structure for each film analysed. The method was successful in determining the narrative structure for 1 of the films, achieving an overall precision of 76.4% and recall of 80.3%. We map the properties of the speech and energy of film audio to the higher-level semantic concept of audio pace. The audio pace was in turn applied to a higher-level semantic analysis of the structure of film.

Incorporating contextual audio for an actively anxious smart home

Relevance: 20.00%

Persistent audio modelling for background determination

Relevance: 20.00%

Abstract:

This paper is concerned with modelling background audio online to detect foreground sounds in complex audio environments for surveillance and smart home applications. We examine and expand upon previous work in the audio and video domains, and propose a new implementation of an audio background modelling algorithm, addressing the complexities of audio data. A number of audio features characterising different aspects of the audio content were analysed to determine the factors relevant to the determination of the background audio. We test the algorithms on three audio data sets of varying complexity. The new approach was successful in modelling the background audio for the test data.

Horror film genre typing and scene labeling via audio analysis

Relevance: 20.00%

Abstract:

We examine localised sound energy patterns, or events, that we associate with high-level affect experienced with films. The study of sound energy events in conjunction with their intended affect enables the analysis of film at a higher conceptual level, such as genre. The various affect/emotional responses we investigate in this paper are brought about by well-established patterns of sound energy dynamics employed in audio tracks of horror films. This allows the examination of the thematic content of the films in relation to horror elements. We analyse the frequency of sound energy and affect events at a film level as well as at a scene level, and propose measures indicative of the film genre and scene content. Using 4 horror and 2 non-horror movies as experimental data, we establish a correlation between the sound energy event types and horrific thematic content within film, thus enabling an automated mechanism for genre typing and scene content labeling in film.

Video genre categorization using audio wavelet coefficients

Relevance: 20.00%

Abstract:

In this paper, we investigate the use of a wavelet transform-based analysis of audio tracks accompanying videos for the problem of automatic program genre detection. We compare the classification performance of wavelet-based audio features with that of conventional features derived from Fourier and time analysis for the task of discriminating TV programs such as news, commercials, music shows, concerts, motor racing games, and animated cartoons. Three different classifiers, namely Decision Trees, SVMs, and k-Nearest Neighbours, are studied to analyse the reliability of the performance of our wavelet-feature-based approach. Further, we investigate the issue of an appropriate duration of an audio clip to be analyzed for this automatic genre determination. Our experimental results show that features derived from the wavelet transform of the audio signal can very well separate the six video genres studied. It is also found that there is no significant difference in performance with varying audio clip durations across the classifiers.
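A simple example of a wavelet-derived audio feature is the energy in each subband of a multi-level discrete wavelet transform. The abstract does not state which wavelet or which coefficient statistics the authors use, so the Haar wavelet, the energy statistic, and the function names here are assumptions chosen to keep the sketch short.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail).
    A trailing unpaired sample is ignored."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s
              for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s
              for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def subband_energies(signal, levels):
    """Energy of the detail coefficients at each level (finest first),
    followed by the energy of the final approximation."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies
```

A feature vector of this kind, computed per audio clip, could then be passed to any of the classifiers named in the abstract.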

Automatic isolation of speech in news footage for audio event detection

Relevance: 20.00%

Detecting indexical signs in film audio for scene interpretation

Relevance: 20.00%

Abstract:

In this paper, we study the soundtracks in films and their indexical semiotic usage by developing a classification system that detects complex sound scenes and their constituent sound events in cinema. We investigate two main issues in this paper: (1) determination of what constitutes the presence of a high-level sound scene, and the inferences about the thematic content of the scene that can be drawn from this presence; and (2) classification of environmental sounds in the audio track of the scene, to assist in the automatic detection of the high-level scene. Experiments with our classification system on pure sounds resulted in a correct event classification rate of 88.9%. When the audio content of a number of film scenes was examined, sound event detection was less accurate due to the presence of mixed sounds, but the film audio samples were generally classified with the correct high-level sound scene label, enabling correct inferences about the story content of the scenes.

Determining affective events through film audio

Relevance: 20.00%

Robust patchwork-based embedding and decoding scheme for digital audio watermarking

Relevance: 20.00%

Abstract:

This paper presents a novel patchwork-based embedding and decoding scheme for digital audio watermarking. At the embedding stage, an audio segment is divided into two subsegments and the discrete cosine transform (DCT) coefficients of the subsegments are computed. The DCT coefficients related to a specified frequency region are then partitioned into a number of frame pairs. The DCT frame pairs suitable for watermark embedding are chosen by a selection criterion and watermarks are embedded into the selected DCT frame pairs by modifying their coefficients, controlled by a secret key. The modifications are conducted in such a way that the selection criterion used at the embedding stage can be applied at the decoding stage to identify the watermarked DCT frame pairs. At the decoding stage, the secret key is utilized to extract watermarks from the watermarked DCT frame pairs. Compared with existing patchwork watermarking methods, the proposed scheme does not require information of which frame pairs of the watermarked audio signal enclose watermarks and is more robust to conventional attacks.
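The core patchwork idea of embedding a bit by shifting the relative energy of a pair of DCT frames can be sketched as follows. This is a generic patchwork sketch and not the proposed scheme: the selection criterion is omitted, the way the secret key is used (an integer seed that decides which frame of the pair plays the "first" role) is an assumption, and the function names and `strength` value are hypothetical. The input frames are assumed to be lists of already-computed DCT coefficients.

```python
import random


def _mean_abs(frame):
    return sum(abs(c) for c in frame) / len(frame)


def _swap_roles(key, pair_index):
    # Key-dependent, repeatable choice of frame order (illustrative only).
    return random.Random(key * 1_000_003 + pair_index).random() < 0.5


def embed_bit(frame_a, frame_b, bit, key, pair_index=0, strength=0.1):
    """Return modified copies of both frames carrying one watermark bit."""
    swap = _swap_roles(key, pair_index)
    first, second = (frame_b, frame_a) if swap else (frame_a, frame_b)
    up, down = 1.0 + strength, 1.0 - strength
    gain_first, gain_second = (up, down) if bit else (down, up)
    first = [c * gain_first for c in first]
    second = [c * gain_second for c in second]
    return (second, first) if swap else (first, second)


def decode_bit(frame_a, frame_b, key, pair_index=0):
    """Recover the bit by comparing mean absolute coefficient values."""
    swap = _swap_roles(key, pair_index)
    first, second = (frame_b, frame_a) if swap else (frame_a, frame_b)
    return 1 if _mean_abs(first) > _mean_abs(second) else 0
```

Decoding in this sketch is only reliable when the two host frames start with similar energy, which suggests why a selection criterion for suitable frame pairs, as described in the abstract, matters.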

Novel watermarking methods for copyright protection of audio signals

Relevance: 20.00%

Abstract:

This research presented improved watermarking methods for mono and stereo audio signals. To enhance the performance, novel methods are developed using echo hiding techniques and patchwork-based algorithms. The superior performances of the proposed methods are demonstrated by theoretical analysis and simulation examples, in comparison with the existing methods.

Quantification of tackling demands in elite Australian football using integrated wearable athlete technology

Relevance: 20.00%
