999 results for movie audio tracks


Relevance:

100.00%

Publisher:

Abstract:

In this paper, we investigate the problem of classifying a subset of environmental sounds in movie audio tracks that indicate specific indexical semiotic use. These environmental sounds are used to signify and enhance events occurring in film scenes. We propose a classification system for detecting the presence of violence and car chase scenes in film by classifying ten environmental sounds that form the constituent audio events of these scenes, using a number of existing and new audio features. Experiments with our classification system on pure test sounds resulted in a correct event classification rate of 88.9%. We also present the results of the classifier on the mixed audio tracks of several scenes taken from The Mummy and Lethal Weapon 2. The classification of sound events is the first step towards determining the presence of complex sound scenes within film audio and describing the thematic content of the scenes.

Relevance:

90.00%

Publisher:

Abstract:

We examine localised sound energy patterns, or events, that we associate with high level affect experienced with films. The study of sound energy events in conjunction with their intended affect enables the analysis of film at a higher conceptual level, such as genre. The various affect/emotional responses we investigate in this paper are brought about by well established patterns of sound energy dynamics employed in audio tracks of horror films. This allows the examination of the thematic content of the films in relation to horror elements. We analyse the frequency of sound energy and affect events at a film level as well as at a scene level, and propose measures indicative of the film genre and scene content. Using four horror and two non-horror movies as experimental data, we establish a correlation between the sound energy event types and horrific thematic content within film, thus enabling an automated mechanism for genre typing and scene content labeling in film.

Relevance:

90.00%

Publisher:

Abstract:

In this paper, we investigate the use of a wavelet transform-based analysis of audio tracks accompanying videos for the problem of automatic program genre detection. We compare the classification performance of wavelet-based audio features with that of conventional features derived from Fourier and time analysis for the task of discriminating TV programs such as news, commercials, music shows, concerts, motor racing games, and animated cartoons. Three different classifiers, namely decision trees, SVMs, and k-nearest neighbours, are studied to analyse the reliability of the performance of our wavelet-feature-based approach. Further, we investigate the appropriate duration of an audio clip to be analyzed for this automatic genre determination. Our experimental results show that features derived from the wavelet transform of the audio signal can separate the six video genres studied very well. We also find no significant difference in performance with varying audio clip durations across the classifiers.

Relevance:

90.00%

Publisher:

Abstract:

Introduction: Clinical medical education is increasingly utilizing novel technological approaches to supplement traditional lecture-based didactics. The Neurology Core Clerkship at Baylor College of Medicine is a four-week required course taken by clinical medical students. Given the large amount of information to be disseminated in a short period of time, part of the didactic material has been provided online in the form of narrated PowerPoint files or lecture audio tracks along with stand-alone PowerPoint files. The narrated files are generated using the native PowerPoint narration function, while the stand-alone audio files are created as MP3 files using an inexpensive digital recording device. [See PDF for complete abstract]

Relevance:

90.00%

Publisher:

Abstract:

This thesis explores the role of multimodality in language learners’ comprehension, and more specifically, the effects on students’ audio-visual comprehension when different orchestrations of modes appear in the visualization of vodcasts. Firstly, I describe the state of the art of its three main areas of concern, namely the evolution of meaning-making, Information and Communication Technology (ICT), and audio-visual comprehension. One of the most important contributions in the theoretical overview is the suggested integrative model of audio-visual comprehension, which attempts to explain how students process information received from different inputs. Secondly, I present a study based on the following research questions: ‘Which modes are orchestrated throughout the vodcasts?’, ‘Are there any multimodal ensembles that are more beneficial for students’ audio-visual comprehension?’, and ‘What are the students’ attitudes towards audio-visual (e.g., vodcasts) compared to traditional audio (e.g., audio tracks) comprehension activities?’. Along with these research questions, I have formulated two hypotheses: audio-visual comprehension improves when there is a greater number of orchestrated modes, and students have a more positive attitude towards vodcasts than towards traditional audio when carrying out comprehension activities. The study includes a multimodal discourse analysis, audio-visual comprehension tests, and students’ questionnaires. The multimodal discourse analysis of two British Council language-learning vodcasts, entitled English is GREAT and Camden Fashion, using ELAN as the multimodal annotation tool, shows that there is a variety of multimodal ensembles of two, three and four modes. The audio-visual comprehension tests were given to 40 Spanish students learning English as a foreign language, after they had viewed the vodcasts. These comprehension tests contain questions related to specific orchestrations of modes appearing in the vodcasts.
The statistical analysis of the test results, using repeated-measures ANOVA, reveals that students obtain better audio-visual comprehension results when the multimodal ensembles are constituted by a greater number of orchestrated modes. Finally, the data compiled from the questionnaires indicate that students have a more positive attitude towards vodcasts than towards traditional audio listening activities. Results from the audio-visual comprehension tests and questionnaires confirm the two hypotheses of this study.

Relevance:

80.00%

Publisher:

Abstract:

The full version of this thesis is available only for individual consultation at the Music Library of the Université de Montréal (http://www.bib.umontreal.ca/MU).

Relevance:

80.00%

Publisher:

Abstract:

We develop an algorithm for the detection and classification of affective sound events underscored by specific patterns of sound energy dynamics. We relate the portrayal of these events to proposed high level affect or emotional coloring of the events. In this paper, four possible characteristic sound energy events are identified that convey well established meanings through their dynamics to portray and deliver certain affect or sentiment related to the horror film genre. Our algorithm is developed with the ultimate aim of automatically structuring sections of films that contain distinct shades of emotion related to horror themes for nonlinear media access and navigation. An average of 82% of the energy events obtained from the analysis of the audio tracks of sections of four sample films corresponded correctly to the proposed affect. While the discrimination between certain sound energy event types was low, the algorithm correctly detected 71% of the occurrences of the sound energy events within audio tracks of the films analyzed, and thus forms a useful basis for determining affective scenes characteristic of horror in movies.
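As a rough illustration of what a "sound energy event" could look like computationally, the sketch below computes short-time energy per frame and flags frames where the energy jumps sharply. It is not the algorithm from the abstract: the function names, the frame sizes, and the `ratio` threshold are all hypothetical choices made for this example.

```python
import math

def short_time_energy(samples, frame_len, hop):
    """Mean squared amplitude of each frame (frames that do not fit are dropped)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def find_energy_surges(energies, ratio=4.0):
    """Indices of frames whose energy is at least `ratio` times the previous
    frame's energy. The threshold rule is an assumption for illustration."""
    surges = []
    for i in range(1, len(energies)):
        previous = max(energies[i - 1], 1e-12)  # guard against division by zero
        if energies[i] / previous >= ratio:
            surges.append(i)
    return surges

# Synthetic signal: a quiet passage followed by a sudden loud passage.
quiet = [0.01 * math.sin(0.1 * n) for n in range(400)]
loud = [0.8 * math.sin(0.1 * n) for n in range(400)]
energies = short_time_energy(quiet + loud, frame_len=200, hop=200)
surges = find_energy_surges(energies)
```

With non-overlapping 200-sample frames, the only flagged frame is the one where the loud passage begins. A real system would need to distinguish several event types from the shape of the energy curve, which this sketch does not attempt.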

Relevance:

80.00%

Publisher:

Abstract:

In this article I introduce the term ‘theatrical latency’ as a pleasurable effect experienced when listening to sound in relation to visual perception. Latency refers to both the phenomenon of audio delay (in feedback from analogue to digital conversion and the momentary lapses experienced when playing live with recorded music) and a theatrical sensation that comes from the reanimation of visual environments through aural framing. In this configuration, the notion of latency takes on a double meaning as both a recorded phenomenon and the retrieval of something dormant within physical objects, sites or materials. These ideas will be introduced through my experience of walking Katrina Palmer’s site-specific audio work The Loss Adjusters (2015) on the island of Portland (UK). The audio tracks create an extended meditation on Portland, interweaving specific locations and histories with fictional characters and ghosts of the island.

Relevance:

30.00%

Publisher:

Abstract:

It is impracticable to upgrade the 18,900 Australian passive crossings, as such crossings are often located in remote areas where power is lacking and road and rail traffic is low. The rail industry is interested in developing innovative in-vehicle technology interventions to warn motorists of approaching trains directly in their vehicles. The objective of this study was therefore to evaluate the benefits of introducing such technology. We evaluated the changes in driver performance once the technology is enabled and functioning correctly, as well as the effects of an unsafe failure of the technology. We conducted a driving simulator study where participants (N=15) were familiarised with an in-vehicle audio warning for an extended period. After participants had been familiarised with the system, the technology started failing, and we tested the reaction of drivers with a train approaching. This study has shown that with the traditional passive crossings with RX2 signage, the majority of drivers complied (70%) and looked for trains on both sides of the rail track. With the introduction of the in-vehicle audio message, drivers did not approach crossings faster, did not reduce their safety margins and did not reduce their gaze towards the rail tracks. However, participants’ compliance at the stop sign decreased by 16.5% with the technology installed in the vehicle. When the in-vehicle audio warning failed, most participants did not experience difficulties in detecting the approaching train even though they did not receive any warning message. This showed that participants were still actively looking for trains with the system in their vehicle. However, two participants did not stop and one decided to beat the train when they did not receive the audio message, suggesting potential human factors issues to be considered with such technology.

Relevance:

30.00%

Publisher:

Abstract:

Automated digital recordings are useful for large-scale temporal and spatial environmental monitoring. An important research effort has been the automated classification of calling bird species. In this paper we examine a related task: retrieving, from a database of audio recordings, birdcalls similar to a user-supplied query call. Such a retrieval task can sometimes be more useful than an automated classifier. We compare three approaches to similarity-based birdcall retrieval using spectral ridge features and two kinds of gradient features, structure tensor and the histogram of oriented gradients. The retrieval accuracy of our spectral ridge method is 94% compared to 82% for the structure tensor method and 90% for the histogram of gradients method. Additionally, this approach potentially offers a more compact representation and is more computationally efficient.
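The retrieval step described above can be pictured as ranking stored feature vectors by their similarity to the query's feature vector. The sketch below assumes cosine similarity and uses made-up three-dimensional feature vectors; the abstract does not state which similarity measure or feature dimensionality the authors used, and the names `retrieve`, `database`, and `call_a` to `call_c` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, database, top_k=2):
    """Return the names of the top_k database entries most similar to the query."""
    ranked = sorted(
        database.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]

# Toy feature vectors standing in for ridge or gradient features of recorded calls.
database = {
    "call_a": [0.9, 0.1, 0.0],
    "call_b": [0.1, 0.8, 0.1],
    "call_c": [0.8, 0.2, 0.1],
}
query = [1.0, 0.1, 0.0]
results = retrieve(query, database)
```

Here `results` lists `call_a` first and `call_c` second, since both point in nearly the same direction as the query while `call_b` does not.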

Relevance:

30.00%

Publisher:

Abstract:

Biometrics, meaning any automatically measurable, robust and distinctive physical characteristic or personal trait that can be used to identify an individual or verify the claimed identity of an individual, has gained significant interest in the wake of heightened concerns about security and rapid advancements in networking, communication and mobility. Multimodal biometrics is expected to be ultra-secure and reliable, due to the presence of multiple, independent verification clues. In this study, a multimodal biometric system utilising audio and facial signatures has been implemented and error analysis has been carried out. A total of 1,000 face images and 250 sound tracks of 50 users are used for training the proposed system. To account for attempts by unregistered users, signature data from 25 new users are tested. Short-term spectral features were extracted from the sound data, and vector quantization was performed using the K-means algorithm. Face images are identified using the Eigenface approach based on Principal Component Analysis. The success rate of the multimodal system using speech and face is higher than that of the individual unimodal recognition systems.
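To make the vector quantization step more concrete, the following sketch trains a small K-means codebook on a user's feature vectors and scores a new vector by its distortion, that is, its squared distance to the nearest codeword. This is a generic illustration, not the system from the abstract: the two-dimensional toy features, the deterministic initialisation from the first `k` vectors, and all function names are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(codebook, vector):
    """Index of the codeword closest to the vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], vector))

def train_codebook(vectors, k, iterations=20):
    """Plain K-means. Codewords start at the first k vectors (an assumption
    that keeps this example deterministic)."""
    codebook = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # leave a codeword unchanged if its cluster is empty
                dim = len(members[0])
                codebook[i] = [sum(m[d] for m in members) / len(members)
                               for d in range(dim)]
    return codebook

def average_distortion(codebook, vectors):
    """Mean squared distance from each vector to its nearest codeword."""
    total = sum(squared_distance(codebook[nearest(codebook, v)], v)
                for v in vectors)
    return total / len(vectors)

# Toy "spectral features" for one enrolled user.
enrolled = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9]]
codebook = train_codebook(enrolled, k=2)

genuine_score = average_distortion(codebook, [[0.1, 0.0]])
impostor_score = average_distortion(codebook, [[2.5, 2.5]])
```

A lower distortion means the test features sit closer to the enrolled user's codebook, so `genuine_score` comes out far smaller than `impostor_score` here. A verification system would compare such a score against a threshold, which this sketch leaves out.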

Relevance:

30.00%

Publisher:

Abstract:

This bachelor’s thesis examines the film sound of the war films Apocalypse Now and Saving Private Ryan. The aim is to contribute to a better understanding of the uses and functions of film sound, primarily in these two films but also in war films in general. In this context, film sound covers all sound in a film except non-diegetic music. Both films were examined through an audio-visual analysis, which scrutinises the sound and picture content of a sequence separately and then examines the same sequence as a whole once sound and picture have been recombined. The audio-visual analysis method used is Michel Chion’s method, Masking. The 30 minutes of film analysed were then placed in different film sound zones, and the sound content of each zone showed, among other things, the main functions of film sound in these films. These functions are to sustain the viewer’s focus and interest, to create closeness to the characters, and to convey a strong sense of realism and presence. The intention behind the film sound appeared to be to move the viewer into the film’s reality and let the viewer become one with the film. Conveying this sense of realism, presence, focus and interest also turned out to be the intention already in the pre-production stages of both films, which shows that the filmmakers achieved what they set out to do. Whether film sound is used in the same way, or has the same functions, in war films in general cannot be determined from this study.

Relevance:

30.00%

Publisher:

Abstract:

In this paper, we study the sound tracks in films and their indexical semiotic usage by developing a classification system that detects complex sound scenes and their constituent sound events in cinema. We investigate two main issues in this paper: determining what constitutes the presence of a high level sound scene and what inferences about the thematic content of the scene can be drawn from this presence, and classifying environmental sounds in the audio track of the scene to assist in the automatic detection of the high level scene. Experiments with our classification system on pure sounds resulted in a correct event classification rate of 88.9%. When the audio content of a number of film scenes was examined, sound event detection was less accurate because of the presence of mixed sounds, but the film audio samples were generally classified with the correct high-level sound scene label, enabling correct inferences about the story content of the scenes.