975 results for audio-visuel


Relevance:

20.00% 20.00%

Publisher:

Abstract:

The Audio/Visual Emotion Challenge and Workshop (AVEC 2011) is the first competition event aimed at comparison of multimedia processing and machine learning methods for automatic audio, visual and audiovisual emotion analysis, with all participants competing under strictly the same conditions. This paper first describes the challenge participation conditions. Next follows the data used – the SEMAINE corpus – and its partitioning into train, development, and test partitions for the challenge with labelling in four dimensions, namely activity, expectation, power, and valence. Further, audio and video baseline features are introduced as well as baseline results that use these features for the three sub-challenges of audio, video, and audiovisual emotion recognition.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Recent debates about media literacy and the internet have begun to acknowledge the importance of active user-engagement and interaction. It is not enough simply to access material online; users also expect to comment upon it and re-use it. Yet how do these new user expectations fit within digital initiatives which increase access to audio-visual content but which prioritise access to and preservation of archives and online research rather than active user-engagement? This article will address these issues of media literacy in relation to audio-visual content. It will consider how these issues are currently being addressed, focusing particularly on the high-profile European initiative EUscreen. EUscreen brings together 20 European television archives into a single searchable database of over 40,000 digital items. Yet creative re-use restrictions and copyright issues prevent users from re-working the material they find on the site. Instead of re-use, EUscreen offers access to, and detailed contextualisation of, its collection of material. But if the emphasis for resources within an online environment rests no longer upon access but on user-engagement, what do EUscreen and similar sites offer to different users?

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements and can be used alongside many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances, with corruption added in either or both of the video and audio streams using a variety of types (e.g., MPEG-4 video compression) and levels of noise. The experiments show that this approach gives excellent performance in comparison to another well-known dynamic stream weighting approach and also compared to any fixed-weighted integration approach, both in clean conditions and when noise is added to either stream. Furthermore, our experiments show that the MWSP approach dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams and also according to the naturally fluctuating relative reliability of the modalities even in clean conditions. The MWSP approach is shown to maintain robust recognition performance in all tested conditions, while requiring no prior knowledge about the type or level of noise.
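
The abstract does not give the MWSP equations, so the following is only an illustrative sketch of frame-level dynamic stream weighting, not the authors' implementation. It assumes one plausible reading of the name: for each frame, try a grid of audio weights, combine the per-class log-likelihoods of the two streams linearly, and keep the weight whose normalised posterior has the highest peak. The function names `softmax` and `fuse_frame`, the weight grid, and the toy scores are all hypothetical.

```python
import math


def softmax(log_scores):
    """Turn per-class log scores into a normalised posterior."""
    m = max(log_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_frame(audio_loglik, video_loglik, weights=None):
    """Pick, for one frame, the audio weight that gives the most confident posterior.

    audio_loglik / video_loglik: per-class log-likelihoods for this frame.
    weights: candidate audio weights in [0, 1]; the video weight is 1 - w.
    Returns (chosen_weight, posterior). This selection rule is an assumption.
    """
    if weights is None:
        weights = [i / 10 for i in range(11)]  # hypothetical grid 0.0 .. 1.0
    best = None
    for w in weights:
        combined = [w * a + (1.0 - w) * v
                    for a, v in zip(audio_loglik, video_loglik)]
        posterior = softmax(combined)
        peak = max(posterior)
        if best is None or peak > best[0]:
            best = (peak, w, posterior)
    return best[1], best[2]


# Toy frame: audio clearly favours class 0, video is uninformative.
weight, posterior = fuse_frame([-1.0, -5.0], [-3.0, -3.0])
```

In this toy frame the uninformative video stream adds nothing, so the search settles on a weight that relies on the audio stream alone; no noise measurement is needed, which is the property the abstract emphasises.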

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Human listeners seem to be remarkably able to recognise acoustic sound sources based on timbre cues. Here we describe a psychophysical paradigm to estimate the time it takes to recognise a set of complex sounds differing only in timbre cues: both in terms of the minimum duration of the sounds and the inferred neural processing time. Listeners had to respond to the human voice while ignoring a set of distractors. All sounds were recorded from natural sources over the same pitch range and equalised to the same duration and power. In a first experiment, stimuli were gated in time with a raised-cosine window of variable duration and random onset time. A voice/non-voice (yes/no) task was used. Performance, as measured by d', remained above chance for the shortest sounds tested (2 ms); d's above 1 were observed for durations longer than or equal to 8 ms. Then, we constructed sequences of short sounds presented in rapid succession. Listeners were asked to report the presence of a single voice token that could occur at a random position within the sequence. This method is analogous to the "rapid sequential visual presentation" paradigm (RSVP), which has been used to evaluate neural processing time for images. For 500-ms sequences made of 32-ms and 16-ms sounds, d' remained above chance for presentation rates of up to 30 sounds per second. There was no effect of the pitch relation between successive sounds: identical for all sounds in the sequence or random for each sound. This implies that the task was not determined by streaming or forward masking, as both phenomena would predict better performance for the random pitch condition. Overall, the recognition of familiar sound categories such as the voice seems to be surprisingly fast, both in terms of the acoustic duration required and of the underlying neural time constants.
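
As a reminder of how the sensitivity index d' used above is usually computed in a yes/no task, here is a minimal sketch: d' is the z-transformed hit rate minus the z-transformed false-alarm rate. The function name `d_prime` and the log-linear correction (adding 0.5 to each cell) are assumptions for illustration; the abstract does not say which correction, if any, the authors applied.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    # Log-linear correction (assumed here): keeps both rates strictly
    # between 0 and 1, where the inverse normal CDF is defined.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts: 80 hits / 20 misses on voice trials,
# 20 false alarms / 80 correct rejections on distractor trials.
sensitivity = d_prime(80, 20, 20, 80)
```

A listener responding at chance (equal hit and false-alarm rates) gets d' = 0, so "above chance" in the abstract corresponds to d' reliably greater than zero.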

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Direct experience of social work in another country is making an increasingly important contribution to internationalising the social work academic curriculum, together with the cultural competency of students. However, at present this opportunity is still restricted to a limited number of students. The aim of this paper is to describe and reflect on the production of an audio-visual presentation representing the experience of three students who participated in an exchange with a social work programme in Pune, India. It describes and assesses the rationale, production and use of video to capture student learning from the Belfast/Pune exchange. We also describe the use of the video in a classroom setting with a year group of 53 students from a younger cohort. This exercise aimed to stimulate students’ curiosity about international dimensions of social work and add to their awareness of poverty, social justice, cultural competence and community social work as global issues. Written classroom feedback informs our discussion of the technical as well as the pedagogical benefits and challenges of this approach. We conclude that audio-visual presentation has some benefit in helping students connect with diverse cultural contexts, but that a complementary discussion challenging stereotyped viewpoints and unconscious professional imperialism is also crucial.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Experience obtained in the support of mobile learning using podcast audio is reported. The paper outlines design, storage and distribution via a web site. An initial evaluation of the uptake of the approach in a final year computing module was undertaken. Audio objects were tailored to meet different pedagogical needs, resulting in a repository of persistent glossary terms and disposable audio lectures distributed by podcasting. An aim of our approach is to document the interest from the students, and evaluate the potential of mobile learning for supplementing revision.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The postnatal period and sensory experience are critical for the development of the visual system. Inhibitory interneurons expressing γ-aminobutyric acid (GABA) play an important role in controlling neuronal activity and in the refinement and processing of the sensory information that reaches the cerebral cortex. During development, when the cerebral cortex is highly susceptible to extrinsic influences, GABA contributes to the formation of critical periods of sensitivity and to experience-dependent plasticity. This inhibitory system would thus serve to adjust the functioning of the primary sensory areas according to the specific conditions of activity arising from the environment, from cortical afferents (thalamic and others), and from sensory experience. Some studies show that differences in the density and distribution of these cortical inhibitory neurons reflect the distinct functional characteristics of the different cortical areas. Parvalbumin (PV), calretinin (CR) and calbindin (CB) are calcium-binding proteins (CaBPs) located in different subpopulations of cortical GABAergic interneurons. These proteins buffer intracellular calcium and can thereby differentially modulate several neuronal functions, notably the timing of action potentials, synaptic transmission and long-term potentiation. Several recent studies show that interneurons immunoreactive (ir) to CaBPs are also highly sensitive to sensory experience and activity, both during development and in the adult. These neurons could therefore play a crucial role in the phenomenon of compensation, or crossmodal plasticity, between the primary sensory cortices.
In the hamster (Mesocricetus auratus), enucleation at birth allows the primary visual cortex to be recruited by other sensory modalities, such as touch and hearing. Following this ocular deprivation, permanent ectopic projections are established between the inferior colliculi (IC) and the lateral geniculate nucleus (LGN). This routes auditory information to the primary visual cortex (V1) during postnatal development. Using this model, the general objective of this thesis project is to study the influence and role of sensory activity on the distribution and organisation of CaBP-immunoreactive cortical interneurons in the primary visual and auditory sensory areas of the adult hamster. Changes in CaBP expression were determined quantitatively by evaluating the laminar distribution profiles of these neurons as revealed by immunohistochemistry. In a first experiment, we studied the laminar distribution of CaBPs in the primary visual (V1) and auditory (A1) areas of the normal adult hamster. Neurons immunoreactive to PV and CB, but not to CR, are distributed differently in these two primary cortices, each dedicated to a different sensory modality. In a second study, control animals were compared with hamsters enucleated at birth. This study shows that the primary visual cortex of the enucleated animals adopts a PV chemoarchitecture similar to that of the auditory cortex. Our research therefore shows that suppressing visual activity at birth can influence CaBP expression in area V1 of the adult hamster. This also suggests that the type of activity carried by afferents from other sensory modalities can modulate, in part, a CaBP cortical circuitry specific to that activity within the host or recruited cortex.
Thus, our work supports the hypothesis that some of these subpopulations of GABAergic interneurons may play a crucial role in the phenomenon of crossmodal plasticity.