52 results for audio-visual information
Abstract:
The Audio/Visual Emotion Challenge and Workshop (AVEC 2011) is the first competition event aimed at comparing multimedia processing and machine learning methods for automatic audio, visual and audiovisual emotion analysis, with all participants competing under strictly the same conditions. This paper first describes the challenge participation conditions. It then describes the data used – the SEMAINE corpus – and its partitioning into train, development, and test sets for the challenge, with labelling in four dimensions, namely activity, expectation, power, and valence. Finally, audio and video baseline features are introduced, together with baseline results that use these features for the three sub-challenges of audio, video, and audiovisual emotion recognition.
Abstract:
Recent debates about media literacy and the internet have begun to acknowledge the importance of active user engagement and interaction: it is not enough simply to access material online; users also expect to comment upon it and re-use it. Yet how do these new user expectations fit within digital initiatives that increase access to audio-visual content but prioritise access to and preservation of archives and online research over active user engagement? This article addresses these issues of media literacy in relation to audio-visual content. It considers how these issues are currently being addressed, focusing particularly on the high-profile European initiative EUscreen. EUscreen brings together 20 European television archives into a single searchable database of over 40,000 digital items. Yet creative re-use restrictions and copyright issues prevent users from re-working the material they find on the site. Instead of re-use, EUscreen offers access and detailed contextualisation of its collection of material. But if the emphasis for resources within an online environment rests no longer upon access but on user engagement, what do EUscreen and similar sites offer to different users?
Abstract:
This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements, and can be used alongside, many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances, with corruption added to the video and/or audio streams using a variety of types (e.g., MPEG-4 video compression) and levels of noise. The experiments show that this approach gives excellent performance in comparison to another well-known dynamic stream weighting approach, and also compared to any fixed-weight integration approach, both in clean conditions and when noise is added to either stream. Furthermore, our experiments show that the MWSP approach dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams, and also according to the naturally fluctuating relative reliability of the modalities even in clean conditions. The MWSP approach is shown to maintain robust recognition performance in all tested conditions, while requiring no prior knowledge about the type or level of noise.
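The frame-level idea behind this kind of dynamic stream weighting can be sketched roughly as follows: for each frame, search over candidate weights and keep the one that maximises the combined (geometrically weighted, renormalised) stream posterior. This is a minimal illustration of the principle only, not the paper's actual formulation; the weight grid, posterior vectors, and function names are assumptions.

```python
import numpy as np

def max_weighted_stream_posterior(audio_post, video_post,
                                  weights=np.linspace(0.0, 1.0, 11)):
    """Illustrative frame-level stream combination in the spirit of MWSP.

    For each candidate weight w, the per-class audio and video posteriors
    are combined geometrically (w on audio, 1 - w on video) and
    renormalised; the weight whose best-class posterior is largest is
    selected. The 11-point weight grid is an illustrative choice.
    """
    best_class, best_weight, best_post = None, None, -1.0
    for w in weights:
        combined = (audio_post ** w) * (video_post ** (1.0 - w))
        combined = combined / combined.sum()   # renormalise to a distribution
        c = int(np.argmax(combined))
        if combined[c] > best_post:
            best_post, best_weight, best_class = combined[c], w, c
    return best_class, best_weight

# Example: a confident audio stream and a near-uninformative (noisy) video
# stream; the combiner should lean entirely on audio for this frame.
audio = np.array([0.80, 0.15, 0.05])
video = np.array([0.34, 0.33, 0.33])
cls, w = max_weighted_stream_posterior(audio, video)
```

Because no noise measurement is needed, only the streams' own posteriors, such a scheme is modality-independent, which matches the property the abstract highlights.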
Abstract:
Direct experience of social work in another country is making an increasingly important contribution to internationalising the social work academic curriculum and to the cultural competency of students. However, at present this opportunity is still restricted to a limited number of students. The aim of this paper is to describe and reflect on the production of an audio-visual presentation representing the experience of three students who participated in an exchange with a social work programme in Pune, India. It describes and assesses the rationale, production and use of video to capture student learning from the Belfast/Pune exchange. We also describe the use of the video in a classroom setting with a year group of 53 students from a younger cohort. This exercise aimed to stimulate students' curiosity about international dimensions of social work and add to their awareness of poverty, social justice, cultural competence and community social work as global issues. Written classroom feedback informs our discussion of the technical as well as the pedagogical benefits and challenges of this approach. We conclude that audio-visual presentation offers some benefit in helping students connect with diverse cultural contexts, but that a complementary discussion challenging stereotyped viewpoints and unconscious professional imperialism is also crucial.
Abstract:
Rapid orientating movements of the eyes are believed to be controlled ballistically. The mechanism underlying this control is thought to involve a comparison between the desired displacement of the eye and an estimate of its actual position (obtained from the integration of the eye velocity signal). This study shows, however, that under certain circumstances fast gaze movements may be controlled quite differently and may involve mechanisms which use visual information to guide movements prospectively. Subjects were required to make large gaze shifts in yaw towards a target whose location and motion were unknown prior to movement onset. Six of those tested demonstrated remarkable accuracy when making gaze shifts towards a target that appeared during their ongoing movement. In fact their level of accuracy was not significantly different from that shown when they performed a 'remembered' gaze shift to a known stationary target (F(3,15) = 0.15, p > 0.05). The lack of a stereotypical relationship between the skew of the gaze velocity profile and movement duration indicates that on-line modifications were being made. It is suggested that a fast route from the retina to the superior colliculus could account for this behaviour and that models of oculomotor control need to be updated.
Abstract:
In this paper the authors present, for the first time, results showing the effect of out-of-plane speaker head-pose variation on a lip-biometric-based speaker verification system. Using appearance-based DCT features, they adopt a mutual information analysis technique to highlight the class-discriminant DCT components most robust to changes in out-of-plane pose. Experiments are conducted using the initial phase of a new multi-view audio-visual database designed for research and development of pose-invariant speech and speaker recognition. They show that verification performance can be improved by substituting higher-order horizontal DCT components for vertical ones, particularly in the case of a train/test pose-angle mismatch.
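The mutual-information step of such an analysis can be sketched with a generic histogram-based estimator: score each DCT coefficient by its estimated mutual information with the class labels, then rank coefficients by that score. This is an illustration of the general technique, not the authors' exact procedure; the bin count, array names, and toy data below are assumptions.

```python
import numpy as np
from collections import Counter

def mutual_information(feature, labels, bins=8):
    """Histogram-based estimate, in bits, of I(X;Y) between one quantised
    coefficient and the class labels. The bin count is an illustrative choice."""
    edges = np.histogram_bin_edges(feature, bins=bins)[1:-1]  # inner edges
    xq = np.digitize(feature, edges)                          # quantise X
    n = len(feature)
    joint = Counter(zip(xq, labels))
    px, py = Counter(xq), Counter(labels)
    return sum(c / n * np.log2((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in joint.items())

# Toy feature matrix F (frames x coefficients): one column perfectly
# informative about the class, one pure noise.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
F = np.column_stack([labels.astype(float),     # informative coefficient
                     rng.uniform(size=100)])   # uninformative coefficient
scores = [mutual_information(F[:, j], labels) for j in range(F.shape[1])]
ranking = np.argsort(scores)[::-1]             # most class-discriminant first
```

Coefficients at the top of such a ranking, computed on data spanning the pose conditions, would be the "most robust to changes in out-of-plane pose" in the sense the abstract describes.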
Abstract:
Recent studies have suggested that the control of hand movements in catching involves continuous vision-based adjustments. More insight into these adjustments may be gained by examining the effects of occluding different parts of the ball trajectory. Here, we examined the effects of such occlusion on lateral hand movements when catching balls approaching from different directions, with the occlusion conditions presented in blocks or in randomized order. The analyses showed that late occlusion only had an effect during the blocked presentation, and early occlusion only during the randomized presentation. During the randomized presentation, movement biases were more leftward if the preceding trial was an early occlusion trial. The effect of early occlusion during the randomized presentation suggests that the observed leftward movement bias relates to the rightward visual acceleration inherent to the ball trajectories used, while its absence during the blocked presentation seems to reflect trial-by-trial adaptations in the visuomotor gain, reminiscent of dynamic gain control in the smooth pursuit system. The movement biases during the late occlusion block were interpreted in terms of an incomplete motion extrapolation, a reduction of the velocity gain, caused by the fact that participants never saw the to-be-extrapolated part of the ball trajectory. These results underscore that continuous movement adjustments for catching depend not only on visual information, but also on visuomotor adaptations based on non-visual information.
Abstract:
The Routledge Guide to Interviewing sets out a well-tested and practical approach and methodology: what works, difficulties and dangers to avoid and key questions which must be answered before you set out. Background methodological issues and arguments are considered and drawn upon but the focus is on what is ethical, legally acceptable and productive:
-Rationale (why, what for, where, how)
-Ethics and Legalities (informed consent, data protection, risks, embargoes)
-Resources (organisational, technical, intellectual)
-Preparation (selecting and approaching interviewees, background and biographical research, establishing credentials, identifying topics)
-Technique (developing expertise and confidence)
-Audio-visual interviews
-Analysis (modes, methods, difficulties)
-Storage (archiving and long-term preservation)
-Sharing Resources (dissemination and development)
From death row to the mansion of a head of state, small kitchens and front parlours, to legislatures and presbyteries, Anna Bryson and Seán McConville’s wide interviewing experience has been condensed into this book. The material set out here has been acquired by trial, error and reflection over a period of more than four decades. The interviewees have ranged from the delightfully straightforward to the painfully difficult to the near impossible – with a sprinkling of those that were impossible.
Successful interviewing draws on the survival skills of everyday life. This guide will help you to adapt, develop and apply these innate skills. Including a range of useful information such as sample waivers, internet resources, useful hints and checklists, it provides sound and plain-speaking support for the oral historian, social scientist and investigator.
Abstract:
Previous research has shown that Parkinson's disease (PD) patients can increase the speed of their movement when catching a moving ball compared to when reaching for a static ball (Majsak et al., 1998). A recent model proposed by Redgrave et al. (2010) explains this phenomenon with regard to the dichotomous organization of motor loops in the basal ganglia circuitry and the role of sensory micro-circuitries in the control of goal-directed actions. According to this model, external visual information that is relevant to the required movement can induce a switch from habitual control of movement toward an externally paced, goal-directed form of guidance, resulting in augmented motor performance (Bienkiewicz et al., 2013). In the current study, we investigated whether continuous acoustic information generated by an object in motion can enhance motor performance in an arm-reaching task in a similar way to that observed in the studies of Majsak et al. (1998, 2008). In addition, we explored whether the kinematic aspects of the movement are regulated in accordance with time-to-arrival information generated by the ball's motion as it reaches the catching zone. A group of 7 idiopathic PD patients (6 male, 1 female) performed a ball-catching task in which the acceleration (and hence ball velocity) was manipulated by adjusting the angle of the ramp. The type of sensory information (visual and/or auditory) specifying the ball's arrival at the catching zone was also manipulated. Our results showed that patients with PD demonstrate improved motor performance when reaching for a ball in motion compared to when it is stationary. We observed that PD patients can adjust their movement kinematics in accordance with the speed of a moving target, even if vision of the target is occluded and patients have to rely solely on auditory information. We demonstrate that the availability of dynamic temporal information is crucial for eliciting motor improvements in PD. Furthermore, these effects appear independent of the sensory modality through which the information is conveyed.