19 results for Audio-visual materials


Relevance:

80.00%

Publisher:

Abstract:

Existing referencing systems frequently prove inadequate for citing moving image and sound media such as vidcasts, streaming television, sound files, uncatalogued archive footage, amateur content hosted online or non-broadcast radio recordings. In 2009 and 2010 a British working group funded by the Higher Education Funding Council for England (HEFCE) and co-ordinated by the British Universities Film and Video Council investigated this problem. This report documents the early stages of the project.

Relevance:

80.00%

Publisher:

Abstract:

This paper presents a novel method of audio-visual fusion for person identification where both the speech and facial modalities may be corrupted, and there is a lack of prior knowledge about the corruption. Furthermore, we assume there is a limited amount of training data for each modality (e.g., a short training speech segment and a single training facial image for each person). A new representation and a modified cosine similarity are introduced for combining and comparing bimodal features with limited training data as well as vastly differing data rates and feature sizes. Optimal feature selection and multicondition training are used to reduce the mismatch between training and testing, thereby making the system robust to unknown bimodal corruption. Experiments have been carried out on a bimodal data set created from the SPIDRE and AR databases with variable noise corruption of speech and occlusion in the face images. The new method has demonstrated improved recognition accuracy.
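For orientation, the comparison step can be illustrated with the standard cosine similarity between two feature vectors. This is only a minimal sketch: the abstract does not say how the authors modify the measure, so the function below is the textbook form, and the name `cosine_similarity` and the zero-vector convention are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Standard (unmodified) cosine similarity between two equal-length vectors.

    Assumption: this is the textbook measure, not the modified version
    introduced in the paper, whose details the abstract does not give.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen here: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)
```

Because the score depends only on the angle between the vectors, it is insensitive to overall feature magnitude, which is one reason cosine-style measures are commonly used for comparing feature vectors.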

Relevance:

80.00%

Publisher:

Abstract:

Through the concept of sonic resonance, the project Cidade Museu – Museum City explores five derelict or transitional spaces in the city of Viseu. The activation and capture of these spaces develop an audio-visual memory that reflects architectures, stories and experiences, while creating a sense of place through sounds and images.

The project brings together musicians with a background in contemporary music, electroacoustic music and improvisation and a visual artist focusing on photography and video.

Each member of the collective explores the selected spaces, activating them with their respective instruments and through sound projection. In this iterative process the source of activation gradually gives way to the characteristics of each space, its resonances and acoustics. In this performance, the museum city (a nickname for Viseu) exposes the contrast between the grandeur and multi-faceted architecture of Viseu’s Cathedral and the spaces scattered throughout the city, waiting for a new future.

The performance in the Cathedral (Sé) features a trio ensemble, an eight-channel sound system and video projection, presenting audio recordings and images made in each of the five spaces. The audience is invited to explore the relations between the various buildings and their stories while being immersed in their resonances and visual projections.

The performance explores the following spaces in Viseu: the old Orfeão (music hall), an old wine cellar, a mansion home to the national road services, a house with its grounds in Rua Silva Gaio and an old slaughterhouse.

Relevance:

30.00%

Publisher:

Abstract:

Human listeners seem to be remarkably able to recognise acoustic sound sources based on timbre cues. Here we describe a psychophysical paradigm to estimate the time it takes to recognise a set of complex sounds differing only in timbre cues: both in terms of the minimum duration of the sounds and the inferred neural processing time. Listeners had to respond to the human voice while ignoring a set of distractors. All sounds were recorded from natural sources over the same pitch range and equalised to the same duration and power. In a first experiment, stimuli were gated in time with a raised-cosine window of variable duration and random onset time. A voice/non-voice (yes/no) task was used. Performance, as measured by d', remained above chance for the shortest sounds tested (2 ms); d's above 1 were observed for durations longer than or equal to 8 ms. Then, we constructed sequences of short sounds presented in rapid succession. Listeners were asked to report the presence of a single voice token that could occur at a random position within the sequence. This method is analogous to the "rapid sequential visual presentation" paradigm (RSVP), which has been used to evaluate neural processing time for images. For 500-ms sequences made of 32-ms and 16-ms sounds, d' remained above chance for presentation rates of up to 30 sounds per second. There was no effect of the pitch relation between successive sounds: identical for all sounds in the sequence or random for each sound. This implies that the task was not determined by streaming or forward masking, as both phenomena would predict better performance for the random pitch condition. Overall, the recognition of familiar sound categories such as the voice seems to be surprisingly fast, both in terms of the acoustic duration required and of the underlying neural time constants.
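The sensitivity index d' reported above is conventionally computed from a yes/no task as the difference between the z-transformed hit rate and false-alarm rate. The sketch below shows that standard calculation. The function name `d_prime` and the log-linear correction (adding 0.5 to each count) are assumptions for illustration; the abstract does not state which correction, if any, the authors applied.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (add 0.5 to each count) keeps
    both rates strictly between 0 and 1 so the z-transform is finite.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(fa_rate)
```

With equal hit and false-alarm rates the function returns 0, which corresponds to chance performance; larger positive values indicate that voice targets are discriminated from distractors more reliably.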