3 results for Multimedia browsing
in Digital Peer Publishing
Abstract:
Audio-visual documents obtained from German TV news are classified according to the IPTC topic categorization scheme. To this end, standard text classification techniques are adapted to speech, video, and non-speech audio. For each of the three modalities, word analogues are generated: sequences of syllables for speech, “video words” based on low-level color features (color moments, color correlogram, and color wavelet), and “audio words” based on low-level spectral features (spectral envelope and spectral flatness) for non-speech audio. Such audio and video words provide a means to represent the different modalities in a uniform way. Audio-visual documents are represented by the frequencies of their word analogues, following the standard bag-of-words approach. Support vector machines are used for supervised classification in a 1 vs. n setting. Classification based on speech outperforms all other single modalities. Combining speech with non-speech audio improves classification. Classification is further improved by supplementing speech and non-speech audio with video words. Optimal F-scores range between 62% and 94%, corresponding to 50% to 84% above chance. The optimal combination of modalities depends on the category to be recognized. The construction of audio and video words from low-level features provides a good basis for the integration of speech, non-speech audio, and video.
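The representation described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the modality prefixes, and the use of relative frequencies are assumptions made for illustration, and the SVM training step itself is omitted. The sketch shows how word analogues from several modalities could be merged into one bag-of-words vector, how multi-class labels could be recast for a 1 vs. n setting, and how an F-score is computed from binary predictions.

```python
from collections import Counter


def bag_of_words(tokens, modality):
    """Count word analogues, prefixing each with its modality
    (hypothetical convention) so vocabularies do not collide."""
    return Counter(f"{modality}:{token}" for token in tokens)


def combine(*bags):
    """Merge the bags of several modalities into one document bag."""
    total = Counter()
    for bag in bags:
        total.update(bag)
    return total


def to_vector(bag, vocabulary):
    """Relative frequency of each vocabulary entry (assumed weighting)."""
    n = sum(bag.values())
    return [bag[word] / n if n else 0.0 for word in vocabulary]


def one_vs_rest_labels(labels, target):
    """Recast multi-class labels as +1 (target category) or -1 (all others)."""
    return [1 if label == target else -1 for label in labels]


def f_score(true, predicted):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(true, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy document: three syllables from speech and one video word.
doc = combine(
    bag_of_words(["ta", "ges", "ta"], "speech"),
    bag_of_words(["v12"], "video"),
)
vocabulary = sorted(doc)
vector = to_vector(doc, vocabulary)
```

Vectors of this kind, one per document, would then be passed to one binary classifier per category, each trained on the labels produced by `one_vs_rest_labels`.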
Abstract:
This contribution looks back at the beginnings of the manipulable digital image in the context of art. Using a case study, the St. John Altarpiece by Rogier van der Weyden, it engages in practice with the technologically determined limits of image analysis. This leads to a critical assessment of prefabricated knowledge transfer and of the experience of visibility, not only in the field of art history. The thesis, put in very general terms, is therefore: traditional imagery no longer lives out its existence solely in the museum space, but has already been transferred into a technologically determined one.
Abstract:
This article proposes a new focus of research for multimedia conferencing systems: allowing a participant to flexibly select another participant or a group for media transmission. For example, in a traditional conference system, participants' voices might by default be shared with all others, but a participant might want to select a subset of the conference members to send media to or receive media from. We review the concept of narrowcasting, a model for limiting such information streams in a multimedia conference, and describe a design that uses existing standard protocols (SIP and SDP) to control fine-grained narrowcasting sessions.
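The abstract above does not spell out the design, so the following is only a sketch of one way per-participant selection could be expressed. It assumes a separate media stream is negotiated for each peer, which is an illustrative simplification and not necessarily the article's approach. The direction attributes `a=sendrecv`, `a=sendonly`, `a=recvonly`, and `a=inactive` are standard SDP. The function names and the participant names are hypothetical.

```python
def sdp_direction(send, receive):
    """Map a send/receive choice to the standard SDP direction attribute."""
    if send and receive:
        return "a=sendrecv"
    if send:
        return "a=sendonly"
    if receive:
        return "a=recvonly"
    return "a=inactive"


def narrowcast_directions(members, send_to, receive_from):
    """For each conference member, pick the direction attribute of the
    (assumed) per-peer media stream from the local participant's view."""
    return {
        member: sdp_direction(member in send_to, member in receive_from)
        for member in members
    }


# The local participant sends to bob and carol but listens only to bob.
directions = narrowcast_directions(
    members=["bob", "carol", "dave"],
    send_to={"bob", "carol"},
    receive_from={"bob"},
)
```

In this toy session, the stream toward `dave` is marked inactive, which corresponds to excluding that member from both sending and receiving.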