903 results for Audio-visual content classification
Abstract:
Audio-visual documents obtained from German TV news are classified according to the IPTC topic categorization scheme. To this end, standard text classification techniques are adapted to speech, video, and non-speech audio. For each of the three modalities, word analogues are generated: sequences of syllables for speech, “video words” based on low-level color features (color moments, color correlogram, and color wavelet), and “audio words” based on low-level spectral features (spectral envelope and spectral flatness) for non-speech audio. Such audio and video words provide a means to represent the different modalities in a uniform way. Audio-visual documents are represented by the frequencies of these word analogues, following the standard bag-of-words approach. Support vector machines are used for supervised classification in a 1 vs. n setting. Classification based on speech outperforms all other single modalities. Combining speech with non-speech audio improves classification. Classification is further improved by supplementing speech and non-speech audio with video words. Optimal F-scores range between 62% and 94%, corresponding to 50% - 84% above chance. The optimal combination of modalities depends on the category to be recognized. The construction of audio and video words from low-level features provides a good basis for the integration of speech, non-speech audio, and video.
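As an illustration of the bag-of-words representation described in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes that "words" are obtained by assigning each low-level feature vector to its nearest entry in a codebook; the abstract does not state how the word analogues are constructed, so the quantization step, the codebook values, and the function names `quantize` and `bag_of_words` are all hypothetical.

```python
from collections import Counter
from math import dist


def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry.

    Nearest-centroid assignment is an assumption made for this sketch.
    """
    return [
        min(range(len(codebook)), key=lambda k: dist(f, codebook[k]))
        for f in features
    ]


def bag_of_words(word_ids, vocab_size):
    """Return the relative frequency of each word index in a document."""
    counts = Counter(word_ids)
    total = len(word_ids) or 1  # avoid division by zero for empty documents
    return [counts[k] / total for k in range(vocab_size)]


# Hypothetical 2-D codebook and per-frame features for one document.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
frames = [(0.1, 0.0), (0.9, 1.1), (0.0, 0.2), (0.1, 0.9)]

words = quantize(frames, codebook)
histogram = bag_of_words(words, len(codebook))
```

A frequency vector of this kind could then serve as the input to a one-vs-rest classifier such as the support vector machines mentioned in the abstract.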
Abstract:
In this paper, we introduce a novel high-level visual content descriptor devised for semantic-based image classification and retrieval. The work can be treated as an attempt to bridge the so-called “semantic gap”. The proposed image feature vector model is fundamentally underpinned by an image labelling framework, called Collaterally Confirmed Labelling (CCL), which combines the collateral knowledge extracted from the collateral texts of the images with state-of-the-art low-level image processing and visual feature extraction techniques to automatically assign linguistic keywords to image regions. Two different high-level image feature vector models are developed based on the CCL labelling results, for the purposes of image data clustering and retrieval respectively. A subset of the Corel image collection has been used for evaluating our proposed method. The experimental results to date already indicate that our proposed semantic-based visual content descriptors outperform both traditional visual and textual image feature models.
Abstract:
A novel framework referred to as collaterally confirmed labelling (CCL) is proposed, aiming at localising the visual semantics to regions of interest in images with textual keywords. Both the primary image and collateral textual modalities are exploited in a mutually co-referencing and complementary fashion. The collateral content and context-based knowledge is used to bias the mapping from the low-level region-based visual primitives to the high-level visual concepts defined in a visual vocabulary. We introduce the notion of collateral context, which is represented as a co-occurrence matrix of the visual keywords. A collaborative mapping scheme is devised using statistical methods such as Gaussian distribution or Euclidean distance together with a collateral content and context-driven inference mechanism. We introduce a novel high-level visual content descriptor that is devised for performing semantic-based image classification and retrieval. The proposed image feature vector model is fundamentally underpinned by the CCL framework. Two different high-level image feature vector models are developed based on the CCL labelling results, for the purposes of image data clustering and retrieval, respectively. A subset of the Corel image collection has been used for evaluating our proposed method. The experimental results to date already indicate that the proposed semantic-based visual content descriptors outperform both traditional visual and textual image feature models. (C) 2007 Elsevier B.V. All rights reserved.
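The co-occurrence matrix of visual keywords mentioned in this abstract can be illustrated with a small sketch. This is one plausible reading only: it assumes the matrix counts how often two keywords label the same image, which the abstract does not specify. The vocabulary, the example labels, and the function name `cooccurrence` are hypothetical.

```python
from itertools import combinations


def cooccurrence(labelled_images, vocabulary):
    """Count, for each keyword pair, the images in which both keywords occur.

    Counting per image (rather than per region) is an assumption of this sketch.
    """
    index = {word: i for i, word in enumerate(vocabulary)}
    n = len(vocabulary)
    matrix = [[0] * n for _ in range(n)]
    for keywords in labelled_images:
        # Deduplicate keywords within an image and ignore out-of-vocabulary ones.
        present = sorted({index[w] for w in keywords if w in index})
        for i, j in combinations(present, 2):
            matrix[i][j] += 1
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix


# Hypothetical visual vocabulary and keyword sets for three images.
vocabulary = ["sky", "grass", "tiger"]
images = [["sky", "grass"], ["grass", "tiger"], ["sky", "grass", "tiger"]]

matrix = cooccurrence(images, vocabulary)
```

Under these assumptions, a large entry for a keyword pair suggests that observing one keyword in an image makes the other more plausible, which is the kind of contextual bias the abstract describes.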
Abstract:
In this paper, we introduce a novel high-level visual content descriptor devised for semantic-based image classification and retrieval. The work can be treated as an attempt to bridge the so-called "semantic gap". The proposed image feature vector model is fundamentally underpinned by an automatic image labelling framework, called Collaterally Cued Labelling (CCL), which combines the collateral knowledge extracted from the collateral texts accompanying the images with state-of-the-art low-level visual feature extraction techniques to automatically assign textual keywords to image regions. A subset of the Corel image collection was used for evaluating the proposed method. The experimental results indicate that our semantic-level visual content descriptors outperform both conventional visual and textual image feature models.
Abstract:
Synesthesia entails a special kind of sensory perception, where stimulation in one sensory modality leads to an internally generated perceptual experience of another, non-stimulated sensory modality. This phenomenon can be viewed as an abnormal multisensory integration process, as here the synesthetic percept is aberrantly fused with the stimulated modality. Indeed, recent synesthesia research has focused on multimodal processing even outside of the specific synesthesia-inducing context and has revealed changed multimodal integration, thus suggesting perceptual alterations at a global level. Here, we focused on audio-visual processing in synesthesia using a semantic classification task in combination with visually or auditory-visually presented animate and inanimate objects in an audio-visual congruent and incongruent manner. Fourteen subjects with auditory-visual and/or grapheme-color synesthesia and 14 control subjects participated in the experiment. During presentation of the stimuli, event-related potentials were recorded from 32 electrodes. The analysis of reaction times and error rates revealed no group differences, with best performance for audio-visually congruent stimulation, indicating the well-known multimodal facilitation effect. We found enhanced amplitude of the N1 component over occipital electrode sites for synesthetes compared to controls. The differences occurred irrespective of the experimental condition and therefore suggest a global influence on early sensory processing in synesthetes.
Abstract:
This book is the product of years of cooperation among research teams from five different countries, all of them Key Institutions of the Childwatch International network, within the framework of a multinational project on adolescents and media
Abstract:
Not published.
Abstract:
The video was produced by lecturers from the areas of Didactics and School Organization and Theory and History of Education during 2000 and 2001. It gathers the opinions of professionals, parents, people with physical disabilities (deaf, blind, and P.C.I., i.e. infantile cerebral palsy), and people with mental disabilities on different aspects of daily life: the home, job placement, etc. This teaching resource is designed for viewing, theoretical and practical interpretation, and comparison of opinions in the higher-education classroom, as part of the training of professionals who will work with people with disabilities.
Abstract:
The role of Language in the Bachillerato is threefold. First, it is a factor of socio-economic advancement, in some cases enabling better salaries and in others giving access to positions closed to those who do not know languages. Second, UNESCO recommends its study for its educational function with respect to the human being as a member of different national groups, enriching critical sense and tolerance through appreciation of the differences and similarities among peoples. This is a humanist culture that the study of French should foster, all the more for us given that France is a neighbouring country and offers the route to Europe; it is logical that French be so important to us, given the commercial, economic, and other relations conducted in that language. Third, and most fundamentally, learning at least one language is essential to the formation of personality. Since 1975 there have been important advances in language study, especially efforts at didactic renewal, notably the contributions of the structuro-global audio-visual methodology, which emerged in the 1950s and is being constantly renewed. If students are to learn French at a distance, they need suitable material on cassettes with dialogues in order to learn correct pronunciation. Reading and writing come afterwards, since correct pronunciation is then assumed, and transcribing the spoken language is an exercise that consolidates what has been learned. Learning a language, however, requires devoting a set amount of time to it every day; this regularity is what makes learning possible. Thus, in each case students should act according to the more precise and personal guidance of their teacher-tutor and their own work habits, provided these prove effective.
Abstract:
This dissertation examines auditory perception and audio-visual reception in noise for both hearing-impaired and normal-hearing persons, with the goal of determining some of the noise conditions under which amplified acoustic cues for speech can be beneficial to hearing-impaired persons.