945 results for Audio-visual speaker recognition
Abstract:
The HMAX model has recently been proposed by Riesenhuber & Poggio as a hierarchical model of position- and size-invariant object recognition in visual cortex. It has also turned out to model successfully a number of other properties of the ventral visual stream (the visual pathway thought to be crucial for object recognition in cortex), and particularly of (view-tuned) neurons in macaque inferotemporal cortex, the brain area at the top of the ventral stream. The original modeling study only used "paperclip" stimuli, as in the corresponding physiology experiment, and did not explore systematically how model units' invariance properties depended on model parameters. In this study, we aimed at a deeper understanding of the inner workings of HMAX and its performance for various parameter settings and "natural" stimulus classes. We systematically examined HMAX responses for different stimulus sizes and positions and found a dependence of model units' responses on stimulus position, for which a quantitative description is offered. Interestingly, we found that scale invariance properties of hierarchical neural models are not independent of stimulus class, as opposed to translation invariance, even though both are affine transformations within the image plane.
Abstract:
Numerous psychophysical experiments have shown an important role for attentional modulations in vision. Behaviorally, allocation of attention can improve performance in object detection and recognition tasks. At the neural level, attention increases firing rates of neurons in visual cortex whose preferred stimulus is currently attended to. However, it is not yet known how these two phenomena are linked, i.e., how the visual system could be "tuned" in a task-dependent fashion to improve task performance. To answer this question, we performed simulations with the HMAX model of object recognition in cortex [45]. We modulated firing rates of model neurons in accordance with experimental results about effects of feature-based attention on single neurons and measured changes in the model's performance in a variety of object recognition tasks. It turned out that recognition performance could only be improved under very limited circumstances and that attentional influences on the process of object recognition per se tend to display a lack of specificity or raise false alarm rates. These observations lead us to postulate a new role for the observed attention-related neural response modulations.
Abstract:
This thesis presents three important results in visual object recognition based on shape. (1) A new algorithm (RAST; Recognition by Adaptive Subdivisions of Transformation space) is presented that has lower average-case complexity than any known recognition algorithm. (2) It is shown, both theoretically and empirically, that representing 3D objects as collections of 2D views (the "View-Based Approximation") is feasible and affects the reliability of 3D recognition systems no more than other commonly made approximations. (3) The problem of recognition in cluttered scenes is considered from a Bayesian perspective; the commonly used "bounded-error error measure" is demonstrated to correspond to an independence assumption. It is shown that by better modeling the statistical properties of real scenes, objects can be recognized more reliably.
Abstract:
This book is the product of years of cooperation among research teams from five different countries, all of them Key Institutions of the Childwatch International network, within the framework of a multinational project on adolescents and media
Abstract:
Abstract taken from the journal.
Abstract:
Unpublished.
Abstract:
The video was produced by faculty from the areas of Didactics and School Organization and Theory and History of Education during 2000 and 2001. It gathers the opinions of professionals, fathers and mothers, people with physical disabilities (deaf, blind, and infantile cerebral palsy, P.C.I.), and people with mental disabilities on different aspects of daily life: home, job placement, etc. This teaching resource is designed for viewing, theoretical and practical interpretation, and comparison of opinions in the higher-education classroom, to address the training of professionals who will work with people with disabilities.
Abstract:
Abstract in Spanish. Abstract based on that of the publication.
Abstract:
The function of language study in the Bachillerato is threefold. First, it is a factor in socio-economic advancement, in some cases bringing salary improvements and in others opening positions closed to those who do not know languages. Second, UNESCO recommends language study for its educational function with respect to the human being as a member of different national groups: it enriches critical sense and tolerance through appreciation of the differences and similarities among peoples, as part of a humanist culture that the study of French should foster. This matters all the more for us because France is a neighbouring country and offers a route into Europe, so French is important given the commercial, economic, and other relations conducted in that language. Third, and most fundamentally, learning at least one language is essential to the formation of personality. Since 1975 there have been important advances in language study, above all efforts at didactic renewal, notably the contributions of the structuro-global audio-visual methodology, which emerged in the 1950s and is constantly being renewed. A student who must learn French at a distance needs suitable material on cassettes with dialogues in order to learn correct pronunciation. Reading and writing are learned afterwards, since correct pronunciation is by then assumed and transcribing the spoken language is an exercise for consolidating knowledge. Learning a language requires setting aside a specific time every day; this regularity is what makes learning possible. Thus, in each case the student should act in accordance with the more precise and personal guidance of the teacher-tutor and with the student's own working habits, provided these prove effective.
Abstract:
This dissertation examines auditory perception and audio-visual reception in noise for both hearing-impaired and normal hearing persons, with a goal of determining some of the noise conditions under which amplified acoustic cues for speech can be beneficial to hearing-impaired persons.
Abstract:
This workshop paper reports recent developments to a vision system for traffic interpretation which relies extensively on the use of geometrical and scene context. Firstly, a new approach to pose refinement is reported, based on forces derived from prominent image derivatives found close to an initial hypothesis. Secondly, a parameterised vehicle model is reported, able to represent different vehicle classes. This general vehicle model has been fitted to sample data, and subjected to a Principal Component Analysis to create a deformable model of common car types having 6 parameters. We show that the new pose recovery technique is also able to operate on the PCA model, to allow the structure of an initial vehicle hypothesis to be adapted to fit the prevailing context. We report initial experiments with the model, which demonstrate significant improvements to pose recovery.
Abstract:
The encoding of goal-oriented motion events varies across different languages. Speakers of languages without grammatical aspect (e.g., Swedish) tend to mention motion endpoints when describing events, e.g., "two nuns walk to a house", and attach importance to event endpoints when matching scenes from memory. Speakers of aspect languages (e.g., English), on the other hand, are more prone to direct attention to the ongoingness of motion events, which is reflected both in their event descriptions, e.g., "two nuns are walking", and in their non-verbal similarity judgements. This study examines to what extent native speakers of Swedish (n = 82) with English as a foreign language (FL) restructure their categorisation of goal-oriented motion as a function of their English proficiency and experience with the English language (e.g., exposure, learning). Seventeen monolingual native English speakers from the United Kingdom (UK) were engaged for comparison purposes. Data on motion event cognition were collected through a memory-based triads matching task, in which a target scene with an intermediate degree of endpoint orientation was matched with two alternative scenes with low and high degrees of endpoint orientation, respectively. Results showed that the preference among the Swedish speakers of L2 English to base their similarity judgements on ongoingness rather than event endpoints was correlated with their use of English in their everyday lives, such that those who often watched television in English approximated the ongoingness preference of the English native speakers. These findings suggest that event cognition patterns may be restructured through exposure to FL audio-visual media. The results thus add to the emerging picture that learning a new language entails learning new ways of observing and reasoning about reality.
Abstract:
Synesthesia entails a special kind of sensory perception, where stimulation in one sensory modality leads to an internally generated perceptual experience in another, non-stimulated sensory modality. This phenomenon can be viewed as an abnormal multisensory integration process, as here the synesthetic percept is aberrantly fused with the stimulated modality. Indeed, recent synesthesia research has focused on multimodal processing even outside of the specific synesthesia-inducing context and has revealed changed multimodal integration, thus suggesting perceptual alterations at a global level. Here, we focused on audio-visual processing in synesthesia using a semantic classification task in combination with visually or auditory-visually presented animate and inanimate objects in an audio-visual congruent and incongruent manner. Fourteen subjects with auditory-visual and/or grapheme-color synesthesia and 14 control subjects participated in the experiment. During presentation of the stimuli, event-related potentials were recorded from 32 electrodes. The analysis of reaction times and error rates revealed no group differences, with best performance for audio-visually congruent stimulation, indicating the well-known multimodal facilitation effect. We found enhanced amplitude of the N1 component over occipital electrode sites for synesthetes compared to controls. The differences occurred irrespective of the experimental condition and therefore suggest a global influence on early sensory processing in synesthetes.