838 results for Multimodal perception
Abstract:
The Virtual Worlds Generator is a grammatical model proposed to define virtual worlds. It integrates the diversity of sensors and interaction devices, multimodality, and a virtual simulation system. Its grammar allows the scenes of the virtual world to be defined and abstracted as symbol strings, independently of the hardware used to represent the world or to interact with it. A case study is presented to explain how the proposed model can be used to formalize a robot navigation system with multimodal perception and a hybrid control scheme for the robot. The result is an instance of the model's grammar that implements the robotic system and is independent of the sensing devices used for perception and interaction. In conclusion, the Virtual Worlds Generator adds value to the simulation of virtual worlds, since the definition can be done formally and independently of the peculiarities of the supporting devices.
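The core idea, scenes abstracted as symbol strings produced by a grammar, can be sketched roughly as follows. This is a minimal illustration only: the non-terminals, terminal symbols, and the robot-navigation scene are hypothetical stand-ins, not the grammar actually defined in the paper.

```python
# Hypothetical sketch: a tiny grammar that abstracts a robot-navigation scene
# as a symbol string, independent of any rendering or interaction hardware.
# All symbols and production rules below are illustrative assumptions.

RULES = {
    "SCENE":    [["ROBOT", "OBSTACLE", "GOAL"]],    # a scene = robot + obstacles + goal
    "ROBOT":    [["r"]],                            # terminal: robot marker
    "OBSTACLE": [["o"], ["o", "OBSTACLE"]],         # one or more obstacles
    "GOAL":     [["g"]],                            # terminal: goal marker
}

def expand(symbol, choice=0):
    """Expand a non-terminal into a flat list of terminal symbols (leftmost derivation)."""
    if symbol not in RULES:              # terminals expand to themselves
        return [symbol]
    out = []
    for part in RULES[symbol][choice]:
        out.extend(expand(part))
    return out

print("".join(expand("SCENE")))  # -> rog
```

The device-independence claim corresponds to the fact that the string "rog" says nothing about which sensors produced it or which display renders it; only the interpretation layer is hardware-specific.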
Abstract:
This study explored the critical features of temporal synchrony for the facilitation of prenatal perceptual learning with respect to unimodal stimulation using an animal model, the bobwhite quail. The following related hypotheses were examined: (1) the availability of temporal synchrony is a critical feature to facilitate prenatal perceptual learning, (2) a single temporally synchronous note is sufficient to facilitate prenatal perceptual learning with respect to unimodal stimulation, and (3) in situations where embryos are exposed to a single temporally synchronous note, facilitated perceptual learning with respect to unimodal stimulation will be optimal when the temporally synchronous note occurs at the onset of the stimulation bout. To assess these hypotheses, two experiments were conducted in which quail embryos were exposed to various audio-visual configurations of a bobwhite maternal call and tested 24 hr after hatching for evidence of facilitated prenatal perceptual learning with respect to unimodal stimulation. Experiment 1 explored whether intermodal equivalence was sufficient to facilitate prenatal perceptual learning with respect to unimodal stimulation. A Bimodal Sequential Temporal Equivalence (BSTE) condition was created that provided embryos with sequential auditory and visual stimulation in which the same amodal properties (rate, duration, rhythm) were made available across modalities. Experiment 2 assessed: (a) whether a limited number of temporally synchronous notes is sufficient for facilitated prenatal perceptual learning with respect to unimodal stimulation, and (b) whether there is a relationship between the timing of occurrence of a temporally synchronous note and the facilitation of prenatal perceptual learning. Results revealed that prenatal exposure to BSTE was not sufficient to facilitate perceptual learning.
In contrast, a maternal call that contained a single temporally synchronous note was sufficient to facilitate embryos’ prenatal perceptual learning with respect to unimodal stimulation. Furthermore, the most salient prenatal condition was that which contained the synchronous note at the onset of the call burst. Embryos’ prenatal perceptual learning of the call was four times faster in this condition than when exposed to a unimodal call. Taken together, bobwhite quail embryos’ remarkable sensitivity to temporal synchrony suggests that this amodal property plays a key role in attention and learning during prenatal development.
Abstract:
Motor learning is based on motor perception and emergent perceptual-motor representations. Much behavioral research has addressed single perceptual modalities, but over the last two decades the contribution of multimodal perception to motor behavior has received growing attention. A growing number of studies indicates an enhanced impact of multimodal stimuli on motor perception, motor control, and motor learning, in terms of better precision and higher reliability of the related actions. Behavioral research is supported by neurophysiological data revealing that multisensory integration supports motor control and learning. Yet the overwhelming part of both research lines is dedicated to basic research. Apart from research in the domains of music, dance, and motor rehabilitation, there is almost no evidence for enhanced effectiveness of multisensory information on the learning of gross motor skills. To reduce this gap, movement sonification is used here in applied research on motor learning in sports. Based on current knowledge of the multimodal organization of the perceptual system, we generate additional real-time movement information suitable for integration with perceptual feedback streams of the visual and proprioceptive modalities. With ongoing training, synchronously processed auditory information should initially be integrated into the emerging internal models, enhancing the efficacy of motor learning. This is achieved by a direct mapping of kinematic and dynamic motion parameters to electronic sounds, resulting in continuous auditory and convergent audiovisual or audio-proprioceptive stimulus arrays. In sharp contrast to other approaches that use acoustic information as error feedback in motor learning settings, we try to generate additional movement information suitable for accelerating and enhancing adequate sensorimotor representations and processable below the level of consciousness.
In the experimental setting, participants were asked to learn a closed motor skill (technique acquisition in indoor rowing). One group was trained with visual information and two groups with audiovisual information (sonification vs. natural sounds). Learning became evident and remained stable in all three groups. Participants trained with additional movement sonification showed better performance than both other groups. Results indicate that movement sonification enhances motor learning of a complex gross motor skill, even exceeding the usually expected acoustic rhythmic effects on motor learning.
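The "direct mapping of kinematic and dynamic motion parameters to electronic sounds" described above is a form of parameter-mapping sonification. A minimal sketch of such a mapping, assuming a linear velocity-to-pitch rule with illustrative ranges (the function name, ranges, and linearity are assumptions, not the study's actual design):

```python
# Hedged sketch of parameter-mapping sonification: one kinematic parameter
# (e.g., handle velocity on an indoor rower) is mapped to a pitch in real time.
# The velocity and frequency ranges here are illustrative assumptions.

def velocity_to_pitch(v, v_min=0.0, v_max=3.0, f_min=220.0, f_max=880.0):
    """Linearly map a velocity in [v_min, v_max] m/s to a frequency in Hz."""
    v = max(v_min, min(v, v_max))            # clamp to the expected range
    frac = (v - v_min) / (v_max - v_min)     # normalized position in the range
    return f_min + frac * (f_max - f_min)

# Each sample of the movement stream yields one frequency for the synthesizer,
# producing a continuous auditory stream synchronous with the movement.
stream = [0.0, 1.5, 3.0]
print([velocity_to_pitch(v) for v in stream])  # -> [220.0, 550.0, 880.0]
```

Because the sound is driven directly by the movement rather than by its error, the auditory stream stays temporally congruent with visual and proprioceptive feedback, which is the precondition for the multisensory integration the abstract describes.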
Abstract:
Motion is an important aspect of face perception that has been largely neglected to date. Many of the established findings are based on studies that use static facial images, which do not reflect the unique temporal dynamics available from seeing a moving face. In the present thesis, a set of naturalistic dynamic facial emotional expressions was purposely created and used to investigate the neural structures involved in the perception of dynamic facial expressions of emotion, with both functional Magnetic Resonance Imaging (fMRI) and Magnetoencephalography (MEG). Through fMRI and connectivity analysis, a dynamic face perception network was identified, which is demonstrated to extend the distributed neural system for face perception (Haxby et al., 2000). Measures of effective connectivity between these regions revealed that dynamic facial stimuli were associated with specific increases in connectivity between early visual regions, such as inferior occipital gyri, and superior temporal sulci, along with coupling between superior temporal sulci and amygdalae, as well as with inferior frontal gyri. MEG and Synthetic Aperture Magnetometry (SAM) were used to examine the spatiotemporal profile of neurophysiological activity within this dynamic face perception network. SAM analysis revealed a number of regions in the distributed face network showing differential activation to dynamic versus static faces, characterised by decreases in cortical oscillatory power in the beta band, which were spatially coincident with the regions previously identified with fMRI. These findings support the presence of a distributed network of cortical regions that mediates the perception of dynamic facial expressions, with the fMRI data providing information on the spatial co-ordinates, paralleled by the MEG data, which indicate the temporal dynamics within this network.
This integrated multimodal approach offers both excellent spatial and temporal resolution, thereby providing an opportunity to explore dynamic brain activity and connectivity during face processing.
Abstract:
Report for the scientific sojourn carried out at the University Medical Center, Switzerland, from 2010 to 2012. Abundant evidence suggests that negative emotional stimuli are prioritized in the perceptual systems, eliciting enhanced neural responses in early sensory regions as compared with neutral information. This facilitated detection is generally paralleled by larger neural responses in early sensory areas, relative to the processing of neutral information. In this sense, the amygdala and other limbic regions, such as the orbitofrontal cortex, may play a critical role by sending modulatory projections onto the sensory cortices via direct or indirect feedback. The present project aimed at investigating two important issues regarding these mechanisms of emotional attention by means of functional magnetic resonance imaging. In Study I, we examined the modulatory effects of visual emotion signals on the processing of task-irrelevant visual, auditory, and somatosensory input, that is, the intramodal and crossmodal effects of emotional attention. We observed that brain responses to auditory and tactile stimulation were enhanced during the processing of visual emotional stimuli, as compared to neutral, in bilateral primary auditory and somatosensory cortices, respectively. However, brain responses to visual task-irrelevant stimulation were diminished in left primary and secondary visual cortices in the same conditions. The results also suggested the existence of a multimodal network associated with emotional attention, presumably involving mediofrontal, temporal, and orbitofrontal regions. Finally, Study II examined the different brain responses along the low-level visual pathways and limbic regions as a function of the number of retinal spikes during visual emotional processing. The experiment used stimuli resulting from an algorithm that simulates how the visual system perceives a visual input after a given number of retinal spikes.
The results validated the visual model in human subjects and suggested differential emotional responses in the amygdala and visual regions as a function of spike-levels. A list of publications resulting from work in the host laboratory is included in the report.
Abstract:
Résumé: Recent technical advances in non-invasive brain imaging have improved our understanding of the brain's various functional systems. Multimodal approaches have become indispensable in research for studying, in their entirety, the different characteristics of the neuronal activity that underlies brain function. In this combined functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) study, we exploited the potential of each technique, namely their high spatial and high temporal resolution, respectively. Cognitive, perceptual, and motor processes require the recruitment of neuronal assemblies. In the first part of this thesis, we study, by combining fMRI and EEG, the response of the visual areas during a stimulation that requires grouping coherent elements belonging to the two visual hemifields into a single image. We use a synchronization measure (EEG coherence) to quantify interhemispheric spatial integration, and the BOLD (Blood Oxygenation Level Dependent) response to assess the resulting brain activity. The increase in EEG coherence in the beta-gamma band measured at the occipital electrodes, and its linear correlation with the BOLD response in areas VP/V4, reflects a synchronized neuronal assembly that is presumably involved in visual spatial grouping. These results allowed us to extend the research to the impact that the frequency content of the stimuli has on synchronization. With the same approach, we then identified the networks that show different sensitivities to the integration of global or detailed characteristics of the images. In particular, the data show that the involvement of the ventral and dorsal visual networks is modulated by the frequency content of the stimuli.
In the second part, we tested the hypothesis that the increase in brain activity during the interhemispheric grouping process depends on the activity of the callosal axons that connect the visual areas. As the corpus callosum matures progressively over the first two decades of life, we analyzed the development of the spatial integration function in children aged 7 to 13 years and the role of the myelination of callosal fibers in the maturation of visual activity. We combined fMRI and MTI (Magnetization Transfer Imaging) to follow the signs of brain maturation in their functional and morphological (myelination) aspects, respectively. In children, the activations associated with the integration process between visual hemifields are, as in adults, localized in the ventral network, but limited to a more restricted zone. The strong correlation between the BOLD signal and the myelination of the splenial fibers indicates that the maturation of high-level visual functions depends on that of cortico-cortical connections. Abstract: Recent advances in non-invasive brain imaging allow the visualization of the different aspects of complex brain dynamics. Approaches based on a combination of imaging techniques facilitate the investigation and linking of multiple aspects of information processing. They are becoming a leading tool for understanding the neural basis of various brain functions. Perception, motion, and cognition involve the formation of cooperative neuronal assemblies distributed over the cerebral cortex. In this research, we explore the characteristics of interhemispheric assemblies in the visual brain by taking advantage of the complementary characteristics provided by EEG (electroencephalography) and fMRI (functional Magnetic Resonance Imaging) techniques.
These are the high temporal resolution of EEG and the high spatial resolution of fMRI. In the first part of this thesis we investigate the response of the visual areas to an interhemispheric perceptual grouping task. We use EEG coherence as a measure of synchronization and the BOLD (Blood Oxygenation Level Dependent) response as a measure of the related brain activation. The increase of the interhemispheric EEG coherence, restricted to the occipital electrodes and to the EEG beta band, and its linear relation to the BOLD responses in the VP/V4 area point to a trans-hemispheric synchronous neuronal assembly involved in early perceptual grouping. This result encouraged us to explore, with this multimodal approach, the formation of synchronous trans-hemispheric networks induced by stimuli of various spatial frequencies. We have found the involvement of ventral and medio-dorsal visual networks modulated by the spatial frequency content of the stimulus. Thus, based on the combination of EEG coherence and fMRI BOLD data, we have identified visual networks with different sensitivities to integrating low vs. high spatial frequencies. In the second part of this work we test the hypothesis that the increase of brain activity during perceptual grouping depends on the activity of callosal axons interconnecting the visual areas that are involved. To this end, in children of 7-13 years, we investigated functional (functional activation with fMRI) and morphological (myelination of the corpus callosum with Magnetization Transfer Imaging (MTI)) aspects of spatial integration. In children, the activation associated with spatial integration across visual fields was localized in the visual ventral stream and limited to a part of the area activated in adults. The strong correlation between individual BOLD responses in this area and the myelination of the splenial system of fibers points to myelination as a significant factor in the development of the spatial integration ability.
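The interhemispheric synchronization measure used in this work, EEG coherence within a frequency band, can be sketched with `scipy.signal.coherence`. The signals below are synthetic, and the channel names, sampling rate, and band limits are illustrative assumptions for demonstration only:

```python
# Illustrative sketch: magnitude-squared coherence between two occipital EEG
# channels in the beta band (13-30 Hz). A shared 20 Hz component plus
# independent noise stands in for real recordings.
import numpy as np
from scipy.signal import coherence

fs = 250.0                                   # sampling rate in Hz (assumed)
t = np.arange(0, 10, 1 / fs)                 # 10 s of data
rng = np.random.default_rng(0)

shared = np.sin(2 * np.pi * 20 * t)          # common 20 Hz (beta) oscillation
o1 = shared + 0.5 * rng.standard_normal(t.size)   # "O1" channel
o2 = shared + 0.5 * rng.standard_normal(t.size)   # "O2" channel

# Welch-based coherence estimate; values lie in [0, 1] per frequency bin.
f, cxy = coherence(o1, o2, fs=fs, nperseg=256)
beta = (f >= 13) & (f <= 30)
print(f"mean beta-band coherence: {cxy[beta].mean():.2f}")
```

High coherence at the shared frequency, despite independent noise in each channel, is the signature interpreted in the thesis as a synchronized trans-hemispheric neuronal assembly; relating such band-limited coherence values to BOLD responses is then a correlation across conditions or subjects.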
Abstract:
Accurate perception of taste information is crucial for animal survival. In adult Drosophila, gustatory receptor neurons (GRNs) perceive chemical stimuli of one specific gustatory modality associated with a stereotyped behavioural response, such as aversion or attraction. We show that GRNs of Drosophila larvae employ a surprisingly different mode of gustatory information coding. Using a novel method for calcium imaging in the larval gustatory system, we identify a multimodal GRN that responds to chemicals of different taste modalities with opposing valence, such as sweet sucrose and bitter denatonium, reliant on different sensory receptors. This multimodal neuron is essential for bitter compound avoidance, and its artificial activation is sufficient to mediate aversion. However, the neuron is also essential for the integration of taste blends. Our findings support a model for taste coding in larvae, in which distinct receptor proteins mediate different responses within the same, multimodal GRN.
Abstract:
The general goal of the present work was to study whether spatial perceptual asymmetry initially observed in linguistic dichotic listening studies is related to the linguistic nature of the stimuli and/or is modality-specific, as well as to investigate whether the spatial perceptual/attentional asymmetry changes as a function of age and sensory deficit via praxis. Several dichotic listening studies with linguistic stimuli have shown that the inherent perceptual right ear advantage (REA), which presumably results from the left lateralized linguistic functions (bottom-up processes), can be modified with executive functions (top-down control). Executive functions mature slowly during childhood, are well developed in adulthood, and decline as a function of ageing. In Study I, the purpose was to investigate with a cross-sectional experiment from a lifespan perspective the age-related changes in top-down control of REA for linguistic stimuli in dichotic listening with a forced-attention paradigm (DL). In Study II, the aim was to determine whether the REA is linguistic-stimulus-specific or not, and whether the lifespan changes in perceptual asymmetry observed in dichotic listening would exist also in auditory spatial attention tasks that put load on attentional control. In Study III, using visual spatial attention tasks, mimicking the auditory tasks applied in Study II, it was investigated whether or not the stimulus-non-specific rightward spatial bias found in auditory modality is a multimodal phenomenon. Finally, as it has been suggested that the absence of visual input in blind participants leads to improved auditory spatial perceptual and cognitive skills, the aim in Study IV was to determine, whether blindness modifies the ear advantage in DL. Altogether 180-190 right-handed participants between 5 and 79 years of age were studied in Studies I to III, and in Study IV the performance of 14 blind individuals was compared with that of 129 normally sighted individuals. 
The results showed that a rightward spatial bias was observed only in tasks with an intensive attentional load, independent of the type of stimuli (linguistic vs. non-linguistic) or the modality (auditory vs. visual). This multimodal rightward spatial bias probably results from a complex interaction of asymmetrical perceptual, attentional, and/or motor mechanisms. Most importantly, the strength of the rightward spatial bias changed as a function of age and of augmented praxis due to sensory deficit. The efficiency of performance in spatial attention tasks and the ability to overcome the rightward spatial bias increased during childhood, peaked in young adulthood, and decreased as a function of ageing. Between the ages of 5 and 11 years, movement and impulse control probably develop first, followed by the gradual development of the abilities to inhibit distractions and disengage attention. The errors, especially in bilateral stimulus conditions, suggest that a mild phenomenon resembling extinction can be observed throughout the lifespan, but the ability to distribute attention to multiple targets simultaneously decreases particularly in the course of ageing. Blindness enhances the processing of auditory bilateral linguistic stimuli, the ability to overcome a stimulus-driven laterality effect related to speech sound perception, and the ability to direct attention to an appropriate spatial location. It was concluded that the ability to voluntarily suppress and inhibit the multimodal rightward spatial bias changes as a function of age and of praxis due to sensory deficit, and probably reflects the developmental level of executive functions.
Abstract:
For people with motion impairments, access to and independent control of a computer can be essential. Symptoms such as tremor and spasm, however, can make the typical keyboard and mouse arrangement for computer interaction difficult or even impossible to use. This paper describes three approaches to improving computer input effectiveness for people with motion impairments: (1) increasing the number of interaction channels, (2) enhancing commonly existing interaction channels, and (3) making more effective use of all the available information in an existing input channel. Experiments in multimodal input, haptic feedback, user modelling, and cursor control are discussed in the context of the three approaches. A haptically enhanced keyboard emulator with perceptive capability is proposed, combining the approaches in a way that improves computer access for motion-impaired users.
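Approach (3), extracting more signal from an existing input channel, can be illustrated with cursor smoothing for tremor: a simple low-pass filter separates intended movement from high-frequency jitter. This is a hedged, generic sketch, not the paper's actual cursor-control method; the filter type and smoothing factor are illustrative assumptions.

```python
# Hedged sketch: damping high-frequency tremor in a cursor-position stream with
# an exponential moving average. A smaller alpha gives stronger damping but
# more lag; the value here is an illustrative choice.

def smooth_cursor(positions, alpha=0.2):
    """Filter a sequence of (x, y) samples; returns the smoothed trajectory."""
    if not positions:
        return []
    sx, sy = positions[0]
    out = [(sx, sy)]
    for x, y in positions[1:]:
        sx = alpha * x + (1 - alpha) * sx   # blend new sample with history
        sy = alpha * y + (1 - alpha) * sy
        out.append((sx, sy))
    return out

jittery = [(0, 0), (10, 0), (0, 0), (10, 0)]   # exaggerated tremor on x
print(smooth_cursor(jittery))
```

The design trade-off is latency versus stability: the same information is present in the raw channel, but filtering makes the intended component of the movement usable for target acquisition.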
Abstract:
Synesthesia entails a special kind of sensory perception, in which stimulation in one sensory modality leads to an internally generated perceptual experience of another, non-stimulated sensory modality. This phenomenon can be viewed as an abnormal multisensory integration process, as the synesthetic percept is aberrantly fused with the stimulated modality. Indeed, recent synesthesia research has focused on multimodal processing even outside of the specific synesthesia-inducing context and has revealed changed multimodal integration, thus suggesting perceptual alterations at a global level. Here, we focused on audio-visual processing in synesthesia using a semantic classification task with visually or audio-visually presented animate and inanimate objects in audio-visually congruent and incongruent conditions. Fourteen subjects with auditory-visual and/or grapheme-color synesthesia and 14 control subjects participated in the experiment. During presentation of the stimuli, event-related potentials were recorded from 32 electrodes. The analysis of reaction times and error rates revealed no group differences, with the best performance for audio-visually congruent stimulation, indicating the well-known multimodal facilitation effect. We found an enhanced amplitude of the N1 component over occipital electrode sites for synesthetes compared to controls. The differences occurred irrespective of the experimental condition and therefore suggest a global influence on early sensory processing in synesthetes.
Abstract:
The present dissertation aims at analyzing the construction of American adolescent culture through teen-targeted television series and the shift in perception that occurs as a consequence of the translation process. In light of the recent changes in television production and consumption modes, largely caused by new technologies, this project explores the evolution of Italian audiences, focusing on fansubbing (freely distributed amateur subtitles made by fans for fan consumption) and social viewing (the re-aggregation of television consumption based on social networks and dedicated platforms, rather than on physical presence). These phenomena are symptoms of a sort of 'viewership 2.0' and of a new type of active viewing, which calls for a revision of traditional AVT strategies. Using a framework that combines television studies, new media studies, and fandom studies with an approach to AVT based on Descriptive Translation Studies (Toury 1995), this dissertation analyzes the non-Anglophone audience's growing need to participate in the global dialogue and in an appropriation process based on US scheduling and informed by the new paradigm of convergence culture, transmedia storytelling, and affective economics (Jenkins 2006 and 2007), as well as the constraints intrinsic to multimodal translation and the different types of linguistic and cultural adaptation performed through dubbing (which tends to be more domesticating; Venuti 1995) and fansubbing (typically more foreignizing).
The study analyzes a selection of episodes from six of the most popular teen television series between 1990 and 2013, which has been divided into three ages based on the different modes of television consumption: top-down, pre-Internet consumption (Beverly Hills, 90210, 1990 – 2000); emergence of audience participation (Buffy the Vampire Slayer, 1997 – 2003; Dawson's Creek, 1998 – 2003); and the age of convergence and Viewership 2.0 (Gossip Girl, 2007 – 2012; Glee, 2009 – present; The Big Bang Theory, 2007 – present).
Abstract:
Recent advances in the field of statistical learning have established that learners are able to track regularities of multimodal stimuli, yet it is unknown whether the statistical computations are performed on integrated representations or on separate, unimodal representations. In the present study, we investigated the ability of adults to integrate audio and visual input during statistical learning. We presented learners with a speech stream synchronized with a video of a speaker's face. In the critical condition, the visual (e.g., /gi/) and auditory (e.g., /mi/) signals were occasionally incongruent, which we predicted would produce the McGurk illusion, resulting in the perception of an audiovisual syllable (e.g., /ni/). In this way, we used the McGurk illusion to manipulate the underlying statistical structure of the speech streams, such that perception of these illusory syllables facilitated participants' ability to segment the speech stream. Our results therefore demonstrate that participants can integrate audio and visual input to perceive the McGurk illusion during statistical learning. We interpret our findings as support for modality-interactive accounts of statistical learning.
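The statistical computation at issue in such segmentation studies is typically the transitional probability between adjacent syllables, which dips at word boundaries. A toy sketch of that computation follows; the syllable stream and "words" are invented for illustration and are not the study's stimuli.

```python
# Illustrative sketch of the statistical-learning computation behind speech
# segmentation: transitional probabilities P(next | current) between syllables.
# The stream below repeats three made-up "words" (gi-mi-ru, pa-do-ti, be-ku-la);
# within-word transitions are perfectly predictable, boundary transitions are not.
from collections import Counter

stream = "gi mi ru pa do ti gi mi ru be ku la pa do ti be ku la".split()

pair_counts = Counter(zip(stream, stream[1:]))   # counts of adjacent pairs
first_counts = Counter(stream[:-1])              # counts of pair-initial syllables

def transitional_prob(a, b):
    """P(b | a): how often syllable a is immediately followed by b."""
    return pair_counts[(a, b)] / first_counts[a]

print(transitional_prob("gi", "mi"))  # within-word transition -> 1.0
print(transitional_prob("ru", "pa"))  # across a word boundary -> 0.5
```

Under a modality-interactive account, the input to this computation would be the integrated audiovisual percept (e.g., the illusory /ni/ from visual /gi/ plus auditory /mi/), not the raw auditory syllable, which is what the facilitation effect reported above suggests.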
Abstract:
My dissertation emphasizes a cognitive account of multimodality that explicitly integrates experiential knowledge work into the rhetorical pedagogy that informs so many composition and technical communication programs. In these disciplines, multimodality is widely conceived in terms of what Gunther Kress calls "social-semiotic" modes of communication shaped primarily by culture. In the cognitive and neurolinguistic theories of Vittorio Gallese and George Lakoff, however, multimodality is described as a key characteristic of our bodies' sensory-motor systems, which link perception to action and action to meaning, grounding all communicative acts in knowledge shaped through body-engaged experience. I argue that this "situated" account of cognition – which closely approximates Maurice Merleau-Ponty's phenomenology of perception, a major framework for my study – has pedagogical precedence in the mimetic pedagogy that informed ancient Sophistic rhetorical training, and I reveal that training's multimodal dimensions through a phenomenological exegesis of the concept of mimesis. Plato's denigration of the mimetic tradition and his elevation of conceptual contemplation through reason, out of which developed the classic Cartesian separation of mind from body, resulted in a general degradation of experiential knowledge in Western education. But with the recent introduction into college classrooms of digital technologies and multimedia communication tools, renewed emphasis is being placed on the "hands-on" nature of inventive and productive praxis, necessitating a revision of methods of instruction and assessment that have traditionally privileged the acquisition of conceptual over experiential knowledge.
The model of multimodality I construct from Merleau-Ponty's phenomenology, ancient Sophistic rhetorical pedagogy, and current neuroscientific accounts of situated cognition insists on recognizing the significant role that the knowledges we acquire experientially play in our reading and writing, speaking and listening, and discerning and designing practices.