982 results for visual object categorization
Abstract:
This paper focuses on the problem of realizing a plane-to-plane virtual link between a camera attached to the end-effector of a robot and a planar object. In order to make the system independent of the object's surface appearance, a structured light emitter is linked to the camera so that 4 laser pointers are projected onto the object. In a previous paper we showed that such a system has good performance and nice characteristics like partial decoupling near the desired state and robustness against misalignment of the emitter and the camera (J. Pages et al., 2004). However, no analytical results concerning the global asymptotic stability of the system were obtained due to the high complexity of the visual features utilized. In this work we present a better set of visual features which improves the properties of the features in (J. Pages et al., 2004) and for which it is possible to prove global asymptotic stability.
Abstract:
In this paper we face the problem of positioning a camera attached to the end-effector of a robotic manipulator so that it becomes parallel to a planar object. This problem has long been treated in visual servoing. Our approach is based on attaching several laser pointers to the camera, in a configuration designed to produce a suitable set of visual features. The aim of using structured light is not only to ease the image processing and to allow low-textured objects to be treated, but also to produce a control scheme with nice properties such as decoupling, stability, good conditioning and a good camera trajectory.
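For context, image-based visual servoing schemes of this kind typically drive the camera with a proportional law of the form v = -lambda * pinv(L) * (s - s*), where s is the current feature vector, s* its desired value, and L an estimate of the interaction matrix. The Python sketch below illustrates only this generic law; the particular laser-spot features and interaction matrix proposed in these papers are not reproduced here, so the feature values and the matrix estimate in the example are purely illustrative placeholders.

```python
import numpy as np

def ibvs_velocity(s, s_star, L_hat, gain=0.5):
    """Generic image-based visual servoing law: v = -gain * pinv(L_hat) @ (s - s_star).

    s       : current visual feature vector
    s_star  : desired visual feature vector
    L_hat   : estimated interaction (image Jacobian) matrix, shape (len(s), 6)
    Returns : 6-DOF camera velocity screw (vx, vy, vz, wx, wy, wz)
    """
    error = s - s_star
    # The Moore-Penrose pseudo-inverse handles non-square interaction matrices.
    return -gain * np.linalg.pinv(L_hat) @ error

# Illustrative call: four scalar features (e.g. one per projected laser spot)
# and a placeholder interaction-matrix estimate taken at the desired pose.
s      = np.array([0.12, -0.03, 0.08, -0.10])
s_star = np.zeros(4)
L_hat  = np.random.default_rng(0).normal(size=(4, 6))  # illustrative only
v_cam  = ibvs_velocity(s, s_star, L_hat)  # velocity command for the eye-in-hand camera
```

In practice the interaction matrix would be derived analytically from the geometry of the projected laser spots and the plane, rather than chosen arbitrarily as in this toy example.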
Abstract:
An object's motion relative to an observer can confer ethologically meaningful information. Approaching or looming stimuli can signal threats/collisions to be avoided or prey to be confronted, whereas receding stimuli can signal successful escape or failed pursuit. Using movement detection and subjective ratings, we investigated the multisensory integration of looming and receding auditory and visual information by humans. While prior research has demonstrated a perceptual bias for unisensory and more recently multisensory looming stimuli, none has investigated whether there is integration of looming signals between modalities. Our findings reveal selective integration of multisensory looming stimuli. Performance was significantly enhanced for looming stimuli over all other multisensory conditions. Contrasts with static multisensory conditions indicate that only multisensory looming stimuli resulted in facilitation beyond that induced by the sheer presence of auditory-visual stimuli. Controlling for variation in physical energy replicated the advantage for multisensory looming stimuli. Finally, only looming stimuli exhibited a negative linear relationship between enhancement indices for detection speed and for subjective ratings. Maximal detection speed was attained when motion perception was already robust under unisensory conditions. The preferential integration of multisensory looming stimuli highlights that complex ethologically salient stimuli likely require synergistic cooperation between existing principles of multisensory integration. A new conceptualization of the neurophysiologic mechanisms mediating real-world multisensory perceptions and action is therefore supported.
Abstract:
Evidence of multisensory interactions within low-level cortices and at early post-stimulus latencies has prompted a paradigm shift in conceptualizations of sensory organization. However, the mechanisms of these interactions and their link to behavior remain largely unknown. One behaviorally salient stimulus is a rapidly approaching (looming) object, which can indicate potential threats. Based on findings from humans and nonhuman primates suggesting there to be selective multisensory (auditory-visual) integration of looming signals, we tested whether looming sounds would selectively modulate the excitability of visual cortex. We combined transcranial magnetic stimulation (TMS) over the occipital pole and psychophysics for "neurometric" and psychometric assays of changes in low-level visual cortex excitability (i.e., phosphene induction) and perception, respectively. Across three experiments we show that structured looming sounds considerably enhance visual cortex excitability relative to other sound categories and white-noise controls. The time course of this effect showed that modulation of visual cortex excitability started to differ between looming and stationary sounds for sound portions of very short duration (80 ms) that were significantly below (by 35 ms) perceptual discrimination threshold. Visual perceptions are thus rapidly and efficiently boosted by sounds through early, preperceptual and stimulus-selective modulation of neuronal excitability within low-level visual cortex.
Abstract:
Repetition of environmental sounds, like their visual counterparts, can facilitate behavior and modulate neural responses, exemplifying plasticity in how auditory objects are represented or accessed. It remains controversial whether such repetition priming/suppression involves solely plasticity based on acoustic features and/or also access to semantic features. To evaluate contributions of physical and semantic features in eliciting repetition-induced plasticity, the present functional magnetic resonance imaging (fMRI) study repeated either identical or different exemplars of the initially presented object; reasoning that identical exemplars share both physical and semantic features, whereas different exemplars share only semantic features. Participants performed a living/man-made categorization task while being scanned at 3T. Repeated stimuli of both types significantly facilitated reaction times versus initial presentations, demonstrating perceptual and semantic repetition priming. There was also repetition suppression of fMRI activity within overlapping temporal, premotor, and prefrontal regions of the auditory "what" pathway. Importantly, the magnitude of suppression effects was equivalent for both physically identical and semantically related exemplars. That the degree of repetition suppression was irrespective of whether or not both perceptual and semantic information was repeated is suggestive of a degree of acoustically independent semantic analysis in how object representations are maintained and retrieved.
Abstract:
Using a head-mounted eye tracker, we assessed spatial recognition abilities (e.g., reaction to object permutation, removal or replacement with a new object) in participants with intellectual disabilities. The "Intellectual Disabilities (ID)" group (n=40) achieved a 93.7% success rate, whereas the "Normal Control" group (n=40) scored 55.6% and took longer to fix their attention on the displaced object. The participants with an intellectual disability thus had a more accurate perception of spatial changes than controls. Interestingly, the ID participants were more reactive to object displacement than to removal of the object. In the specific test of novelty detection, however, the scores were similar, with both groups approaching 100% detection. Analysis of the strategies used by the ID group revealed that they engaged in more systematic object checking and were more sensitive than the control group to changes in the structure of the environment. Indeed, during the familiarisation phase, the "ID" group explored the collection of objects more slowly, and fixed their gaze for a longer time upon a significantly lower number of fixation points during visual sweeping.
Abstract:
Action representations can interact with object recognition processes. For example, so-called mirror neurons respond both when performing an action and when seeing or hearing such actions. Investigations of auditory object processing have largely focused on categorical discrimination, which begins within the initial 100 ms post-stimulus onset and subsequently engages distinct cortical networks. Whether action representations themselves contribute to auditory object recognition, and precisely which kinds of actions recruit the auditory-visual mirror neuron system, remain poorly understood. We applied electrical neuroimaging analyses to auditory evoked potentials (AEPs) in response to sounds of man-made objects that were further subdivided between sounds conveying a socio-functional context and typically cuing a responsive action by the listener (e.g. a ringing telephone) and those that are not linked to such a context and do not typically elicit responsive actions (e.g. notes on a piano). This distinction was validated psychophysically by a separate cohort of listeners. Beginning at approximately 300 ms post-stimulus onset, responses to such context-related sounds significantly differed from those to context-free sounds in both the strength and the topography of the electric field. This latency is >200 ms subsequent to general categorical discrimination. Additionally, such topographic differences indicate that sounds of different action sub-types engage distinct configurations of intracranial generators. Statistical analysis of source estimations identified differential activity within premotor and inferior (pre)frontal regions (Brodmann's areas (BA) 6, BA8, and BA45/46/47) in response to sounds of actions typically cuing a responsive action. We discuss our results in terms of a spatio-temporal model of auditory object processing and the interplay between semantic and action representations.
Abstract:
Currently, individuals including designers, contractors, and owners learn about the project requirements by studying a combination of paper and electronic copies of the construction documents including the drawings, specifications (standard and supplemental), road and bridge standard drawings, design criteria, contracts, addenda, and change orders. This can be a tedious process since one needs to go back and forth between the various documents (paper or electronic) to obtain information about the entire project. Object-oriented computer-aided design (OO-CAD) is an innovative technology that can bring a change to this process by graphical portrayal of information. OO-CAD allows users to point and click on portions of an object-oriented drawing that are then linked to relevant databases of information (e.g., specifications, procurement status, and shop drawings). The vision of this study is to turn paper-based design standards and construction specifications into an object-oriented design and specification (OODAS) system or a visual electronic reference library (ERL). Individuals can use the system through a handheld wireless book-size laptop that includes all of the necessary software for operating in a 3D environment. All parties involved in transportation projects can access all of the standards and requirements simultaneously using a 3D graphical interface. By using this system, users will have all of the design elements and all of the specifications readily available without concerns of omissions. A prototype object-oriented model was created and demonstrated to potential users representing counties, cities, and the state. Findings suggest that a system like this could improve productivity to find information by as much as 75% and provide a greater sense of confidence that all relevant information had been identified. It was also apparent that this system would be used by more people in construction than in design. There was also concern related to the cost to develop and maintain the complete system. The future direction should focus on a project-based system that can help the contractors and DOT inspectors find information (e.g., road standards, specifications, instructional memorandums) more rapidly as it pertains to a specific project.
Abstract:
Multisensory processes facilitate perception of currently-presented stimuli and can likewise enhance later object recognition. Memories for objects originally encountered in a multisensory context can be more robust than those for objects encountered in an exclusively visual or auditory context [1], overturning the assumption that memory performance is best when encoding and recognition contexts remain constant [2]. Here, we used event-related potentials (ERPs) to provide the first evidence for direct links between multisensory brain activity at one point in time and subsequent object discrimination abilities. Across two experiments we found that individuals showing a benefit and those impaired during later object discrimination could be predicted by their brain responses to multisensory stimuli upon their initial encounter. These effects were observed despite the multisensory information being meaningless, task-irrelevant, and presented only once. We provide critical insights into the advantages associated with multisensory interactions; they are not limited to the processing of current stimuli, but likewise encompass the ability to determine the benefit of one's memories for object recognition in later, unisensory contexts.
Abstract:
In order to spare functional areas during the removal of brain tumours, electrical stimulation mapping was used in 90 patients (77 in the left hemisphere and 13 in the right; 2754 cortical sites tested). Language functions were studied with a special focus on comprehension of auditory and visual words and the semantic system. In addition to naming, patients were asked to perform pointing tasks from auditory and visual stimuli (using sets of 4 different images controlled for familiarity), and also auditory object (sound recognition) and Token test tasks. Ninety-two auditory comprehension interference sites were observed. We found that the process of auditory comprehension involved a few, fine-grained, sub-centimetre cortical territories. Early stages of speech comprehension seem to relate to two posterior regions in the left superior temporal gyrus. Downstream lexical-semantic speech processing and sound analysis involved 2 pathways, along the anterior part of the left superior temporal gyrus, and posteriorly around the supramarginal and middle temporal gyri. Electrostimulation experimentally dissociated the perceptual consciousness attached to speech comprehension. The initial word discrimination process can be considered an "automatic" stage, the attention feedback not being impaired by stimulation as would be the case at the lexical-semantic stage. Multimodal organization of the superior temporal gyrus was also detected, since some neurones could be involved in comprehension of visual material and in naming. These findings demonstrate a fine-grained, sub-centimetre cortical representation of speech comprehension processing, mainly in the left superior temporal gyrus, and are in line with those described in dual-stream models of language comprehension processing.
Abstract:
We propose a probabilistic object classifier for outdoor scene analysis as a first step in solving the problem of scene context generation. The method begins with a top-down control, which uses previously learned models (appearance and absolute location) to obtain an initial pixel-level classification. This information provides the cores of objects, which are used to acquire a more accurate object model. Growing these cores with class-specific active regions then allows an accurate recognition of known regions. Next, a general segmentation stage provides the segmentation of unknown regions by a bottom-up strategy. Finally, the last stage performs a region fusion of known and unknown segmented objects. The result is both a segmentation of the image and a recognition of each segment as a given object class or as an unknown segmented object. Experimental results are shown and evaluated to demonstrate the validity of our proposal.
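To make the first, top-down stage of such a pipeline concrete, the Python sketch below combines a learned per-class appearance likelihood with an absolute-location prior to produce per-pixel class probabilities, and keeps only the high-confidence "core" pixels that would seed the subsequent class-specific region growing. The toy appearance models, flat location priors, and the 0.55 threshold are illustrative assumptions, not the models or parameters used in the paper.

```python
import numpy as np

def topdown_core_labels(pixel_feats, appearance_models, location_priors,
                        core_threshold=0.9):
    """Top-down stage: combine learned appearance likelihoods with
    absolute-location priors into per-pixel class probabilities, and keep
    only high-confidence 'core' pixels that seed later region growing."""
    classes = list(appearance_models)
    h, w = pixel_feats.shape[:2]
    scores = np.zeros((h, w, len(classes)))
    for k, cls in enumerate(classes):
        # appearance likelihood times absolute-location prior, per pixel
        scores[..., k] = appearance_models[cls](pixel_feats) * location_priors[cls]
    probs = scores / scores.sum(axis=-1, keepdims=True)
    labels = probs.argmax(axis=-1)                  # most likely class per pixel
    cores = probs.max(axis=-1) >= core_threshold    # confident seed pixels only
    return labels, cores

# Illustrative use with toy Gaussian-like appearance models and flat location priors.
rng = np.random.default_rng(1)
feats = rng.uniform(size=(4, 4, 3))  # fake per-pixel colour features
models = {
    "sky":   lambda f: np.exp(-((f - 0.8) ** 2).sum(axis=-1)),
    "grass": lambda f: np.exp(-((f - 0.3) ** 2).sum(axis=-1)),
}
priors = {"sky": np.ones((4, 4)), "grass": np.ones((4, 4))}
labels, cores = topdown_core_labels(feats, models, priors, core_threshold=0.55)
```

The region-growing, bottom-up segmentation and fusion stages described in the abstract would operate on the `labels` and `cores` outputs of this step.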
Abstract:
One of the greatest conundrums of contemporary science is the relation between consciousness and brain activity, and one of the specific questions is how neural activity can generate vivid subjective experiences. Studies focusing on visual consciousness have become essential in solving the empirical questions of consciousness. The main aim of this thesis is to clarify the relation between visual consciousness and the neural and electrophysiological processes of the brain. By applying electroencephalography and functional magnetic resonance image-guided transcranial magnetic stimulation (TMS), we investigated the links between conscious perception and attention, the temporal evolution of visual consciousness during stimulus processing, the causal roles of primary visual cortex (V1), visual area 2 (V2) and lateral occipital cortex (LO) in the generation of visual consciousness, and also the methodological issues concerning the accuracy of targeting TMS to V1. The results showed that the first effects of visual consciousness on electrophysiological responses (about 140 ms after stimulus onset) appeared earlier than the effects of selective attention, and also in the unattended condition, suggesting that visual consciousness and selective attention are two independent phenomena with distinct underlying neural mechanisms. In addition, while it is well known that V1 is necessary for visual awareness, the results of the present thesis suggest that the abutting visual area V2 is also a prerequisite for conscious perception. In our studies, the activation in V2 was necessary for the conscious perception of a change in contrast for a shorter period of time than in the case of more detailed conscious perception. We also found that TMS in LO suppressed the conscious perception of object shape when TMS was delivered in two distinct time windows, the latter corresponding with the timing of the ERPs related to the conscious perception of coherent object shape. This result supports the view that LO is crucial in the conscious perception of object coherency and is likely to be directly involved in the generation of visual consciousness. Furthermore, we found that visual sensations, or phosphenes, elicited by TMS of V1 were brighter than identically induced phosphenes arising from V2. These findings demonstrate that V1 contributes more to the generation of the sensation of brightness than does V2. The results also suggest that top-down activation from V2 to V1 is probably associated with phosphene generation. The results of the methodological study imply that when a commonly used landmark (2 cm above the inion) is used in targeting TMS to V1, the TMS-induced electric field is likely to be highest in dorsal V2. When V1 was targeted according to the individual retinotopic data, the electric field was highest in V1 in only half of the participants. This result suggests that if the objective is to study the role of V1 with TMS methodology, functional maps of V1 and V2 should at least be combined with a computational model of the TMS-induced electric field in V1 and V2. Finally, the results of this thesis imply that different features of attention contribute differently to visual consciousness, and thus any theoretical model of the relationship between visual consciousness and attention should acknowledge these differences.
Future studies should also explore the possibility that visual consciousness consists of several processing stages, each of which has its own distinct underlying neural mechanisms.
Abstract:
The aim of this paper is to study the role of verbal, visual and brand elements in measuring the effectiveness of a marketing message. The thesis is written in the context of the mobile gaming industry. The object of the study is the marketing message. To achieve this aim, the main research question was formulated: how do the elements of a marketing message, namely verbal, visual and brand, affect the consumer's attitude toward the ad, emotional response and attention capture? The theory development chapter rests on three cornerstones: an analysis of previous literature on the marketing message and its elements, namely verbal, visual and brand; an overview of the literature on attitude formation, particularly attitude toward the ad; and an investigation of the key points of the emotional response and attention capture literature, which concludes the chapter. The empirical part consists of an experiment conducted with 27 participants. The experiment includes a self-report semantically anchored scale measuring the attitude toward the ad, as well as autonomic measures: eye tracking (attention capture) and facial expressions (emotional response). The results of the experiment showed that the size of the brand element, the logo, has an effect on attention capture and the overall attitude toward the ad. The bigger the logo, the more time people spend viewing it, and the more educational and factual the message appears to them. The measure related to the visual element, visual complexity, increases the intensity of participants' facial expressions, while the measure of the verbal element, the contrast between text and background colours, leads to a better attitude toward the ad. The higher the contrast between text and background, the more familiar the message appears to the viewer.
Abstract:
Many industrial applications need object recognition and tracking capabilities. The algorithms developed for these purposes are computationally expensive. Yet real-time performance, high accuracy and low power consumption are essential requirements for such systems. When all these requirements are combined, hardware acceleration of these algorithms becomes a feasible solution. The purpose of this study is to analyze the current state of these hardware acceleration solutions, which algorithms have been implemented in hardware, and what modifications have been made to adapt these algorithms to hardware.