993 resultados para Visual Recognition
Resumo:
The ability of integrating into a unified percept sensory inputs deriving from different sensory modalities, but related to the same external event, is called multisensory integration and might represent an efficient mechanism of sensory compensation when a sensory modality is damaged by a cortical lesion. This hypothesis has been discussed in the present dissertation. Experiment 1 explored the role of superior colliculus (SC) in multisensory integration, testing patients with collicular lesions, patients with subcortical lesions not involving the SC and healthy control subjects in a multisensory task. The results revealed that patients with collicular lesions, paralleling the evidence of animal studies, demonstrated a loss of multisensory enhancement, in contrast with control subjects, providing the first lesional evidence in humans of the essential role of SC in mediating audio-visual integration. Experiment 2 investigated the role of cortex in mediating multisensory integrative effects, inducing virtual lesions by inhibitory theta-burst stimulation on temporo-parietal cortex, occipital cortex and posterior parietal cortex, demonstrating that only temporo-parietal cortex was causally involved in modulating the integration of audio-visual stimuli at the same spatial location. Given the involvement of the retino-colliculo-extrastriate pathway in mediating audio-visual integration, the functional sparing of this circuit in hemianopic patients is extremely relevant in the perspective of a multisensory-based approach to the recovery of unisensory defects. Experiment 3 demonstrated the spared functional activity of this circuit in a group of hemianopic patients, revealing the presence of implicit recognition of the fearful content of unseen visual stimuli (i.e. affective blindsight), an ability mediated by the retino-colliculo-extrastriate pathway and its connections with amygdala. Finally, Experiment 4 provided evidence that a systematic audio-visual stimulation is effective in inducing long-lasting clinical improvements in patients with visual field defect and revealed that the activity of the spared retino-colliculo-extrastriate pathway is responsible of the observed clinical amelioration, as suggested by the greater improvement observed in patients with cortical lesions limited to the occipital cortex, compared to patients with lesions extending to other cortical areas, found in tasks high demanding in terms of spatial orienting. Overall, the present results indicated that multisensory integration is mediated by the retino-colliculo-extrastriate pathway and that a systematic audio-visual stimulation, activating this spared neural circuit, is able to affect orientation towards the blind field in hemianopic patients and, therefore, might constitute an effective and innovative approach for the rehabilitation of unisensory visual impairments.
Resumo:
Visual tracking is the problem of estimating some variables related to a target given a video sequence depicting the target. Visual tracking is key to the automation of many tasks, such as visual surveillance, robot or vehicle autonomous navigation, automatic video indexing in multimedia databases. Despite many years of research, long term tracking in real world scenarios for generic targets is still unaccomplished. The main contribution of this thesis is the definition of effective algorithms that can foster a general solution to visual tracking by letting the tracker adapt to mutating working conditions. In particular, we propose to adapt two crucial components of visual trackers: the transition model and the appearance model. The less general but widespread case of tracking from a static camera is also considered and a novel change detection algorithm robust to sudden illumination changes is proposed. Based on this, a principled adaptive framework to model the interaction between Bayesian change detection and recursive Bayesian trackers is introduced. Finally, the problem of automatic tracker initialization is considered. In particular, a novel solution for categorization of 3D data is presented. The novel category recognition algorithm is based on a novel 3D descriptors that is shown to achieve state of the art performances in several applications of surface matching.
Resumo:
Lesions to the primary geniculo-striate visual pathway cause blindness in the contralesional visual field. Nevertheless, previous studies have suggested that patients with visual field defects may still be able to implicitly process the affective valence of unseen emotional stimuli (affective blindsight) through alternative visual pathways bypassing the striate cortex. These alternative pathways may also allow exploitation of multisensory (audio-visual) integration mechanisms, such that auditory stimulation can enhance visual detection of stimuli which would otherwise be undetected when presented alone (crossmodal blindsight). The present dissertation investigated implicit emotional processing and multisensory integration when conscious visual processing is prevented by real or virtual lesions to the geniculo-striate pathway, in order to further clarify both the nature of these residual processes and the functional aspects of the underlying neural pathways. The present experimental evidence demonstrates that alternative subcortical visual pathways allow implicit processing of the emotional content of facial expressions in the absence of cortical processing. However, this residual ability is limited to fearful expressions. This finding suggests the existence of a subcortical system specialised in detecting danger signals based on coarse visual cues, therefore allowing the early recruitment of flight-or-fight behavioural responses even before conscious and detailed recognition of potential threats can take place. Moreover, the present dissertation extends the knowledge about crossmodal blindsight phenomena by showing that, unlike with visual detection, sound cannot crossmodally enhance visual orientation discrimination in the absence of functional striate cortex. This finding demonstrates, on the one hand, that the striate cortex plays a causative role in crossmodally enhancing visual orientation sensitivity and, on the other hand, that subcortical visual pathways bypassing the striate cortex, despite affording audio-visual integration processes leading to the improvement of simple visual abilities such as detection, cannot mediate multisensory enhancement of more complex visual functions, such as orientation discrimination.
Resumo:
Generic object recognition is an important function of the human visual system and everybody finds it highly useful in their everyday life. For an artificial vision system it is a really hard, complex and challenging task because instances of the same object category can generate very different images, depending of different variables such as illumination conditions, the pose of an object, the viewpoint of the camera, partial occlusions, and unrelated background clutter. The purpose of this thesis is to develop a system that is able to classify objects in 2D images based on the context, and identify to which category the object belongs to. Given an image, the system can classify it and decide the correct categorie of the object. Furthermore the objective of this thesis is also to test the performance and the precision of different supervised Machine Learning algorithms in this specific task of object image categorization. Through different experiments the implemented application reveals good categorization performances despite the difficulty of the problem. However this project is open to future improvement; it is possible to implement new algorithms that has not been invented yet or using other techniques to extract features to make the system more reliable. This application can be installed inside an embedded system and after trained (performed outside the system), so it can become able to classify objects in a real-time. The information given from a 3D stereocamera, developed inside the department of Computer Engineering of the University of Bologna, can be used to improve the accuracy of the classification task. The idea is to segment a single object in a scene using the depth given from a stereocamera and in this way make the classification more accurate.
Resumo:
Research into visual hallucinations has accelerated over the last decade from around 350 publications per year in 2000 to over 500 in 2010. Increased recognition of the frequent occurrence of visual hallucinations in a number of common disorders, coupled with improvements in the measurement of phenomenology, and more sophisticated imaging techniques have allowed the development and initial testing of sophisticated models. However, key questions remain unanswered. Amongst these are: whether there is a satisfactory definition of hallucinations in a constructive visual system; whether there are one, two or several core varieties of hallucinations; what are the underlying brain mechanisms for hallucinations; and what, if anything, can be done to treat them when they lead to distress? Looking across research in several clinical areas suggests a tentative integrative model that allows the possibility of answering these questions, but much work remains to be done.
Resumo:
1 Natural soil profiles may be interpreted as an arrangement of parts which are characterized by properties like hydraulic conductivity and water retention function. These parts form a complicated structure. Characterizing the soil structure is fundamental in subsurface hydrology because it has a crucial influence on flow and transport and defines the patterns of many ecological processes. We applied an image analysis method for recognition and classification of visual soil attributes in order to model flow and transport through a man-made soil profile. Modeled and measured saturation-dependent effective parameters were compared. We found that characterizing and describing conductivity patterns in soils with sharp conductivity contrasts is feasible. Differently, solving flow and transport on the basis of these conductivity maps is difficult and, in general, requires special care for representation of small-scale processes.
Resumo:
Effective visual exploration is required for many activities of daily living and instruments to assess visual exploration are important for the evaluation of the visual and the oculomotor system. In this article, the development of a new instrument to measure central and peripheral target recognition is described. The measurement setup consists of a hemispherical projection which allows presenting images over a large area of ±90° horizontal and vertical angle. In a feasibility study with 14 younger (21–49 years) and 12 older (50–78 years) test persons, 132 targets and 24 distractors were presented within naturalistic color photographs of everyday scenes at 10°, 30°, and 50° eccentricity. After the experiment, both younger and older participants reported in a questionnaire that the task is easy to understand, fun and that it measures a competence that is relevant for activities of daily living. A main result of the pilot study was that younger participants recognized more targets with smaller reaction times than older participants. The group differences were most pronounced for peripheral target detection. This test is feasible and appropriate to assess the functional field of view in younger and older adults.
Resumo:
Computer vision-based food recognition could be used to estimate a meal's carbohydrate content for diabetic patients. This study proposes a methodology for automatic food recognition, based on the Bag of Features (BoF) model. An extensive technical investigation was conducted for the identification and optimization of the best performing components involved in the BoF architecture, as well as the estimation of the corresponding parameters. For the design and evaluation of the prototype system, a visual dataset with nearly 5,000 food images was created and organized into 11 classes. The optimized system computes dense local features, using the scale-invariant feature transform on the HSV color space, builds a visual dictionary of 10,000 visual words by using the hierarchical k-means clustering and finally classifies the food images with a linear support vector machine classifier. The system achieved classification accuracy of the order of 78%, thus proving the feasibility of the proposed approach in a very challenging image dataset.
Resumo:
BACKGROUND: Higher visual functions can be defined as cognitive processes responsible for object recognition, color and shape perception, and motion detection. People with impaired higher visual functions after unilateral brain lesion are often tested with paper pencil tests, but such tests do not assess the degree of interaction between the healthy brain hemisphere and the impaired one. Hence, visual functions are not tested separately in the contralesional and ipsilesional visual hemifields. METHODS: A new measurement setup, that involves real-time comparisons of shape and size of objects, orientation of lines, speed and direction of moving patterns, in the right or left visual hemifield, has been developed. The setup was implemented in an immersive environment like a hemisphere to take into account the effects of peripheral and central vision, and eventual visual field losses. Due to the non-flat screen of the hemisphere, a distortion algorithm was needed to adapt the projected images to the surface. Several approaches were studied and, based on a comparison between projected images and original ones, the best one was used for the implementation of the test. Fifty-seven healthy volunteers were then tested in a pilot study. A Satisfaction Questionnaire was used to assess the usability of the new measurement setup. RESULTS: The results of the distortion algorithm showed a structural similarity between the warped images and the original ones higher than 97%. The results of the pilot study showed an accuracy in comparing images in the two visual hemifields of 0.18 visual degrees and 0.19 visual degrees for size and shape discrimination, respectively, 2.56° for line orientation, 0.33 visual degrees/s for speed perception and 7.41° for recognition of motion direction. The outcome of the Satisfaction Questionnaire showed a high acceptance of the battery by the participants. CONCLUSIONS: A new method to measure higher visual functions in an immersive environment was presented. The study focused on the usability of the developed battery rather than the performance at the visual tasks. A battery of five subtasks to study the perception of size, shape, orientation, speed and motion direction was developed. The test setup is now ready to be tested in neurological patients.
Resumo:
Three rhesus monkeys (Macaca mulatta) and four pigeons (Columba livia) were trained in a visual serial probe recognition (SPR) task. A list of visual stimuli (slides) was presented sequentially to the subjects. Following the list and after a delay interval, a probe stimulus was presented that could be either from the list (Same) or not from the list (Different). The monkeys readily acquired a variable list length SPR task, while pigeons showed acquisition only under constant list length condition. However, monkeys memorized the responses to the probes (absolute strategy) when overtrained with the same lists and probes, while pigeons compared the probe to the list in memory (relational strategy). Performance of the pigeon on 4-items constant list length was disrupted when blocks of trials of different list lengths were imbedded between the 4-items blocks. Serial position curves for recognition at variable probe delays showed better relative performance on the last items of the list at short delays (0-0.5 seconds) and better relative performance on the initial items of the list at long delays (6-10 seconds for the pigeons and 20-30 seconds for the monkeys and a human adolescent). The serial position curves also showed reliable primacy and recency effects at intermediate probe delays. The monkeys showed evidence of using a relational strategy in the variable probe delay task. The results are the first demonstration of relational serial probe recognition performance in an avian and suggest similar underlying dynamic recognition memory mechanisms in primates and avians. ^
Resumo:
Adult monkeys (Macaca mulatta) with lesions of the hippocampal formation, perirhinal cortex, areas TH/TF, as well as controls were tested on tasks of object, spatial and contextual recognition memory. ^ Using a visual paired-comparison (VPC) task, all experimental groups showed a lack of object recognition relative to controls, although this impairment emerged at 10 sec with perirhinal lesions, 30 sec with areas TH/TF lesions and 60 sec with hippocampal lesions. In contrast, only perirhinal lesions impaired performance on delayed nonmatching-to-sample (DNMS), another task of object recognition memory. All groups were tested on DNMS with distraction (dDNMS) to examine whether the use of active cognitive strategies during the delay period could enable good performance on DNMS in spite of impaired recognition memory (revealed by the VPC task). Distractors affected performance of animals with perirhinal lesions at the 10-sec delay (the only delay in which their DNMS performance was above chance). They did not affect performance of animals with areas TH/TF lesions. Hippocampectomized animals were impaired at the 600-sec delay (the only delay at which prevention of active strategies would likely affect their behavior). ^ While lesions of areas TH/TF impaired spatial location memory and object-in-place memory, hippocampal lesions impaired only object-in-place memory. The pattern of results for perirhinal cortex lesions on the different task conditions indicated that this cortical area is not critical for spatial memory. ^ Finally, all three lesions impaired contextual recognition memory processes. The pattern of impairment appeared to result from the formation of only a global representation of the object and background, and suggests that all three areas are recruited for associating information across sources. ^ These results support the view that (1) the perirhinal cortex maintains storage of information about object and the context in which it is learned for a brief period of time, (2) areas TH/TF maintain information about spatial location and form associations between objects and their spatial relationship (a process that likely requires additional time) and (3) the hippocampal formation mediates associations between objects, their spatial relationship and the general context in which these associations are formed (an integrative function that requires additional time). ^
Resumo:
In this paper, the fusion of probabilistic knowledge-based classification rules and learning automata theory is proposed and as a result we present a set of probabilistic classification rules with self-learning capability. The probabilities of the classification rules change dynamically guided by a supervised reinforcement process aimed at obtaining an optimum classification accuracy. This novel classifier is applied to the automatic recognition of digital images corresponding to visual landmarks for the autonomous navigation of an unmanned aerial vehicle (UAV) developed by the authors. The classification accuracy of the proposed classifier and its comparison with well-established pattern recognition methods is finally reported.
Resumo:
The aim of this Master Thesis is the analysis, design and development of a robust and reliable Human-Computer Interaction interface, based on visual hand-gesture recognition. The implementation of the required functions is oriented to the simulation of a classical hardware interaction device: the mouse, by recognizing a specific hand-gesture vocabulary in color video sequences. For this purpose, a prototype of a hand-gesture recognition system has been designed and implemented, which is composed of three stages: detection, tracking and recognition. This system is based on machine learning methods and pattern recognition techniques, which have been integrated together with other image processing approaches to get a high recognition accuracy and a low computational cost. Regarding pattern recongition techniques, several algorithms and strategies have been designed and implemented, which are applicable to color images and video sequences. The design of these algorithms has the purpose of extracting spatial and spatio-temporal features from static and dynamic hand gestures, in order to identify them in a robust and reliable way. Finally, a visual database containing the necessary vocabulary of gestures for interacting with the computer has been created.
Resumo:
Long-term visual memory performance was impaired by two types of challenges: a diazepam challenge on acquisition and a sensory challenge on recognition. Using positron-emission tomography regional cerebral blood flow imaging, we studied the effect of these challenges on regional brain activation during the delayed recognition of abstract visual shapes as compared with a baseline fixation task. Both challenges induced a significant decrease in differential activation in the left fusiform gyrus, suggesting that this region is involved in the automatic or volitional comparison of incoming and stored stimuli. In contrast, thalamic differential activation increased in response to memory challenges. This increase might reflect enhanced retrieval attempts as a compensatory mechanism for restoring recognition performance.
Resumo:
In optimal foraging theory, search time is a key variable defining the value of a prey type. But the sensory-perceptual processes that constrain the search for food have rarely been considered. Here we evaluate the flight behavior of bumblebees (Bombus terrestris) searching for artificial flowers of various sizes and colors. When flowers were large, search times correlated well with the color contrast of the targets with their green foliage-type background, as predicted by a model of color opponent coding using inputs from the bees' UV, blue, and green receptors. Targets that made poor color contrast with their backdrop, such as white, UV-reflecting ones, or red flowers, took longest to detect, even though brightness contrast with the background was pronounced. When searching for small targets, bees changed their strategy in several ways. They flew significantly slower and closer to the ground, so increasing the minimum detectable area subtended by an object on the ground. In addition, they used a different neuronal channel for flower detection. Instead of color contrast, they used only the green receptor signal for detection. We relate these findings to temporal and spatial limitations of different neuronal channels involved in stimulus detection and recognition. Thus, foraging speed may not be limited only by factors such as prey density, flight energetics, and scramble competition. Our results show that understanding the behavioral ecology of foraging can substantially gain from knowledge about mechanisms of visual information processing.