951 resultados para Visual Object Recognition
Resumo:
Increasing evidence suggests that working memory and perceptual processes are dynamically interrelated due to modulating activity in overlapping brain networks. However, the direct influence of working memory on the spatio-temporal brain dynamics of behaviorally relevant intervening information remains unclear. To investigate this issue, subjects performed a visual proximity grid perception task under three different visual-spatial working memory (VSWM) load conditions. VSWM load was manipulated by asking subjects to memorize the spatial locations of 6 or 3 disks. The grid was always presented between the encoding and recognition of the disk pattern. As a baseline condition, grid stimuli were presented without a VSWM context. VSWM load altered both perceptual performance and neural networks active during intervening grid encoding. Participants performed faster and more accurately on a challenging perceptual task under high VSWM load as compared to the low load and the baseline condition. Visual evoked potential (VEP) analyses identified changes in the configuration of the underlying sources in one particular period occurring 160-190 ms post-stimulus onset. Source analyses further showed an occipito-parietal down-regulation concurrent to the increased involvement of temporal and frontal resources in the high VSWM context. Together, these data suggest that cognitive control mechanisms supporting working memory may selectively enhance concurrent visual processing related to an independent goal. More broadly, our findings are in line with theoretical models implicating the engagement of frontal regions in synchronizing and optimizing mnemonic and perceptual resources towards multiple goals.
Resumo:
We describe a model-based objects recognition system which is part of an image interpretation system intended to assist autonomous vehicles navigation. The system is intended to operate in man-made environments. Behavior-based navigation of autonomous vehicles involves the recognition of navigable areas and the potential obstacles. The recognition system integrates color, shape and texture information together with the location of the vanishing point. The recognition process starts from some prior scene knowledge, that is, a generic model of the expected scene and the potential objects. The recognition system constitutes an approach where different low-level vision techniques extract a multitude of image descriptors which are then analyzed using a rule-based reasoning system to interpret the image content. This system has been implemented using CEES, the C++ embedded expert system shell developed in the Systems Engineering and Automatic Control Laboratory (University of Girona) as a specific rule-based problem solving tool. It has been especially conceived for supporting cooperative expert systems, and uses the object oriented programming paradigm
Resumo:
Hume's project concerning the conflict between liberty and necessity is ";reconciliatory";. But what is the nature of Hume's project? Does he solve a problem in metaphysics only? And when Hume says that the dispute between the doctrines of liberty and necessity is merely verbal, does he mean that there is no genuine metaphysical dispute between the doctrines? In the present essay I argue for: (1) there is room for liberty in Hume's philosophy, and not only because the position is pro forma compatibilist, even though this has importance for the recognition that Hume's main concern when discussing the matter is with practice; (2) the position does not involve a ";subjectivization"; of every form of necessity: it is not compatibilist because it creates a space for the claim that the operations of the will are non-problematically necessary through a weakning of the notion of necessity as it applies to external objects; (3) Hume holds that the ordinary phenomena of mental causation do not preempt the atribuition of moral responsibility, which combines perfectly with his identification of the object of moral evaluation: the whole of the character of a person, in relation to which there is, nonetheless, liberty. I intend to support my assertions by a close reading of what Hume states in section 8 of the first Enquiry.
Resumo:
During a possible loss of coolant accident in BWRs, a large amount of steam will be released from the reactor pressure vessel to the suppression pool. Steam will be condensed into the suppression pool causing dynamic and structural loads to the pool. The formation and break up of bubbles can be measured by visual observation using a suitable pattern recognition algorithm. The aim of this study was to improve the preliminary pattern recognition algorithm, developed by Vesa Tanskanen in his doctoral dissertation, by using MATLAB. Video material from the PPOOLEX test facility, recorded during thermal stratification and mixing experiments, was used as a reference in the development of the algorithm. The developed algorithm consists of two parts: the pattern recognition of the bubbles and the analysis of recognized bubble images. The bubble recognition works well, but some errors will appear due to the complex structure of the pool. The results of the image analysis were reasonable. The volume and the surface area of the bubbles were not evaluated. Chugging frequencies calculated by using FFT fitted well into the results of oscillation frequencies measured in the experiments. The pattern recognition algorithm works in the conditions it is designed for. If the measurement configuration will be changed, some modifications have to be done. Numerous improvements are proposed for the future 3D equipment.
Resumo:
The usage of digital content, such as video clips and images, has increased dramatically during the last decade. Local image features have been applied increasingly in various image and video retrieval applications. This thesis evaluates local features and applies them to image and video processing tasks. The results of the study show that 1) the performance of different local feature detector and descriptor methods vary significantly in object class matching, 2) local features can be applied in image alignment with superior results against the state-of-the-art, 3) the local feature based shot boundary detection method produces promising results, and 4) the local feature based hierarchical video summarization method shows promising new new research direction. In conclusion, this thesis presents the local features as a powerful tool in many applications and the imminent future work should concentrate on improving the quality of the local features.
Resumo:
This thesis researches automatic traffic sign inventory and condition analysis using machine vision and pattern recognition methods. Automatic traffic sign inventory and condition analysis can be used to more efficient road maintenance, improving the maintenance processes, and to enable intelligent driving systems. Automatic traffic sign detection and classification has been researched before from the viewpoint of self-driving vehicles, driver assistance systems, and the use of signs in mapping services. Machine vision based inventory of traffic signs consists of detection, classification, localization, and condition analysis of traffic signs. The produced machine vision system performance is estimated with three datasets, from which two of have been been collected for this thesis. Based on the experiments almost all traffic signs can be detected, classified, and located and their condition analysed. In future, the inventory system performance has to be verified in challenging conditions and the system has to be pilot tested.
Resumo:
One of the greatest conundrums to the contemporary science is the relation between consciousness and brain activity, and one of the specifi c questions is how neural activity can generate vivid subjective experiences. Studies focusing on visual consciousness have become essential in solving the empirical questions of consciousness. Th e main aim of this thesis is to clarify the relation between visual consciousness and the neural and electrophysiological processes of the brain. By applying electroencephalography and functional magnetic resonance image-guided transcranial magnetic stimulation (TMS), we investigated the links between conscious perception and attention, the temporal evolution of visual consciousness during stimulus processing, the causal roles of primary visual cortex (V1), visual area 2 (V2) and lateral occipital cortex (LO) in the generation of visual consciousness and also the methodological issues concerning the accuracy of targeting TMS to V1. Th e results showed that the fi rst eff ects of visual consciousness on electrophysiological responses (about 140 ms aft er the stimulus-onset) appeared earlier than the eff ects of selective attention, and also in the unattended condition, suggesting that visual consciousness and selective attention are two independent phenomena which have distinct underlying neural mechanisms. In addition, while it is well known that V1 is necessary for visual awareness, the results of the present thesis suggest that also the abutting visual area V2 is a prerequisite for conscious perception. In our studies, the activation in V2 was necessary for the conscious perception of change in contrast for a shorter period of time than in the case of more detailed conscious perception. We also found that TMS in LO suppressed the conscious perception of object shape when TMS was delivered in two distinct time windows, the latter corresponding with the timing of the ERPs related to the conscious perception of coherent object shape. Th e result supports the view that LO is crucial in conscious perception of object coherency and is likely to be directly involved in the generation of visual consciousness. Furthermore, we found that visual sensations, or phosphenes, elicited by the TMS of V1 were brighter than identically induced phosphenes arising from V2. Th ese fi ndings demonstrate that V1 contributes more to the generation of the sensation of brightness than does V2. Th e results also suggest that top-down activation from V2 to V1 is probably associated with phosphene generation. The results of the methodological study imply that when a commonly used landmark (2 cm above the inion) is used in targeting TMS to V1, the TMS-induced electric fi eld is likely to be highest in dorsal V2. When V1 was targeted according to the individual retinotopic data, the electric fi eld was highest in V1 only in half of the participants. Th is result suggests that if the objective is to study the role of V1 with TMS methodology, at least functional maps of V1 and V2 should be applied with computational model of the TMS-induced electric fi eld in V1 and V2. Finally, the results of this thesis imply that diff erent features of attention contribute diff erently to visual consciousness, and thus, the theoretical model which is built up of the relationship between visual consciousness and attention should acknowledge these diff erences. Future studies should also explore the possibility that visual consciousness consists of several processing stages, each of which have their distinct underlying neural mechanisms.
Resumo:
Human activity recognition in everyday environments is a critical, but challenging task in Ambient Intelligence applications to achieve proper Ambient Assisted Living, and key challenges still remain to be dealt with to realize robust methods. One of the major limitations of the Ambient Intelligence systems today is the lack of semantic models of those activities on the environment, so that the system can recognize the speci c activity being performed by the user(s) and act accordingly. In this context, this thesis addresses the general problem of knowledge representation in Smart Spaces. The main objective is to develop knowledge-based models, equipped with semantics to learn, infer and monitor human behaviours in Smart Spaces. Moreover, it is easy to recognize that some aspects of this problem have a high degree of uncertainty, and therefore, the developed models must be equipped with mechanisms to manage this type of information. A fuzzy ontology and a semantic hybrid system are presented to allow modelling and recognition of a set of complex real-life scenarios where vagueness and uncertainty are inherent to the human nature of the users that perform it. The handling of uncertain, incomplete and vague data (i.e., missing sensor readings and activity execution variations, since human behaviour is non-deterministic) is approached for the rst time through a fuzzy ontology validated on real-time settings within a hybrid data-driven and knowledgebased architecture. The semantics of activities, sub-activities and real-time object interaction are taken into consideration. The proposed framework consists of two main modules: the low-level sub-activity recognizer and the high-level activity recognizer. The rst module detects sub-activities (i.e., actions or basic activities) that take input data directly from a depth sensor (Kinect). The main contribution of this thesis tackles the second component of the hybrid system, which lays on top of the previous one, in a superior level of abstraction, and acquires the input data from the rst module's output, and executes ontological inference to provide users, activities and their in uence in the environment, with semantics. This component is thus knowledge-based, and a fuzzy ontology was designed to model the high-level activities. Since activity recognition requires context-awareness and the ability to discriminate among activities in di erent environments, the semantic framework allows for modelling common-sense knowledge in the form of a rule-based system that supports expressions close to natural language in the form of fuzzy linguistic labels. The framework advantages have been evaluated with a challenging and new public dataset, CAD-120, achieving an accuracy of 90.1% and 91.1% respectively for low and high-level activities. This entails an improvement over both, entirely data-driven approaches, and merely ontology-based approaches. As an added value, for the system to be su ciently simple and exible to be managed by non-expert users, and thus, facilitate the transfer of research to industry, a development framework composed by a programming toolbox, a hybrid crisp and fuzzy architecture, and graphical models to represent and con gure human behaviour in Smart Spaces, were developed in order to provide the framework with more usability in the nal application. As a result, human behaviour recognition can help assisting people with special needs such as in healthcare, independent elderly living, in remote rehabilitation monitoring, industrial process guideline control, and many other cases. This thesis shows use cases in these areas.
Resumo:
The problem of automatic recognition of the fish from the video sequences is discussed in this Master’s Thesis. This is a very urgent issue for many organizations engaged in fish farming in Finland and Russia because the process of automation control and counting of individual species is turning point in the industry. The difficulties and the specific features of the problem have been identified in order to find a solution and propose some recommendations for the components of the automated fish recognition system. Methods such as background subtraction, Kalman filtering and Viola-Jones method were implemented during this work for detection, tracking and estimation of fish parameters. Both the results of the experiments and the choice of the appropriate methods strongly depend on the quality and the type of a video which is used as an input data. Practical experiments have demonstrated that not all methods can produce good results for real data, whereas on synthetic data they operate satisfactorily.
Resumo:
Motivated by a recently proposed biologically inspired face recognition approach, we investigated the relation between human behavior and a computational model based on Fourier-Bessel (FB) spatial patterns. We measured human recognition performance of FB filtered face images using an 8-alternative forced-choice method. Test stimuli were generated by converting the images from the spatial to the FB domain, filtering the resulting coefficients with a band-pass filter, and finally taking the inverse FB transformation of the filtered coefficients. The performance of the computational models was tested using a simulation of the psychophysical experiment. In the FB model, face images were first filtered by simulated V1- type neurons and later analyzed globally for their content of FB components. In general, there was a higher human contrast sensitivity to radially than to angularly filtered images, but both functions peaked at the 11.3-16 frequency interval. The FB-based model presented similar behavior with regard to peak position and relative sensitivity, but had a wider frequency band width and a narrower response range. The response pattern of two alternative models, based on local FB analysis and on raw luminance, strongly diverged from the human behavior patterns. These results suggest that human performance can be constrained by the type of information conveyed by polar patterns, and consequently that humans might use FB-like spatial patterns in face processing.
Resumo:
The aim of this paper is to study the role of verbal, visual and brand elements while meas-uring effectiveness of marketing message. The thesis is written in the context of mobile gaming industry. The object of the study is marketing message. To achieve the aim, the main research question was formulated: How do the elements of marketing message, such as verbal, visual and brand, affect the consumer’s attitude toward the ad, emotional response and attention capture? The theory development chapter lays on three corner stones – analysis of previous litera-ture on marketing message and its elements, namely verbal, visual and brand; overview of literature on attitude formation and particularly attitude toward the ad. In addition, investiga-tion of key points of emotional response and attention capture literature finalizes the chap-ter. The empirical part consists of experiment, conducted with 27 participants. Experiment includes the self-report semantically anchored scale, measuring the attitude toward the ad, as well as autonomic measures – eye tracking (attention capture) and facial expressions (emotional response). The results of the experiment showed that the size of the brand element – the logo – has an effect on the attention capture and the overall attitude toward the ad. The bigger the logo, the more time people spend viewing it, and they realise the message is more educa-tional and factual. The measure related to the visual element – the visual complexity – in-creases the intensity of participant’s facial expression. While the measure of verbal ele-ment – the contrast between text and background colours – leads to a better attitude to-ward the ad. The higher the contrast between text and background, the more known the message appears to the viewer.
Resumo:
Forty students from regular, grade five classes were divided into two groups of twenty, a good reader group and a' poor reader group, on the basis. of their reading scores on Canadian Achievement Tests. .The subjects took. part in four experimental conditions iM which they .learned lists of pronounceable and unprono~nceable pseudowords, some with semantic referents, and responded to questions designed tci test visual perceptu~l learning and lexical ·and semantic association learning. It' was hypothesized "that the good reade~ group would be able to make use of graphemic and phonemic redundancy patterns in order to improv~·visuSl perceptual learning and lexical and semantic association lea~ningto a greater extent. than would .the poor reader gr6up. The data supported this hypothesis, and also indicated that, although the poor readers were less adept at using familiar sound and letter patterns, they were more dependent on· such pa~terns as an aid to visual recognition memory and semantic recall than were the good readers. It wa.s postulated that poor readers are in a double- ~ . bind situatio~ of having to choose between using weak graphemic-semantic associations or gr~pheme-phoneme associations which are also weak and which have hindered them in developing automaticity in. reading.
Resumo:
Genetic Programming (GP) is a widely used methodology for solving various computational problems. GP's problem solving ability is usually hindered by its long execution times. In this thesis, GP is applied toward real-time computer vision. In particular, object classification and tracking using a parallel GP system is discussed. First, a study of suitable GP languages for object classification is presented. Two main GP approaches for visual pattern classification, namely the block-classifiers and the pixel-classifiers, were studied. Results showed that the pixel-classifiers generally performed better. Using these results, a suitable language was selected for the real-time implementation. Synthetic video data was used in the experiments. The goal of the experiments was to evolve a unique classifier for each texture pattern that existed in the video. The experiments revealed that the system was capable of correctly tracking the textures in the video. The performance of the system was on-par with real-time requirements.
Resumo:
Les troubles du spectre autistique (TSA) sont actuellement caractérisés par une triade d'altérations, incluant un dysfonctionnement social, des déficits de communication et des comportements répétitifs. L'intégration simultanée de multiples sens est cruciale dans la vie quotidienne puisqu'elle permet la création d'un percept unifié. De façon similaire, l'allocation d'attention à de multiples stimuli simultanés est critique pour le traitement de l'information environnementale dynamique. Dans l'interaction quotidienne avec l'environnement, le traitement sensoriel et les fonctions attentionnelles sont des composantes de base dans le développement typique (DT). Bien qu'ils ne fassent pas partie des critères diagnostiques actuels, les difficultés dans les fonctions attentionnelles et le traitement sensoriel sont très courants parmi les personnes autistes. Pour cela, la présente thèse évalue ces fonctions dans deux études séparées. La première étude est fondée sur la prémisse que des altérations dans le traitement sensoriel de base pourraient être à l'origine des comportements sensoriels atypiques chez les TSA, tel que proposé par des théories actuelles des TSA. Nous avons conçu une tâche de discrimination de taille intermodale, afin d'investiguer l'intégrité et la trajectoire développementale de l'information visuo-tactile chez les enfants avec un TSA (N = 21, âgés de 6 à18 ans), en comparaison à des enfants à DT, appariés sur l’âge et le QI de performance. Dans une tâche à choix forcé à deux alternatives simultanées, les participants devaient émettre un jugement sur la taille de deux stimuli, basé sur des inputs unisensoriels (visuels ou tactiles) ou multisensoriels (visuo-tactiles). Des seuils différentiels ont évalué la plus petite différence à laquelle les participants ont été capables de faire la discrimination de taille. Les enfants avec un TSA ont montré une performance diminuée et pas d'effet de maturation aussi bien dans les conditions unisensorielles que multisensorielles, comparativement aux participants à DT. Notre première étude étend donc des résultats précédents d'altérations dans le traitement multisensoriel chez les TSA au domaine visuo-tactile. Dans notre deuxième étude, nous avions évalué les capacités de poursuite multiple d’objets dans l’espace (3D-Multiple Object Tracking (3D-MOT)) chez des adultes autistes (N = 15, âgés de 18 à 33 ans), comparés à des participants contrôles appariés sur l'âge et le QI, qui devaient suivre une ou trois cibles en mouvement parmi des distracteurs dans un environnement de réalité virtuelle. Les performances ont été mesurées par des seuils de vitesse, qui évaluent la plus grande vitesse à laquelle des observateurs sont capables de suivre des objets en mouvement. Les individus autistes ont montré des seuils de vitesse réduits dans l'ensemble, peu importe le nombre d'objets à suivre. Ces résultats étendent des résultats antérieurs d'altérations au niveau des mécanismes d'attention en autisme quant à l'allocation simultanée de l'attention envers des endroits multiples. Pris ensemble, les résultats de nos deux études révèlent donc des altérations chez les TSA quant au traitement simultané d'événements multiples, que ce soit dans une modalité ou à travers des modalités, ce qui peut avoir des implications importantes au niveau de la présentation clinique de cette condition.
Resumo:
Pedicle screw insertion technique has made revolution in the surgical treatment of spinal fractures and spinal disorders. Although X- ray fluoroscopy based navigation is popular, there is risk of prolonged exposure to X- ray radiation. Systems that have lower radiation risk are generally quite expensive. The position and orientation of the drill is clinically very important in pedicle screw fixation. In this paper, the position and orientation of the marker on the drill is determined using pattern recognition based methods, using geometric features, obtained from the input video sequence taken from CCD camera. A search is then performed on the video frames after preprocessing, to obtain the exact position and orientation of the drill. An animated graphics, showing the instantaneous position and orientation of the drill is then overlaid on the processed video for real time drill control and navigation