1000 results for visual categorization


Relevance:

80.00%

Publisher:

Abstract:

The number of digital images has increased exponentially in the last few years. People have trouble managing their image collections and finding a specific image. An automatic image categorization system could help them manage their collections and find specific images. In this thesis, an unsupervised visual object categorization system was implemented to categorize a set of unknown images. Because the system is unsupervised, it does not need labeled training images, which would have to be obtained manually. Therefore, the number of possible categories and images can be huge. The system implemented in the thesis extracts local features from the images. These local features are used to build a codebook. The local features and the codebook are then used to generate a feature vector for each image. Images are categorized based on the feature vectors. The system is able to categorize any given set of images based on their visual appearance. Images that have similar image regions are grouped together in the same category. Thus, for example, images that contain cars are assigned to the same cluster. The unsupervised visual object categorization system can be used in many situations, e.g., in an Internet search engine. The system can categorize images for a user, and the user can then easily find a specific type of image.
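The pipeline described in this abstract (local features, codebook, per-image feature vector, grouping) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not say how the codebook is built or how images are grouped, so k-means with a deterministic farthest-point initialization is assumed for both steps, the 2-D tuples stand in for real local descriptors, and all function names and the `toy_images` data are hypothetical.

```python
from math import dist


def farthest_point_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from all centroids chosen so far (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, k, iters=10):
    """Plain Lloyd's k-means (assumed here; the abstract names no algorithm)."""
    centroids = farthest_point_init(points, k)
    for _ in range(iters):
        members = [[] for _ in range(k)]
        for p in points:
            members[nearest(p, centroids)].append(p)
        # New centroid = mean of its members; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(axis) / len(group) for axis in zip(*group)) if group else centroids[i]
            for i, group in enumerate(members)
        ]
    return centroids


def histogram(descriptors, codebook):
    """Feature vector of one image: normalized counts of nearest codewords."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest(d, codebook)] += 1
    return tuple(c / len(descriptors) for c in counts)


def categorize(images, n_words, n_categories):
    """Map each image name to a category index, without using any labels."""
    all_descriptors = [d for descs in images.values() for d in descs]
    codebook = kmeans(all_descriptors, n_words)
    vectors = {name: histogram(descs, codebook) for name, descs in images.items()}
    centers = kmeans(list(vectors.values()), n_categories)
    return {name: nearest(vec, centers) for name, vec in vectors.items()}


# Hypothetical toy data: each image is a list of 2-D "descriptors".
toy_images = {
    "car_1": [(0.1, 0.0), (0.0, 0.2), (5.1, 4.9), (4.9, 5.0)],
    "car_2": [(0.2, 0.1), (5.0, 5.2), (5.2, 5.1), (0.0, 0.0)],
    "tree_1": [(10.0, 0.1), (9.8, 0.0), (10.1, 0.2), (9.9, 0.1)],
    "tree_2": [(10.2, 0.0), (10.0, 0.3), (9.9, 0.2), (5.0, 5.0)],
}
print(categorize(toy_images, n_words=3, n_categories=2))
```

On this toy data the two "car" images share one category index and the two "tree" images share the other. The index values themselves carry no meaning, which is the usual situation in unsupervised grouping.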

Relevance:

60.00%

Publisher:

Abstract:

The repeated presentation of simple objects as well as biologically salient objects can cause the adaptation of behavioral and neural responses during the visual categorization of these objects. Mechanisms of response adaptation during repeated food viewing are of particular interest for better understanding food intake beyond energetic needs. Here, we measured visual evoked potentials (VEPs) and conducted neural source estimations for initial and repeated presentations of high-energy and low-energy foods as well as non-food images. The results of our study show that the behavioral and neural responses to food and food-related objects are not uniformly affected by repetition. While the repetition of images displaying low-energy foods and non-food objects modulated VEPs as well as their underlying neural sources and increased behavioral categorization accuracy, the responses to high-energy food images remained largely invariant between initial and repeated encounters. Brain mechanisms engaged when viewing images of high-energy foods thus appear less susceptible to repetition effects than responses to low-energy and non-food images. This finding is likely related to the superior reward value of high-energy foods and might be one reason why high-energy foods in particular are indulged in despite potentially detrimental health consequences.

Relevance:

60.00%

Publisher:

Abstract:

Background: The information processing capacity of the human mind is limited, as is evidenced by the attentional blink (AB) - a deficit in identifying the second of two temporally close targets (T1 and T2) embedded in a rapid stream of distracters. Theories of the AB generally agree that it results from competition between stimuli for conscious representation. However, they disagree on the specific mechanisms, in particular on how attentional processing of T1 determines the AB to T2. Methodology/Principal Findings: The present study used the high spatial resolution of functional magnetic resonance imaging (fMRI) to examine the neural mechanisms underlying the AB. Our research approach was to design T1 and T2 stimuli that activate distinguishable brain areas involved in visual categorization and representation. ROI and functional connectivity analyses were then used to examine how attentional processing of T1, as indexed by activity in the T1 representation area, affected T2 processing. Our main finding was that attentional processing of T1 at the level of the visual cortex predicted T2 detection rates. Those individuals who activated the T1 encoding area more strongly in blink versus no-blink trials generally detected T2 on a lower percentage of trials. The coupling of activity between T1 and T2 representation areas did not vary as a function of conscious T2 perception. Conclusions/Significance: These data are consistent with the notion that the AB is related to attentional demands of T1 for selection, and indicate that these demands are reflected at the level of visual cortex. They also highlight the importance of individual differences in attentional settings in explaining AB task performance.

Relevance:

40.00%

Publisher:

Abstract:

Local features are used in many computer vision tasks, including visual object categorization, content-based image retrieval and object recognition, to mention a few. Local features are points, blobs or regions in images that are extracted using a local feature detector. To make use of the extracted local features, the localized interest points are described using a local feature descriptor. A descriptor histogram vector is a compact representation of an image and can be used for searching and matching images in databases. In this thesis, the performance of local feature detectors and descriptors is evaluated for the object class detection task. Features are extracted from image samples belonging to several object classes. Matching features are then searched for using random image pairs of the same class. The goal of this thesis is to find the best detector and descriptor methods for this task in terms of detector repeatability and descriptor matching rate.
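To make "descriptor matching rate" concrete, here is a minimal sketch of one common way to match descriptors between two images: nearest-neighbour search with a ratio test. The abstract does not state which matching criterion the thesis uses, so the ratio test, the `ratio=0.8` threshold, and the definition of the rate as the fraction of matched descriptors in the first image are all assumptions, as are the function names.

```python
from math import dist


def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_a[i] is matched to desc_b[j].

    A match is accepted only if the nearest neighbour in desc_b is clearly
    closer than the second nearest (ratio test, assumed criterion).
    """
    matches = []
    for i, d in enumerate(desc_a):
        ranked = sorted(range(len(desc_b)), key=lambda j: dist(d, desc_b[j]))
        if len(ranked) < 2:
            continue  # the ratio test needs two candidates
        best, second = ranked[0], ranked[1]
        if dist(d, desc_b[best]) < ratio * dist(d, desc_b[second]):
            matches.append((i, best))
    return matches


def matching_rate(desc_a, desc_b, ratio=0.8):
    """Fraction of descriptors in desc_a that found a match in desc_b."""
    if not desc_a:
        return 0.0
    return len(match_descriptors(desc_a, desc_b, ratio)) / len(desc_a)


# Hypothetical 2-D descriptors for two images of the same class.
image_a = [(0.0, 0.0), (10.0, 10.0), (5.0, 5.0)]
image_b = [(0.1, 0.0), (10.0, 10.1), (20.0, 20.0)]
print(match_descriptors(image_a, image_b))
print(matching_rate(image_a, image_b))
```

In the toy data the third descriptor of `image_a` is roughly equidistant from two candidates in `image_b`, so the ratio test rejects it as ambiguous. Real descriptors would have many more dimensions, but the comparison logic stays the same.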

Relevance:

40.00%

Publisher:

Abstract:

To recognize a previously seen object, the visual system must overcome the variability in the object's appearance caused by factors such as illumination and pose. Developments in computer vision suggest that it may be possible to counter the influence of these factors, by learning to interpolate between stored views of the target object, taken under representative combinations of viewing conditions. Daily life situations, however, typically require categorization, rather than recognition, of objects. Due to the open-ended character both of natural kinds and of artificial categories, categorization cannot rely on interpolation between stored examples. Nonetheless, knowledge of several representative members, or prototypes, of each of the categories of interest can still provide the necessary computational substrate for the categorization of new instances. The resulting representational scheme based on similarities to prototypes appears to be computationally viable, and is readily mapped onto the mechanisms of biological vision revealed by recent psychophysical and physiological studies.
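The idea of representing a new object by its similarities to stored prototypes can be illustrated with a small sketch. It assumes a Gaussian similarity on Euclidean distance and a "largest summed similarity wins" decision rule; neither is specified in the abstract, and the prototype coordinates and function names are hypothetical.

```python
from math import dist, exp


def similarity_vector(x, prototypes, scale=1.0):
    """Describe x by its similarity to each stored prototype.

    prototypes is a list of (category, feature_tuple) pairs; the Gaussian
    similarity used here is an assumed choice.
    """
    return [exp(-dist(x, p) ** 2 / (2 * scale ** 2)) for _, p in prototypes]


def categorize_by_prototypes(x, prototypes, scale=1.0):
    """Assign x to the category whose prototypes are most similar in total."""
    totals = {}
    for (label, _), s in zip(prototypes, similarity_vector(x, prototypes, scale)):
        totals[label] = totals.get(label, 0.0) + s
    return max(totals, key=totals.get)


# Hypothetical prototypes: two representative members per category.
prototypes = [
    ("cat", (0.0, 0.0)),
    ("cat", (1.0, 0.0)),
    ("dog", (5.0, 5.0)),
    ("dog", (6.0, 5.0)),
]
print(categorize_by_prototypes((0.5, 0.2), prototypes))
```

The new instance does not have to coincide with, or lie between, stored views of one particular object. It only needs to be more similar to one category's prototypes than to another's.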

Relevance:

40.00%

Publisher:

Abstract:

Unlike the first attempts to solve the image categorization problem (often based on global features), several researchers have recently been tackling this research branch from a new vantage point - using features around locally invariant interest points and visual dictionaries. Although several advances have been made in the visual dictionaries literature in the past few years, a problem we still need to cope with is determining the number of representative words in the dictionary. Therefore, in this paper we introduce a new solution for automatically finding the number of visual words in an N-way image categorization problem by means of supervised pattern classification based on optimum-path forest. © 2011 IEEE.

Relevance:

30.00%

Publisher:

Abstract:

Dissertation presented at the Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa to obtain the Master degree in Electrical and Computer Engineering.

Relevance:

30.00%

Publisher:

Abstract:

The stylistic categorization of the Estado Novo has been intensely discussed by Portuguese art historians. The square Alameda Dom Afonso Henriques in Lisbon (Alameda) can be seen as paradigmatic for the architecture of power of the Estado Novo. The Alameda forms a gardened valley between two hills. It contains two prominent and highly propagandist structures: the Instituto Superior Técnico (IST) and the Fonte Luminosa, dedicated, respectively, to modern science and to the harmonious contribution of nature to the city. The iconography of the Alameda as well as its incorporation into the propagandist use of urban planning in the 1930s and 1940s exemplify the visual politics of Salazarism. Urban planning programs were intended to create cities that would preserve the character of a traditional Catholic society and at the same time answer the need to modernize the country and evoke the image of a progressive state. Thus, public buildings and urban squares such as the Alameda contributed to designing a corporate image and to the ‘spirit’ of the regime.

Relevance:

30.00%

Publisher:

Abstract:

Please consult the paper edition of this thesis to read. It is available on the 5th Floor of the Library at Call Number: Z 9999.5 E38 L64 2008

Relevance:

30.00%

Publisher:

Abstract:

Image categorization by means of bag of visual words has received increasing attention from the image processing and vision communities in recent years. In these approaches, each image is represented by invariant points of interest which are mapped to a Hilbert space representing a visual dictionary, which aims to comprise the most discriminative features in a set of images. Nevertheless, the main problem of such approaches is finding a compact and representative dictionary. Finding such a representative dictionary automatically, with no user intervention, is an even more difficult task. In this paper, we propose a method to automatically find such a dictionary by employing a recently developed graph-based clustering algorithm called Optimum-Path Forest, which does not make any assumption about the visual dictionary's size and is more efficient and effective than the state-of-the-art techniques used for dictionary generation. © 2012 IEEE.

Relevance:

30.00%

Publisher:

Abstract:

Visual tracking is the problem of estimating some variables related to a target given a video sequence depicting the target. Visual tracking is key to the automation of many tasks, such as visual surveillance, autonomous robot or vehicle navigation, and automatic video indexing in multimedia databases. Despite many years of research, long-term tracking of generic targets in real-world scenarios has still not been achieved. The main contribution of this thesis is the definition of effective algorithms that can foster a general solution to visual tracking by letting the tracker adapt to changing working conditions. In particular, we propose to adapt two crucial components of visual trackers: the transition model and the appearance model. The less general but widespread case of tracking from a static camera is also considered, and a novel change detection algorithm robust to sudden illumination changes is proposed. Based on this, a principled adaptive framework to model the interaction between Bayesian change detection and recursive Bayesian trackers is introduced. Finally, the problem of automatic tracker initialization is considered. In particular, a novel solution for categorization of 3D data is presented. The category recognition algorithm is based on a novel 3D descriptor that is shown to achieve state-of-the-art performance in several applications of surface matching.
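For readers unfamiliar with recursive Bayesian trackers, the following sketch shows the simplest instance: a one-dimensional Kalman filter with a random-walk transition model. It is not the adaptive method proposed in the thesis. It only illustrates where a transition model (here the assumed `process_var`) and a measurement model (the assumed `measurement_var`) enter the predict/update recursion. All names and default values are hypothetical.

```python
def kalman_step(mean, var, measurement, process_var, measurement_var):
    """One predict/update cycle of a scalar Kalman filter."""
    # Predict: random-walk transition model, so the mean stays and uncertainty grows.
    pred_mean = mean
    pred_var = var + process_var
    # Update: blend prediction and measurement according to their uncertainties.
    gain = pred_var / (pred_var + measurement_var)
    new_mean = pred_mean + gain * (measurement - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


def track(measurements, mean=0.0, var=1.0, process_var=0.01, measurement_var=0.1):
    """Run the filter over a sequence and return the (mean, variance) history."""
    estimates = []
    for z in measurements:
        mean, var = kalman_step(mean, var, z, process_var, measurement_var)
        estimates.append((mean, var))
    return estimates


# Hypothetical usage: a target sitting at position 5.0, observed 20 times.
history = track([5.0] * 20)
print(history[-1])
```

A larger `process_var` makes the filter trust new measurements more and follow abrupt motion faster, while a smaller one gives smoother but slower estimates. Adapting such a transition parameter to changing conditions is the kind of tuning the abstract alludes to, although the thesis's own mechanism is not described here.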

Relevance:

30.00%

Publisher:

Abstract:

Generic object recognition is an important function of the human visual system, and everybody finds it highly useful in everyday life. For an artificial vision system it is a hard, complex and challenging task, because instances of the same object category can generate very different images depending on variables such as illumination conditions, the pose of the object, the viewpoint of the camera, partial occlusions, and unrelated background clutter. The purpose of this thesis is to develop a system that is able to classify objects in 2D images based on the context and identify the category to which each object belongs. Given an image, the system can classify it and decide the correct category of the object. A further objective of this thesis is to test the performance and precision of different supervised machine learning algorithms on this specific task of object image categorization. Across different experiments, the implemented application shows good categorization performance despite the difficulty of the problem. However, this project is open to future improvement: new algorithms or other feature extraction techniques could be incorporated to make the system more reliable. The application can be installed in an embedded system and, once trained (training is performed outside the system), can classify objects in real time. The information given by a 3D stereo camera, developed in the Department of Computer Engineering of the University of Bologna, can be used to improve the accuracy of the classification task. The idea is to segment a single object in a scene using the depth given by the stereo camera and in this way make the classification more accurate.

Relevance:

30.00%

Publisher:

Abstract:

Hemispheric differences in the learning and generalization of pattern categories were explored in two experiments involving sixteen patients with unilateral posterior cerebral lesions in the left (LH) or right (RH) hemisphere. In each experiment participants were first trained to criterion in a supervised learning paradigm to categorize a set of patterns that consisted of either simple geometric forms (Experiment 1) or unfamiliar grey-level images (Experiment 2). They were then tested for their ability to generalize acquired categorical knowledge to contrast-reversed versions of the learning patterns. The results showed that RH lesions impeded category learning of unfamiliar grey-level images more severely than LH lesions, whereas this relationship appeared reversed for categories defined by simple geometric forms. With regard to generalization to contrast reversal, categorization performance of LH and RH patients was unaffected in the case of simple geometric forms. However, generalization to contrast-reversed grey-level images distinctly deteriorated for patients with LH lesions relative to those with RH lesions, with the latter (but not the former) being consistently unable to identify the pattern manipulation. These findings suggest a differential use of contrast information in the representation of pattern categories in the two hemispheres. Such specialization appears in line with previous distinctions between a predominantly left-hemispheric, abstract-analytical and a right-hemispheric, specific-holistic representation of object categories, and their prediction of a mandatory representation of contrast polarity in the RH. Some implications for the well-established dissociation of visual disorders for the recognition of faces and letters are discussed.

Relevance:

30.00%

Publisher:

Abstract:

The current research examined the influence of ingroup/outgroup categorization on brain event-related potentials measured during perceptual processing of own- and other-race faces. White participants performed a sequential matching task with upright and inverted faces belonging either to their own race (White) or to another race (Black) and affiliated with either their own university or another university by a preceding visual prime. Results demonstrated that the right-lateralized N170 component evoked by test faces was modulated by race and by social category: the N170 to own-race faces showed a larger inversion effect (i.e., latency delay for inverted faces) when the faces were categorized as other-university rather than own-university members; the N170 to other-race faces showed no modulation of its inversion effect by university affiliation. These results suggest that neural correlates of structural face encoding (as evidenced by the N170 inversion effects) can be modulated by both visual (racial) and nonvisual (social) ingroup/outgroup status. © 2014 Taylor & Francis.