907 resultados para object categorization


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The number of digital images has been increasing exponentially in the last few years. People have problems managing their image collections and finding a specific image. An automatic image categorization system could help them to manage images and find specific images. In this thesis, an unsupervised visual object categorization system was implemented to categorize a set of unknown images. The system is unsupervised, and hence, it does not need known images to train the system which needs to be manually obtained. Therefore, the number of possible categories and images can be huge. The system implemented in the thesis extracts local features from the images. These local features are used to build a codebook. The local features and the codebook are then used to generate a feature vector for an image. Images are categorized based on the feature vectors. The system is able to categorize any given set of images based on the visual appearance of the images. Images that have similar image regions are grouped together in the same category. Thus, for example, images which contain cars are assigned to the same cluster. The unsupervised visual object categorization system can be used in many situations, e.g., in an Internet search engine. The system can categorize images for a user, and the user can then easily find a specific type of image.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Local features are used in many computer vision tasks including visual object categorization, content-based image retrieval and object recognition to mention a few. Local features are points, blobs or regions in images that are extracted using a local feature detector. To make use of extracted local features the localized interest points are described using a local feature descriptor. A descriptor histogram vector is a compact representation of an image and can be used for searching and matching images in databases. In this thesis the performance of local feature detectors and descriptors is evaluated for object class detection task. Features are extracted from image samples belonging to several object classes. Matching features are then searched using random image pairs of a same class. The goal of this thesis is to find out what are the best detector and descriptor methods for such task in terms of detector repeatability and descriptor matching rate.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis addresses the problem of categorizing natural objects. To provide a criteria for categorization we propose that the purpose of a categorization is to support the inference of unobserved properties of objects from the observed properties. Because no such set of categories can be constructed in an arbitrary world, we present the Principle of Natural Modes as a claim about the structure of the world. We first define an evaluation function that measures how well a set of categories supports the inference goals of the observer. Entropy measures for property uncertainty and category uncertainty are combined through a free parameter that reflects the goals of the observer. Natural categorizations are shown to be those that are stable with respect to this free parameter. The evaluation function is tested in the domain of leaves and is found to be sensitive to the structure of the natural categories corresponding to the different species. We next develop a categorization paradigm that utilizes the categorization evaluation function in recovering natural categories. A statistical hypothesis generation algorithm is presented that is shown to be an effective categorization procedure. Examples drawn from several natural domains are presented, including data known to be a difficult test case for numerical categorization techniques. We next extend the categorization paradigm such that multiple levels of natural categories are recovered; by means of recursively invoking the categorization procedure both the genera and species are recovered in a population of anaerobic bacteria. Finally, a method is presented for evaluating the utility of features in recovering natural categories. This method also provides a mechanism for determining which features are constrained by the different processes present in a multiple modal world.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Hemodynamic imaging results have associated both gender and body weight to variation in brain responses to food-related information. However, the spatio-temporal brain dynamics of gender-related and weight-wise modulations in food discrimination still remain to be elucidated. We analyzed visual evoked potentials (VEPs) while normal-weighted men (n = 12) and women (n = 12) categorized photographs of energy-dense foods and non-food kitchen utensils. VEP analyses showed that food categorization is influenced by gender as early as 170 ms after image onset. Moreover, the female VEP pattern to food categorization co-varied with participants' body weight. Estimations of the neural generator activity over the time interval of VEP modulations (i.e. by means of a distributed linear inverse solution [LAURA]) revealed alterations in prefrontal and temporo-parietal source activity as a function of image category and participants' gender. However, only neural source activity for female responses during food viewing was negatively correlated with body-mass index (BMI) over the respective time interval. Women showed decreased neural source activity particularly in ventral prefrontal brain regions when viewing food, but not non-food objects, while no such associations were apparent in male responses to food and non-food viewing. Our study thus indicates that gender influences are already apparent during initial stages of food-related object categorization, with small variations in body weight modulating electrophysiological responses especially in women and in brain areas implicated in food reward valuation and intake control. These findings extend recent reports on prefrontal reward and control circuit responsiveness to food cues and the potential role of this reactivity pattern in the susceptibility to weight gain.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Do our brains implicitly track the energetic content of the foods we see? Using electrical neuroimaging of visual evoked potentials (VEPs) we show that the human brain can rapidly discern food's energetic value, vis à vis its fat content, solely from its visual presentation. Responses to images of high-energy and low-energy food differed over two distinct time periods. The first period, starting at approximately 165 ms post-stimulus onset, followed from modulations in VEP topography and by extension in the configuration of the underlying brain network. Statistical comparison of source estimations identified differences distributed across a wide network including both posterior occipital regions and temporo-parietal cortices typically associated with object processing, and also inferior frontal cortices typically associated with decision-making. During a successive processing stage (starting at approximately 300 ms), responses differed both topographically and in terms of strength, with source estimations differing predominantly within prefrontal cortical regions implicated in reward assessment and decision-making. These effects occur orthogonally to the task that is actually being performed and suggest that reward properties such as a food's energetic content are treated rapidly and in parallel by a distributed network of brain regions involved in object categorization, reward assessment, and decision-making.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This thesis is about detection of local image features. The research topic belongs to the wider area of object detection, which is a machine vision and pattern recognition problem where an object must be detected (located) in an image. State-of-the-art object detection methods often divide the problem into separate interest point detection and local image description steps, but in this thesis a different technique is used, leading to higher quality image features which enable more precise localization. Instead of using interest point detection the landmark positions are marked manually. Therefore, the quality of the image features is not limited by the interest point detection phase and the learning of image features is simplified. The approach combines both interest point detection and local description into one phase for detection. Computational efficiency of the descriptor is therefore important, leaving out many of the commonly used descriptors as unsuitably heavy. Multiresolution Gabor features has been the main descriptor in this thesis and improving their efficiency is a significant part. Actual image features are formed from descriptors by using a classifierwhich can then recognize similar looking patches in new images. The main classifier is based on Gaussian mixture models. Classifiers are used in one-class classifier configuration where there are only positive training samples without explicit background class. The local image feature detection method has been tested with two freely available face detection databases and a proprietary license plate database. The localization performance was very good in these experiments. Other applications applying the same under-lying techniques are also presented, including object categorization and fault detection.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Given arbitrary pictures, we explore the possibility of using new techniques from computer vision and artificial intelligence to create customized visual games on-the-fly. This includes coloring books, link-the-dot and spot-the-difference popular games. The feasibility of these systems is discussed and we describe prototype implementation that work well in practice in an automatic or semi-automatic way.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Today, most conventional surveillance networks are based on analog system, which has a lot of constraints like manpower and high-bandwidth requirements. It becomes the barrier for today's surveillance network development. This dissertation describes a digital surveillance network architecture based on the H.264 coding/decoding (CODEC) System-on-a-Chip (SoC) platform. The proposed digital surveillance network architecture includes three major layers: software layer, hardware layer, and the network layer. The following outlines the contributions to the proposed digital surveillance network architecture. (1) We implement an object recognition system and an object categorization system on the software layer by applying several Digital Image Processing (DIP) algorithms. (2) For better compression ratio and higher video quality transfer, we implement two new modules on the hardware layer of the H.264 CODEC core, i.e., the background elimination module and the Directional Discrete Cosine Transform (DDCT) module. (3) Furthermore, we introduce a Digital Signal Processor (DSP) sub-system on the main bus of H.264 SoC platforms as the major hardware support system for our software architecture. Thus we combine the software and hardware platforms to be an intelligent surveillance node. Lab results show that the proposed surveillance node can dramatically save the network resources like bandwidth and storage capacity.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

According to much evidence, observing objects activates two types of information: structural properties, i.e., the visual information about the structural features of objects, and function knowledge, i.e., the conceptual information about their skilful use. Many studies so far have focused on the role played by these two kinds of information during object recognition and on their neural underpinnings. However, to the best of our knowledge no study so far has focused on the different activation of this information (structural vs. function) during object manipulation and conceptualization, depending on the age of participants and on the level of object familiarity (familiar vs. non-familiar). Therefore, the main aim of this dissertation was to investigate how actions and concepts related to familiar and non-familiar objects may vary across development. To pursue this aim, four studies were carried out. A first study led to the creation of the Familiar and Non-Familiar Stimuli Database, a set of everyday objects classified by Italian pre-schoolers, schoolers, and adults, useful to verify how object knowledge is modulated by age and frequency of use. A parallel study demonstrated that factors such as sociocultural dynamics may affect the perception of objects. Specifically, data for familiarity, naming, function, using and frequency of use of the objects used to create the Familiar And Non-Familiar Stimuli Database were collected with Dutch and Croatian children and adults. The last two studies on object interaction and language provide further evidence in support of the literature on affordances and on the link between affordances and the cognitive process of language from a developmental point of view, supporting the perspective of a situated cognition and emphasizing the crucial role of human experience.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dissertation presented at the Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa to obtain the Master degree in Electrical and Computer Engineering.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Un dels principals problemes de la interacció dels robots autònoms és el coneixement de l'escena. El reconeixement és fonamental per a solucionar aquest problema i permetre als robots interactuar en un escenari no controlat. En aquest document presentem una aplicació pràctica de la captura d'objectes, de la normalització i de la classificació de senyals triangulars i circulars. El sistema s'introdueix en el robot Aibo de Sony per a millorar-ne la interacció. La metodologia presentada s'ha comprobat en simulacions i problemes de categorització reals, com ara la classificació de senyals de trànsit, amb resultats molt prometedors.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a novel scheme ("Categorical Basis Functions", CBF) for object class representation in the brain and contrast it to the "Chorus of Prototypes" scheme recently proposed by Edelman. The power and flexibility of CBF is demonstrated in two examples. CBF is then applied to investigate the phenomenon of Categorical Perception, in particular the finding by Bulthoff et al. (1998) of categorization of faces by gender without corresponding Categorical Perception. Here, CBF makes predictions that can be tested in a psychophysical experiment. Finally, experiments are suggested to further test CBF.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Understanding how biological visual systems perform object recognition is one of the ultimate goals in computational neuroscience. Among the biological models of recognition the main distinctions are between feedforward and feedback and between object-centered and view-centered. From a computational viewpoint the different recognition tasks - for instance categorization and identification - are very similar, representing different trade-offs between specificity and invariance. Thus the different tasks do not strictly require different classes of models. The focus of the review is on feedforward, view-based models that are supported by psychophysical and physiological data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

To recognize a previously seen object, the visual system must overcome the variability in the object's appearance caused by factors such as illumination and pose. Developments in computer vision suggest that it may be possible to counter the influence of these factors, by learning to interpolate between stored views of the target object, taken under representative combinations of viewing conditions. Daily life situations, however, typically require categorization, rather than recognition, of objects. Due to the open-ended character both of natural kinds and of artificial categories, categorization cannot rely on interpolation between stored examples. Nonetheless, knowledge of several representative members, or prototypes, of each of the categories of interest can still provide the necessary computational substrate for the categorization of new instances. The resulting representational scheme based on similarities to prototypes appears to be computationally viable, and is readily mapped onto the mechanisms of biological vision revealed by recent psychophysical and physiological studies.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In a recent experiment, Freedman et al. recorded from inferotemporal (IT) and prefrontal cortices (PFC) of monkeys performing a "cat/dog" categorization task (Freedman 2001 and Freedman, Riesenhuber, Poggio, Miller 2001). In this paper we analyze the tuning properties of view-tuned units in our HMAX model of object recognition in cortex (Riesenhuber 1999) using the same paradigm and stimuli as in the experiment. We then compare the simulation results to the monkey inferotemporal neuron population data. We find that view-tuned model IT units that were trained without any explicit category information can show category-related tuning as observed in the experiment. This suggests that the tuning properties of experimental IT neurons might primarily be shaped by bottom-up stimulus-space statistics, with little influence of top-down task-specific information. The population of experimental PFC neurons, on the other hand, shows tuning properties that cannot be explained just by stimulus tuning. These analyses are compatible with a model of object recognition in cortex (Riesenhuber 2000) in which a population of shape-tuned neurons provides a general basis for neurons tuned to different recognition tasks.