999 resultados para object categorization


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we present an improved model for line and edge detection in cortical area V1. This model is based on responses of simple and complex cells, and it is multi-scale with no free parameters. We illustrate the use of the multi-scale line/edge representation in different processes: visual reconstruction or brightness perception, automatic scale selection and object segregation. A two-level object categorization scenario is tested in which pre-categorization is based on coarse scales only and final categorization on coarse plus fine scales. We also present a multi-scale object and face recognition model. Processing schemes are discussed in the framework of a complete cortical architecture. The fact that brightness perception and object recognition may be based on the same symbolic image representation is an indication that the entire (visual) cortex is involved in consciousness.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis addresses the problem of categorizing natural objects. To provide a criteria for categorization we propose that the purpose of a categorization is to support the inference of unobserved properties of objects from the observed properties. Because no such set of categories can be constructed in an arbitrary world, we present the Principle of Natural Modes as a claim about the structure of the world. We first define an evaluation function that measures how well a set of categories supports the inference goals of the observer. Entropy measures for property uncertainty and category uncertainty are combined through a free parameter that reflects the goals of the observer. Natural categorizations are shown to be those that are stable with respect to this free parameter. The evaluation function is tested in the domain of leaves and is found to be sensitive to the structure of the natural categories corresponding to the different species. We next develop a categorization paradigm that utilizes the categorization evaluation function in recovering natural categories. A statistical hypothesis generation algorithm is presented that is shown to be an effective categorization procedure. Examples drawn from several natural domains are presented, including data known to be a difficult test case for numerical categorization techniques. We next extend the categorization paradigm such that multiple levels of natural categories are recovered; by means of recursively invoking the categorization procedure both the genera and species are recovered in a population of anaerobic bacteria. Finally, a method is presented for evaluating the utility of features in recovering natural categories. This method also provides a mechanism for determining which features are constrained by the different processes present in a multiple modal world.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Sparse representation has been introduced to address many recognition problems in computer vision. In this paper, we propose a new framework for object categorization based on sparse representation of local features. Unlike most of previous sparse coding based methods in object classification that only use sparse coding to extract high-level features, the proposed method incorporates sparse representation and classification into a unified framework. Therefore, it does not need a further classifier. Experimental results show that the proposed method achieved better or comparable accuracy than the well known bag-of-features representation with various classifiers.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper we present an improved scheme for line and edge detection in cortical area V1, based on responses of simple and complex cells, truly multi-scale with no free parameters. We illustrate the multi-scale representation for visual reconstruction, and show how object segregation can be achieved with coarse-to-finescale groupings. A two-level object categorization scenario is tested in which pre-categorization is based on coarse scales only, and final categorization on coarse plus fine scales. Processing schemes are discussed in the framework of a complete cortical architecture.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Keypoints (junctions) provide important information for focus-of-attention (FoA) and object categorization/recognition. In this paper we analyze the multi-scale keypoint representation, obtained by applying a linear and quasi-continuous scaling to an optimized model of cortical end-stopped cells, in order to study its importance and possibilities for developing a visual, cortical architecture.We show that keypoints, especially those which are stable over larger scale intervals, can provide a hierarchically structured saliency map for FoA and object recognition. In addition, the application of non-classical receptive field inhibition to keypoint detection allows to distinguish contour keypoints from texture (surface) keypoints.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper(1) presents novel algorithms and applications for a particular class of mixed-norm regularization based Multiple Kernel Learning (MKL) formulations. The formulations assume that the given kernels are grouped and employ l(1) norm regularization for promoting sparsity within RKHS norms of each group and l(s), s >= 2 norm regularization for promoting non-sparse combinations across groups. Various sparsity levels in combining the kernels can be achieved by varying the grouping of kernels-hence we name the formulations as Variable Sparsity Kernel Learning (VSKL) formulations. While previous attempts have a non-convex formulation, here we present a convex formulation which admits efficient Mirror-Descent (MD) based solving techniques. The proposed MD based algorithm optimizes over product of simplices and has a computational complexity of O (m(2)n(tot) log n(max)/epsilon(2)) where m is no. training data points, n(max), n(tot) are the maximum no. kernels in any group, total no. kernels respectively and epsilon is the error in approximating the objective. A detailed proof of convergence of the algorithm is also presented. Experimental results show that the VSKL formulations are well-suited for multi-modal learning tasks like object categorization. Results also show that the MD based algorithm outperforms state-of-the-art MKL solvers in terms of computational efficiency.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We propose a new learning method to infer a mid-level feature representation that combines the advantage of semantic attribute representations with the higher expressive power of non-semantic features. The idea lies in augmenting an existing attribute-based representation with additional dimensions for which an autoencoder model is coupled with a large-margin principle. This construction allows a smooth transition between the zero-shot regime with no training example, the unsupervised regime with training examples but without class labels, and the supervised regime with training examples and with class labels. The resulting optimization problem can be solved efficiently, because several of the necessity steps have closed-form solutions. Through extensive experiments we show that the augmented representation achieves better results in terms of object categorization accuracy than the semantic representation alone. © 2012 Springer-Verlag.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Lines and edges provide important information for object categorization and recognition. In addition, one brightness model is based on a symbolic interpretation of the cortical multi-scale line/edge representation. In this paper we present an improved scheme for line/edge extraction from simple and complex cells and we illustrate the multi-scale representation. This representation can be used for visual reconstruction, but also for nonphotorealistic rendering. Together with keypoints and a new model of disparity estimation, a 3D wireframe representation of e.g. faces can be obtained in the future.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We propose a joint representation and classification framework that achieves the dual goal of finding the most discriminative sparse overcomplete encoding and optimal classifier parameters. Formulating an optimization problem that combines the objective function of the classification with the representation error of both labeled and unlabeled data, constrained by sparsity, we propose an algorithm that alternates between solving for subsets of parameters, whilst preserving the sparsity. The method is then evaluated over two important classification problems in computer vision: object categorization of natural images using the Caltech 101 database and face recognition using the Extended Yale B face database. The results show that the proposed method is competitive against other recently proposed sparse overcomplete counterparts and considerably outperforms many recently proposed face recognition techniques when the number training samples is small.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Given arbitrary pictures, we explore the possibility of using new techniques from computer vision and artificial intelligence to create customized visual games on-the-fly. This includes coloring books, link-the-dot and spot-the-difference popular games. The feasibility of these systems is discussed and we describe prototype implementation that work well in practice in an automatic or semi-automatic way.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Today, most conventional surveillance networks are based on analog system, which has a lot of constraints like manpower and high-bandwidth requirements. It becomes the barrier for today's surveillance network development. This dissertation describes a digital surveillance network architecture based on the H.264 coding/decoding (CODEC) System-on-a-Chip (SoC) platform. The proposed digital surveillance network architecture includes three major layers: software layer, hardware layer, and the network layer. The following outlines the contributions to the proposed digital surveillance network architecture. (1) We implement an object recognition system and an object categorization system on the software layer by applying several Digital Image Processing (DIP) algorithms. (2) For better compression ratio and higher video quality transfer, we implement two new modules on the hardware layer of the H.264 CODEC core, i.e., the background elimination module and the Directional Discrete Cosine Transform (DDCT) module. (3) Furthermore, we introduce a Digital Signal Processor (DSP) sub-system on the main bus of H.264 SoC platforms as the major hardware support system for our software architecture. Thus we combine the software and hardware platforms to be an intelligent surveillance node. Lab results show that the proposed surveillance node can dramatically save the network resources like bandwidth and storage capacity.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Long-lasting interference effects in picture naming are induced when objects are presented in categorically related contexts in both continuous and blocked cyclic paradigms. Less consistent context effects have been reported when the task is changed to semantic classification. Experiment 1 confirmed the recent finding of cumulative facilitation in the continuous paradigm with living/non-living superordinate categorization. To avoid a potential confound involving participants responding with the identical superordinate category in related contexts in the blocked cyclic paradigm, we devised a novel set of categorically related objects that also varied in terms of relative age – a core semantic type associated with the adjective word class across languages. Experiment 2 demonstrated the typical interference effect with these stimuli in basic level naming. In Experiment 3, using the identical blocked cyclic paradigm, we failed to observe semantic context effects when the same pictures were classified as younger–older. Overall, the results indicate the semantic context effects in the two paradigms do not share a common origin, with the effect in the continuous paradigm arising at the level of conceptual representations or in conceptual-to-lexical connections while the effect in the blocked cyclic paradigm most likely originates at a lexical level of representation. The implications of these findings for current accounts of long-lasting interference effects in spoken word production are discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we focus on the challenging problem of place categorization and semantic mapping on a robot with-out environment-specific training. Motivated by their ongoing success in various visual recognition tasks, we build our system upon a state-of-the-art convolutional network. We overcome its closed-set limitations by complementing the network with a series of one-vs-all classifiers that can learn to recognize new semantic classes online. Prior domain knowledge is incorporated by embedding the classification system into a Bayesian filter framework that also ensures temporal coherence. We evaluate the classification accuracy of the system on a robot that maps a variety of places on our campus in real-time. We show how semantic information can boost robotic object detection performance and how the semantic map can be used to modulate the robot’s behaviour during navigation tasks. The system is made available to the community as a ROS module.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

How do we perform rapid visual categorization?It is widely thought that categorization involves evaluating the similarity of an object to other category items, but the underlying features and similarity relations remain unknown. Here, we hypothesized that categorization performance is based on perceived similarity relations between items within and outside the category. To this end, we measured the categorization performance of human subjects on three diverse visual categories (animals, vehicles, and tools) and across three hierarchical levels (superordinate, basic, and subordinate levels among animals). For the same subjects, we measured their perceived pair-wise similarities between objects using a visual search task. Regardless of category and hierarchical level, we found that the time taken to categorize an object could be predicted using its similarity to members within and outside its category. We were able to account for several classic categorization phenomena, such as (a) the longer times required to reject category membership; (b) the longer times to categorize atypical objects; and (c) differences in performance across tasks and across hierarchical levels. These categorization times were also accounted for by a model that extracts coarse structure from an image. The striking agreement observed between categorization and visual search suggests that these two disparate tasks depend on a shared coarse object representation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

While navigating in an environment, a vision system has to be able to recognize where it is and what the main objects in the scene are. In this paper we present a context-based vision system for place and object recognition. The goal is to identify familiar locations (e.g., office 610, conference room 941, Main Street), to categorize new environments (office, corridor, street) and to use that information to provide contextual priors for object recognition (e.g., table, chair, car, computer). We present a low-dimensional global image representation that provides relevant information for place recognition and categorization, and how such contextual information introduces strong priors that simplify object recognition. We have trained the system to recognize over 60 locations (indoors and outdoors) and to suggest the presence and locations of more than 20 different object types. The algorithm has been integrated into a mobile system that provides real-time feedback to the user.