49 results for Visual Object Recognition

in Deakin Research Online - Australia


Relevance:

100.00%

Abstract:

Traditional methods of object recognition are reliant on shape and so are very difficult to apply in cluttered, wide-angle and low-detail views such as surveillance scenes. To address this, a method of indirect object recognition is proposed, where human activity is used to infer both the location and identity of objects. No shape analysis is necessary. The concept is dubbed 'interaction signatures', since the premise is that a human will interact with objects in ways characteristic of the function of that object: for example, a person sits in a chair and drinks from a cup. The human-centred approach means that recognition is possible in low-detail views and is largely invariant to the shape of objects within the same functional class. This paper implements a Bayesian network for classifying region patches with object labels, building upon our previous work in automatically segmenting and recognising a human's interactions with the objects. Experiments show that interaction signatures can successfully find and label objects in low-detail views and are equally effective at recognising test objects that differ markedly in appearance from the training objects.

Relevance:

100.00%

Abstract:

Many vision problems, such as motion segmentation and face clustering, deal with high-dimensional data. However, these high-dimensional data usually lie in a low-dimensional structure. Sparse representation is a powerful principle for solving a number of clustering problems with high-dimensional data. This principle is motivated by an ideal modeling of data points according to linear algebra theory. However, real data in computer vision are unlikely to follow the ideal model perfectly. In this paper, we exploit mixed norm regularization for sparse subspace clustering. The regularization term is a convex combination of the l1 norm, which promotes sparsity at the individual level, and the block norm l2/1, which promotes group sparsity. Combining these powerful regularization terms provides a more accurate modeling, subsequently leading to a better solution for the affinity matrix used in sparse subspace clustering. This could help us achieve better performance on motion segmentation and face clustering problems. The formulation also caters for different types of data corruption. To solve it, we derive a provably convergent and computationally efficient algorithm based on the alternating direction method of multipliers (ADMM) framework. We demonstrate that this formulation outperforms other state-of-the-art methods on both motion segmentation and face clustering.
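
To make the regularization term concrete, the following is a minimal sketch of a convex combination of the l1 norm and the l2/1 block norm of a coefficient matrix. The function names, the weight `alpha`, and the choice of rows as the groups are assumptions made for illustration; the abstract does not specify them, and the ADMM solver itself is not shown.

```python
import math

def l1_norm(C):
    """Sum of absolute values of all entries (sparsity at the individual level)."""
    return sum(abs(x) for row in C for x in row)

def l21_norm(C):
    """Sum of the Euclidean norms of the rows (group sparsity; rows as groups is an assumption)."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in C)

def mixed_norm(C, alpha):
    """Convex combination of the two norms; alpha in [0, 1] is a hypothetical weight."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * l1_norm(C) + (1.0 - alpha) * l21_norm(C)

# Toy coefficient matrix: one active row, one all-zero row.
C = [[3.0, 4.0], [0.0, 0.0]]
penalty = mixed_norm(C, alpha=0.5)
```

Setting `alpha` to 1 recovers a pure l1 penalty and setting it to 0 recovers a pure l2/1 penalty, which is what makes the combination convex in the weight.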

Relevance:

90.00%

Abstract:

The problem of deriving spatial relationships between objects in general requires a high-level abstract representation, and it would pose difficulties even for a human observer. Based on a formalism for spatial layouts proposed earlier, we present methods for deducing spatial relations between objects by an active, sighted agent in a large-scale environment. The deduction of spatial relations is based on simple visual clues, and thus this technique is more feasible than schemes that rely on complex object recognition.

Relevance:

90.00%

Abstract:

The problem of deriving spatial relationships between objects in general requires a high-level abstract representation, and it would pose difficulties even for a human observer. Based on a formalism for spatial layouts proposed earlier [KiV92, VeK92], we present methods for deducing high-level spatial relations between objects by an active, sighted agent in a large-scale environment. The deduction of spatial relations is based on simple visual clues, and thus this technique is more feasible than schemes that rely on complex object recognition.

Relevance:

90.00%

Abstract:

The problem of 3D object recognition is of immense practical importance, with the last decade witnessing a number of breakthroughs in the state of the art. Most of the previous work has focused on the matching of textured objects using local appearance descriptors extracted around salient image points. The recently proposed bag of boundaries method was the first to address directly the problem of matching smooth objects using boundary features. However, no previous work has attempted to achieve a holistic treatment of the problem by jointly using textural and shape features, which is what we describe herein. Due to the complementarity of the two modalities, we fuse the corresponding matching scores and learn their relative weighting in a data-specific manner by optimizing discriminative performance on synthetically distorted data. For the textural description of an object we adopt a representation in the form of a histogram of SIFT-based visual words. Similarly, the apparent shape of an object is represented by a histogram of discretized features capturing local shape. On a large public database of a diverse set of objects, the proposed method is shown to outperform significantly both purely textural and purely shape-based approaches for matching across viewpoint variation.
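
The score fusion described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: the linear fusion rule, the grid search over the weight `w`, the rank-1 accuracy criterion, and all names and toy scores are hypothetical. The abstract only says that the relative weighting is learned by optimizing discriminative performance on synthetically distorted data.

```python
def fuse(texture_score, shape_score, w):
    """Weighted sum of the two matching scores; w is the weight on texture (assumed form)."""
    return w * texture_score + (1.0 - w) * shape_score

def rank1_accuracy(queries, w):
    """queries: list of (true_label, {candidate_label: (texture_score, shape_score)})."""
    correct = 0
    for true_label, candidates in queries:
        # Pick the candidate with the highest fused score.
        best = max(candidates, key=lambda lab: fuse(*candidates[lab], w))
        correct += best == true_label
    return correct / len(queries)

def learn_weight(queries, steps=10):
    """Grid search for the weight that maximizes rank-1 accuracy on held-out queries."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda w: rank1_accuracy(queries, w))

# Toy held-out queries standing in for synthetically distorted data (made-up scores).
queries = [
    ("cup",   {"cup": (0.9, 0.3), "chair": (0.6, 0.5)}),  # texture is decisive here
    ("chair", {"cup": (0.5, 0.1), "chair": (0.4, 0.9)}),  # shape is decisive here
]
w = learn_weight(queries)
```

On this toy data neither modality alone matches both queries correctly, while an intermediate weight does, which is the complementarity the abstract refers to.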

Relevance:

90.00%

Abstract:

The problem of object recognition is of immense practical importance and potential, and the last decade has witnessed a number of breakthroughs in the state of the art. Most of the past object recognition work focuses on textured objects and local appearance descriptors extracted around salient points in an image. These methods fail in the matching of smooth, untextured objects for which salient point detection does not produce robust results. The recently proposed bag of boundaries (BoB) method is the first to directly address this problem. Since the texture of smooth objects is largely uninformative, BoB focuses on describing and matching objects based on their post-segmentation boundaries. Herein we address three major weaknesses of this work. The first of these is the uniform treatment of all boundary segments. Instead, we describe a method for detecting the locations and scales of salient boundary segments. Secondly, while the BoB method uses an image based elementary descriptor (HoGs + occupancy matrix), we propose a more compact descriptor based on the local profile of boundary normals’ directions. Lastly, we conduct a far more systematic evaluation, both of the bag of boundaries method and the method proposed here. Using a large public database, we demonstrate that our method exhibits greater robustness while at the same time achieving a major computational saving – object representation is extracted from an image in only 6% of the time needed to extract a bag of boundaries, and the storage requirement is similarly reduced to less than 8%.

Relevance:

90.00%

Abstract:

This paper proposes a novel general framework for line segment perception, which is motivated by the biological visual cortex and requires no parameter tuning. In this framework, we design a model to approximate the receptive fields of simple cells. More importantly, the structure of biological orientation columns is imitated by organizing artificial complex and hypercomplex cells with the same orientation into independent arrays. In addition, an interaction mechanism is implemented by a set of self-organization rules. Inspired by visual topological theory, the outputs of these artificial cells are integrated to generate line segments that can describe nonlocal structural information of images. Each line segment is evaluated quantitatively by its significance. The computational complexity is also analyzed. The proposed method is tested and compared to state-of-the-art algorithms on real images with complex scenes and strong noise. The experiments demonstrate that our method outperforms the existing methods in the balance between conciseness and completeness.

Relevance:

90.00%

Abstract:

Monitoring marine objects is important for understanding the marine ecosystem and evaluating the impacts of different environmental changes. One prerequisite of monitoring is to identify targets of interest. Traditionally, the target objects are recognized by trained scientists through towed nets and human observation, which incur considerable cost and risk to operators and creatures. In comparison, a noninvasive approach that sets up a camera and seeks objects in images is more promising. In this paper, a novel technique of object detection in images is presented, which is applicable to generic objects. A robust background modelling algorithm is proposed to extract foregrounds, and blob features are then introduced to classify them. Particular marine objects, box jellyfish and sea snakes, are successfully detected in our work. Experiments conducted on image datasets collected by the Australian Institute of Marine Science (AIMS) demonstrate the effectiveness of the proposed technique.

Relevance:

80.00%

Abstract:

This paper reports a single case of ipsilesional left neglect dyslexia and interprets it according to the three-level model of visual word recognition proposed by Caramazza and Hillis (1990). The three levels reflect a progression from the physical stimulus to an abstract representation of a word. RR was not impaired at the first, retinocentric, level, which represents the individual features of letters within a word according to the location of the word in the visual field: She made the same number of errors to words presented in her left visual field as in her right visual field. A deficit at this level should also mean the patient neglects all stimuli. This did not occur with RR: She did not neglect when naming the items in rows of objects and rows of geometric symbols. In addition, although she displayed significant neglect dyslexia when making visual matching judgements on pairs of words and nonwords, she did not do so to pairs of nonsense letter shapes, shapes which display the same level of visual complexity as letters in words. RR was not impaired at the third, graphemic, level, which represents the ordinal positions of letters within a word: She continued to neglect the leftmost (spatial) letter of words presented in mirror-reversed orientation and she did not neglect in oral spelling. By elimination, these results suggest RR's deficit affects a spatial reference frame where the representational space is bounded by the stimulus: A stimulus-centred level of representation. We define five characteristics of a stimulus-centred deficit, as manifest in RR. First, it is not the case that neglect dyslexia occurs because the remaining letters in a string attract or capture attention away from the leftmost letter(s). Second, the deficit is continuous across the letter string. Third, perceptually significant features, such as spaces, define potential words. Fourth, the whole, rather than part, of a letter is neglected. Fifth, category information is preserved. 
It is concluded that the Caramazza-Hillis model accounts well for RR's data, although we conclude that neglect dyslexia can be present when a more general visuospatial neglect is absent.

Relevance:

80.00%

Abstract:

Two little noticed cases in which William Macewen used symptoms of visual agnosia to plan brain surgery on the angular gyrus are reviewed and evaluated. Following a head injury, Macewen’s first patient had an immediate and severe visual object agnosia that lasted for about 2 weeks. After that he gradually became homicidal and depressed and it was for those symptoms that Macewen first saw him, some 11 months after the accident. From his examination, Macewen concluded that the agnosia clearly indicated a lesion in “the posterior portion of the operculum or in the angular gyrus.” When he removed parts of the internal table that had penetrated those structures the homicidal impulses disappeared. Macewen’s second patient was seen for a chronic middle ear infection and, although neither aphasic nor deaf, was ‘word deaf.’ Slightly later he became ‘psychically blind’ as well. Macewen suspected a cerebral abscess pressing on both the angular gyrus and the first temporal convolution. A large subdural abscess was found there and the symptoms disappeared after it was treated. The patients are discussed and Macewen’s positive results analysed in the historical context of the dispute over the proposed role of the angular gyrus as the visual centre.

Relevance:

80.00%

Abstract:

This paper compares the accuracy and performance of two machine learning approaches for visual detection and tracking of vehicles in an on-road image sequence. The first is a neural network based approach, in which a multiresolution technique based on Haar basis functions is used to obtain the image at different scales, after which classification is carried out with a multilayer feed-forward neural network. Principal Component Analysis (PCA) is used for dimension reduction to make the classification process much more efficient. The second approach is based on boosting, which also yields very good detection rates. In general, boosting is one of the most important developments in classification methodology. It works by sequentially applying a classification algorithm to reweighted versions of the training data, followed by taking a weighted majority vote of the sequence of classifiers thus produced. For this work, a strong classifier was trained by the AdaBoost algorithm. The results of comparing the two methodologies vis-à-vis show the effectiveness of the methods that have been used.
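
The boosting procedure summarized above (sequential training on reweighted data, then a weighted majority vote) can be illustrated with a small AdaBoost sketch. The weak learner here is a one-dimensional threshold stump and the data are a made-up toy set; both are assumptions for illustration, since the abstract does not state which weak classifiers or features were used.

```python
import math

def stump_predict(x, thr, polarity):
    """Threshold stump: predicts `polarity` when x >= thr, otherwise the opposite label."""
    return polarity if x >= thr else -polarity

def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump on 1-D data."""
    best = None
    for thr in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, thr, polarity) != y)
            if best is None or err < best[0]:
                best = (err, thr, polarity)
    return best

def adaboost(xs, ys, rounds):
    """Train `rounds` stumps sequentially on reweighted versions of the training data."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, thr, pol = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, thr, pol))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, thr, pol))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted majority vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(x, thr, pol) for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Toy labels (+1 / -1) that no single threshold can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
predictions = [predict(ensemble, x) for x in xs]
```

No single stump fits this toy set, but the weighted vote of three stumps does, which is the sense in which boosting builds a strong classifier from weak ones.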

Relevance:

80.00%

Abstract:

While the primary purpose of edge detection schemes is to be able to produce an edge map of a given image, the ability to distinguish between different feature types is also of importance. In this paper we examine feature classification based on local energy detection and show that local energy measures are intrinsically capable of making this classification because of the use of odd and even filters. The advantage of feature classification is that it allows for the elimination of certain feature types from the edge map, thus simplifying the task of object recognition.
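
As a rough one-dimensional illustration of why pairing odd and even filters allows feature classification, the sketch below computes a local energy value as the magnitude of the even and odd responses and labels the energy peak by which response dominates. The three-tap kernels are simple stand-ins chosen for readability and are not a true quadrature pair; the labels, function names, and test signals are likewise assumptions, not the paper's method.

```python
import math

EVEN = (-1.0, 2.0, -1.0)  # symmetric kernel (assumed stand-in): responds to lines
ODD = (-1.0, 0.0, 1.0)    # antisymmetric kernel (assumed stand-in): responds to steps

def respond(signal, kernel, i):
    """Correlate a 3-tap kernel with the signal, centred on index i."""
    return sum(k * signal[i + j - 1] for j, k in enumerate(kernel))

def classify_peak(signal):
    """Return (index, label) of the sample with the highest local energy."""
    best_i, best_energy, best_label = None, -1.0, None
    for i in range(1, len(signal) - 1):
        even = respond(signal, EVEN, i)
        odd = respond(signal, ODD, i)
        energy = math.hypot(even, odd)  # local energy: sqrt(even^2 + odd^2)
        if energy > best_energy:
            # The dominant filter response indicates the feature type.
            label = "line" if abs(even) > abs(odd) else "step edge"
            best_i, best_energy, best_label = i, energy, label
    return best_i, best_label

step = [0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0]  # smoothed step centred on index 3
line = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # thin line at index 3
step_result = classify_peak(step)
line_result = classify_peak(line)
```

Once each peak carries a label, a given feature type can be dropped from the edge map by filtering on that label, which is the simplification for object recognition that the abstract mentions.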