947 resultados para Grandmother Model for Vision


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Understanding how the human visual system recognizes objects is one of the key challenges in neuroscience. Inspired by a large body of physiological evidence (Felleman and Van Essen, 1991; Hubel and Wiesel, 1962; Livingstone and Hubel, 1988; Tso et al., 2001; Zeki, 1993), a general class of recognition models has emerged which is based on a hierarchical organization of visual processing, with succeeding stages being sensitive to image features of increasing complexity (Hummel and Biederman, 1992; Riesenhuber and Poggio, 1999; Selfridge, 1959). However, these models appear to be incompatible with some well-known psychophysical results. Prominent among these are experiments investigating recognition impairments caused by vertical inversion of images, especially those of faces. It has been reported that faces that differ "featurally" are much easier to distinguish when inverted than those that differ "configurally" (Freire et al., 2000; Le Grand et al., 2001; Mondloch et al., 2002) ??finding that is difficult to reconcile with the aforementioned models. Here we show that after controlling for subjects' expectations, there is no difference between "featurally" and "configurally" transformed faces in terms of inversion effect. This result reinforces the plausibility of simple hierarchical models of object representation and recognition in cortex.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Usually, generalization is considered as a function of learning from a set of examples. In present work on the basis of recent neural network assembly memory model (NNAMM), a biologically plausible 'grandmother' model for vision, where each separate memory unit itself can generalize, has been proposed. For such a generalization by computation through memory, analytical formulae and numerical procedure are found to calculate exactly the perfectly learned memory unit's generalization ability. The model's memory has complex hierarchical structure, can be learned from one example by a one-step process, and may be considered as a semi-representational one. A simple binary neural network for bell-shaped tuning is described.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The problem of estimating pseudobearing rate information of an airborne target based on measurements from a vision sensor is considered. Novel image speed and heading angle estimators are presented that exploit image morphology, hidden Markov model (HMM) filtering, and relative entropy rate (RER) concepts to allow pseudobearing rate information to be determined before (or whilst) the target track is being estimated from vision information.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a novel hypothesis on the function of massive feedback pathways in mammalian visual systems. We propose that the cortical feature detectors compete not for the right to represent the output at a point, but for exclusive rights to abstract and represent part of the underlying input. Feedback can do this very naturally. A computational model that implements the above idea for the problem of line detection is presented and based on that we suggest a functional role for the thalamo-cortical loop during perception of lines. We show that the model successfully tackles the so called Cross problem. Based on some recent experimental results, we discuss the biological plausibility of our model. We also comment on the relevance of our hypothesis (on the role of feedback) to general sensory information processing and recognition. (C) 1998 Published by Elsevier Science Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Model based vision allows use of prior knowledge of the shape and appearance of specific objects to be used in the interpretation of a visual scene; it provides a powerful and natural way to enforce the view consistency constraint. A model based vision system has been developed within ESPRIT VIEWS: P2152 which is able to classify and track moving objects (cars and other vehicles) in complex, cluttered traffic scenes. The fundamental basis of the method has been previously reported. This paper presents recent developments which have extended the scope of the system to include (i) multiple cameras, (ii) variable camera geometry, and (iii) articulated objects. All three enhancements have easily been accommodated within the original model-based approach

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In the quest for shorter time-to-market, higher quality and reduced cost, model-driven software development has emerged as a promising approach to software engineering. The central idea is to promote models to first-class citizens in the development process. Starting from a set of very abstract models in the early stage of the development, they are refined into more concrete models and finally, as a last step, into code. As early phases of development focus on different concepts compared to later stages, various modelling languages are employed to most accurately capture the concepts and relations under discussion. In light of this refinement process, translating between modelling languages becomes a time-consuming and error-prone necessity. This is remedied by model transformations providing support for reusing and automating recurring translation efforts. These transformations typically can only be used to translate a source model into a target model, but not vice versa. This poses a problem if the target model is subject to change. In this case the models get out of sync and therefore do not constitute a coherent description of the software system anymore, leading to erroneous results in later stages. This is a serious threat to the promised benefits of quality, cost-saving, and time-to-market. Therefore, providing a means to restore synchronisation after changes to models is crucial if the model-driven vision is to be realised. This process of reflecting changes made to a target model back to the source model is commonly known as Round-Trip Engineering (RTE). While there are a number of approaches to this problem, they impose restrictions on the nature of the model transformation. Typically, in order for a transformation to be reversed, for every change to the target model there must be exactly one change to the source model. While this makes synchronisation relatively “easy”, it is ill-suited for many practically relevant transformations as they do not have this one-to-one character. To overcome these issues and to provide a more general approach to RTE, this thesis puts forward an approach in two stages. First, a formal understanding of model synchronisation on the basis of non-injective transformations (where a number of different source models can correspond to the same target model) is established. Second, detailed techniques are devised that allow the implementation of this understanding of synchronisation. A formal underpinning for these techniques is drawn from abductive logic reasoning, which allows the inference of explanations from an observation in the context of a background theory. As non-injective transformations are the subject of this research, there might be a number of changes to the source model that all equally reflect a certain target model change. To help guide the procedure in finding “good” source changes, model metrics and heuristics are investigated. Combining abductive reasoning with best-first search and a “suitable” heuristic enables efficient computation of a number of “good” source changes. With this procedure Round-Trip Engineering of non-injective transformations can be supported.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper presents a preliminary flight test based detection range versus false alarm performance characterisation of a morphological-hidden Markov model filtering approach to vision-based airborne dim-target collision detection. On the basis of compelling in-flight collision scenario data, we calculate system operating characteristic (SOC) curves that concisely illustrate the detection range versus false alarm rate performance design trade-offs. These preliminary SOC curves provide a more complete dim-target detection performance description than previous studies (due to the experimental difficulties involved, previous studies have been limited to very short flight data sample sets and hence have not been able to quantify false alarm behaviour). The preliminary investigation here is based on data collected from 4 controlled collision encounters and supporting non-target flight data. This study suggests head-on detection ranges of approximately 2.22 km under blue sky background conditions (1.26 km in cluttered background conditions), whilst experiencing false alarms at a rate less than 1.7 false alarms/hour (ie. less than once every 36 minutes). Further data collection is currently in progress.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This thesis describes the development of a model-based vision system that exploits hierarchies of both object structure and object scale. The focus of the research is to use these hierarchies to achieve robust recognition based on effective organization and indexing schemes for model libraries. The goal of the system is to recognize parameterized instances of non-rigid model objects contained in a large knowledge base despite the presence of noise and occlusion. Robustness is achieved by developing a system that can recognize viewed objects that are scaled or mirror-image instances of the known models or that contain components sub-parts with different relative scaling, rotation, or translation than in models. The approach taken in this thesis is to develop an object shape representation that incorporates a component sub-part hierarchy- to allow for efficient and correct indexing into an automatically generated model library as well as for relative parameterization among sub-parts, and a scale hierarchy- to allow for a general to specific recognition procedure. After analysis of the issues and inherent tradeoffs in the recognition process, a system is implemented using a representation based on significant contour curvature changes and a recognition engine based on geometric constraints of feature properties. Examples of the system's performance are given, followed by an analysis of the results. In conclusion, the system's benefits and limitations are presented.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The Point Distribution Model (PDM) has proven effective in modelling variations in shape in sets of images, including those in which motion is involved such as body and hand tracking. This paper proposes an extension to the PDM through a re-parameterisation of the model which uses factors such as the angular velocity and distance travelled for sets of points on a moving shape. This then enables non-linear quantities such as acceleration and the average velocity of the body to be expressed in a linear model by the PDM. Results are shown for objects with known acceleration and deceleration components, these being a simulated pendulum modelled using simple harmonic motion and video sequences of a real pendulum in motion.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

National Highway Traffic Safety Administration, Washington, D.C.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

National Highway Traffic Safety Administration, Washington, D.C.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

How do signals from the 2 eyes combine and interact? Our recent work has challenged earlier schemes in which monocular contrast signals are subject to square-law transduction followed by summation across eyes and binocular gain control. Much more successful was a new 'two-stage' model in which the initial transducer was almost linear and contrast gain control occurred both pre- and post-binocular summation. Here we extend that work by: (i) exploring the two-dimensional stimulus space (defined by left- and right-eye contrasts) more thoroughly, and (ii) performing contrast discrimination and contrast matching tasks for the same stimuli. Twenty-five base-stimuli made from 1 c/deg patches of horizontal grating, were defined by the factorial combination of 5 contrasts for the left eye (0.3-32%) with five contrasts for the right eye (0.3-32%). Other than in contrast, the gratings in the two eyes were identical. In a 2IFC discrimination task, the base-stimuli were masks (pedestals), where the contrast increment was presented to one eye only. In a matching task, the base-stimuli were standards to which observers matched the contrast of either a monocular or binocular test grating. In the model, discrimination depends on the local gradient of the observer's internal contrast-response function, while matching equates the magnitude (rather than gradient) of response to the test and standard. With all model parameters fixed by previous work, the two-stage model successfully predicted both the discrimination and the matching data and was much more successful than linear or quadratic binocular summation models. These results show that performance measures and perception (contrast discrimination and contrast matching) can be understood in the same theoretical framework for binocular contrast vision. © 2007 VSP.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In model-based vision, there are a huge number of possible ways to match model features to image features. In addition to model shape constraints, there are important match-independent constraints that can efficiently reduce the search without the combinatorics of matching. I demonstrate two specific modules in the context of a complete recognition system, Reggie. The first is a region-based grouping mechanism to find groups of image features that are likely to come from a single object. The second is an interpretive matching scheme to make explicit hypotheses about occlusion and instabilities in the image features.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Modern computer graphics systems are able to construct renderings of such high quality that viewers are deceived into regarding the images as coming from a photographic source. Large amounts of computing resources are expended in this rendering process, using complex mathematical models of lighting and shading. However, psychophysical experiments have revealed that viewers only regard certain informative regions within a presented image. Furthermore, it has been shown that these visually important regions contain low-level visual feature differences that attract the attention of the viewer. This thesis will present a new approach to image synthesis that exploits these experimental findings by modulating the spatial quality of image regions by their visual importance. Efficiency gains are therefore reaped, without sacrificing much of the perceived quality of the image. Two tasks must be undertaken to achieve this goal. Firstly, the design of an appropriate region-based model of visual importance, and secondly, the modification of progressive rendering techniques to effect an importance-based rendering approach. A rule-based fuzzy logic model is presented that computes, using spatial feature differences, the relative visual importance of regions in an image. This model improves upon previous work by incorporating threshold effects induced by global feature difference distributions and by using texture concentration measures. A modified approach to progressive ray-tracing is also presented. This new approach uses the visual importance model to guide the progressive refinement of an image. In addition, this concept of visual importance has been incorporated into supersampling, texture mapping and computer animation techniques. Experimental results are presented, illustrating the efficiency gains reaped from using this method of progressive rendering. This visual importance-based rendering approach is expected to have applications in the entertainment industry, where image fidelity may be sacrificed for efficiency purposes, as long as the overall visual impression of the scene is maintained. Different aspects of the approach should find many other applications in image compression, image retrieval, progressive data transmission and active robotic vision.