7 resultados para Reference Curves
em Massachusetts Institute of Technology
Resumo:
The inferior temporal cortex (IT) of monkeys is thought to play an essential role in visual object recognition. Inferotemporal neurons are known to respond to complex visual stimuli, including patterns like faces, hands, or other body parts. What is the role of such neurons in object recognition? The present study examines this question in combined psychophysical and electrophysiological experiments, in which monkeys learned to classify and recognize novel visual 3D objects. A population of neurons in IT were found to respond selectively to such objects that the monkeys had recently learned to recognize. A large majority of these cells discharged maximally for one view of the object, while their response fell off gradually as the object was rotated away from the neuron"s preferred view. Most neurons exhibited orientation-dependent responses also during view-plane rotations. Some neurons were found tuned around two views of the same object, while a very small number of cells responded in a view- invariant manner. For five different objects that were extensively used during the training of the animals, and for which behavioral performance became view-independent, multiple cells were found that were tuned around different views of the same object. No selective responses were ever encountered for views that the animal systematically failed to recognize. The results of our experiments suggest that neurons in this area can develop a complex receptive field organization as a consequence of extensive training in the discrimination and recognition of objects. Simple geometric features did not appear to account for the neurons" selective responses. These findings support the idea that a population of neurons -- each tuned to a different object aspect, and each showing a certain degree of invariance to image transformations -- may, as an assembly, encode complex 3D objects. In such a system, several neurons may be active for any given vantage point, with a single unit acting like a blurred template for a limited neighborhood of a single view.
Resumo:
The Saliency Network proposed by Shashua and Ullman is a well-known approach to the problem of extracting salient curves from images while performing gap completion. This paper analyzes the Saliency Network. The Saliency Network is attractive for several reasons. First, the network generally prefers long and smooth curves over short or wiggly ones. While computing saliencies, the network also fills in gaps with smooth completions and tolerates noise. Finally, the network is locally connected, and its size is proportional to the size of the image. Nevertheless, our analysis reveals certain weaknesses with the method. In particular, we show cases in which the most salient element does not lie on the perceptually most salient curve. Furthermore, in some cases the saliency measure changes its preferences when curves are scaled uniformly. Also, we show that for certain fragmented curves the measure prefers large gaps over a few small gaps of the same total size. In addition, we analyze the time complexity required by the method. We show that the number of steps required for convergence in serial implementations is quadratic in the size of the network, and in parallel implementations is linear in the size of the network. We discuss problems due to coarse sampling of the range of possible orientations. We show that with proper sampling the complexity of the network becomes cubic in the size of the network. Finally, we consider the possibility of using the Saliency Network for grouping. We show that the Saliency Network recovers the most salient curve efficiently, but it has problems with identifying any salient curve other than the most salient one.
Resumo:
We address mid-level vision for the recognition of non-rigid objects. We align model and image using frame curves - which are object or "figure/ground" skeletons. Frame curves are computed, without discontinuities, using Curved Inertia Frames, a provably global scheme implemented on the Connection Machine, based on: non-cartisean networks; a definition of curved axis of inertia; and a ridge detector. I present evidence against frame alignment in human perception. This suggests: frame curves have a role in figure/ground segregation and in fuzzy boundaries; their outside/near/top/ incoming regions are more salient; and that perception begins by setting a reference frame (prior to early vision), and proceeds by processing convex structures.
Resumo:
MIT Scheme is an implementation of the Scheme programming language that runs on many popular workstations. The MIT Scheme Reference Manual describes the special forms, procedures, and datatypes provided by the implementation for use by application programmers.
Resumo:
Dynamic systems which undergo rapid motion can excite natural frequencies that lead to residual vibration at the end of motion. This work presents a method to shape force profiles that reduce excitation energy at the natural frequencies in order to reduce residual vibration for fast moves. Such profiles are developed using a ramped sinusoid function and its harmonics, choosing coefficients to reduce spectral energy at the natural frequencies of the system. To improve robustness with respect to parameter uncertainty, spectral energy is reduced for a range of frequencies surrounding the nominal natural frequency. An additional set of versine profiles are also constructed to permit motion at constant speed for velocity-limited systems. These shaped force profiles are incorporated into a simple closed-loop system with position and velocity feedback. The force input is doubly integrated to generate a shaped position reference for the controller to follow. This control scheme is evaluated on the MIT Cartesian Robot. The shaped inputs generate motions with minimum residual vibration when actuator saturation is avoided. Feedback control compensates for the effect of friction Using only a knowledge of the natural frequencies of the system to shape the force inputs, vibration can also be attenuated in modes which vibrate in directions other than the motion direction. When moving several axes, the use of shaped inputs allows minimum residual vibration even when the natural frequencies are dynamically changing by a limited amount.
Resumo:
The conceptual component of this work is about "reference surfaces'' which are the dual of reference frames often used for shape representation purposes. The theoretical component of this work involves the question of whether one can find a unique (and simple) mapping that aligns two arbitrary perspective views of an opaque textured quadric surface in 3D, given (i) few corresponding points in the two views, or (ii) the outline conic of the surface in one view (only) and few corresponding points in the two views. The practical component of this work is concerned with applying the theoretical results as tools for the task of achieving full correspondence between views of arbitrary objects.
Resumo:
A program that simulates a Digital Equipment Corporation PDP-11 computer and many of its peripherals on the AI Laboratory Time Sharing System (ITS) is described from a user's reference point of view. This simulator has a built in DDT-like command level which provides the user with the normal range of DDT facilities but also with several special debugging features built into the simulator. The DDT command language was implemented by Richard M. Stallman while the simulator was written by the author of this memo.