31 results for objectivity without objects
Abstract:
The goal of this work is to navigate through an office environment using only visual information gathered from four cameras placed onboard a mobile robot. The method is insensitive to physical changes within the room it is inspecting, such as moving objects. Forward and rotational motion vision are used to find doors and rooms, and these can be used to build topological maps. The map is built without the use of odometry or trajectory integration. The long-term goal of the project described here is for the robot to build simple maps of its environment and to localize itself within this framework.
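To make the map representation concrete, here is a minimal Python sketch of an odometry-free topological map of the kind described, with visually identified places as nodes and detected doors or corridors as edges (the class and the place names are illustrative, not the thesis's implementation):

    # Minimal sketch (not the thesis's code): a topological map as an adjacency
    # structure keyed by visually recognized places, with no metric coordinates.
    from collections import defaultdict

    class TopologicalMap:
        def __init__(self):
            self.edges = defaultdict(set)   # place -> set of adjacent places

        def add_transition(self, from_place, to_place):
            """Record that a door/corridor connects two visually identified places."""
            self.edges[from_place].add(to_place)
            self.edges[to_place].add(from_place)

        def neighbors(self, place):
            return sorted(self.edges[place])

    # Usage: as the robot detects a door and enters a new room, link the places.
    m = TopologicalMap()
    m.add_transition("corridor-A", "room-1")
    m.add_transition("corridor-A", "room-2")
    print(m.neighbors("corridor-A"))        # ['room-1', 'room-2']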
Abstract:
This thesis addresses the problem of recognizing solid objects in the three-dimensional world, using two-dimensional shape information extracted from a single image. Objects can be partly occluded and can occur in cluttered scenes. A model-based approach is taken, in which stored models are matched to an image. The matching problem is separated into two stages, which employ different representations of objects. The first stage uses the smallest possible number of local features to find transformations from a model to an image; this minimizes the amount of search required in recognition. The second stage uses the entire edge contour of an object to verify each transformation, which reduces the chance of finding false matches.
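A hedged sketch of this two-stage hypothesize-and-verify scheme, under the simplifying assumption that the model-to-image transformation is a 2D similarity fixed by two point correspondences (the function names and tolerance are illustrative, not taken from the thesis):

    # Stage 1 fits a transform from the fewest possible features (two points);
    # stage 2 verifies it against the full model contour.
    import numpy as np

    def transform_from_two_points(m1, m2, i1, i2):
        """Similarity transform (scale, rotation, translation) as a complex map z -> a*z + b."""
        m1, m2, i1, i2 = (complex(*p) for p in (m1, m2, i1, i2))
        a = (i2 - i1) / (m2 - m1)          # scale * e^{i*rotation}
        b = i1 - a * m1                    # translation
        return a, b

    def verify(a, b, model_contour, image_edges, tol=2.0):
        """Accept the hypothesis if transformed contour points lie near image edges."""
        model = np.array([complex(*p) for p in model_contour])
        edges = np.array([complex(*p) for p in image_edges])
        projected = a * model + b
        dists = np.abs(projected[:, None] - edges[None, :]).min(axis=1)
        return dists.mean() < tol

    # Model segment (0,0)-(1,0) mapped onto image segment (10,10)-(12,10): scale 2.
    a, b = transform_from_two_points((0, 0), (1, 0), (10, 10), (12, 10))
    print(verify(a, b, [(0, 0), (0.5, 0), (1, 0)], [(10, 10), (11, 10), (12, 10)]))  # True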
Abstract:
This thesis addresses the problem of synthesizing grasps that are force-closure and stable. The synthesis of force-closure grasps constructs independent regions of contact for the fingertips, such that the motion of the grasped object is totally constrained. The synthesis of stable grasps constructs virtual springs at the contacts, such that the grasped object is stable and has a desired stiffness matrix about its stable equilibrium. A grasp on an object is force-closure if and only if we can exert, through the set of contacts, arbitrary forces and moments on the object. Force-closure therefore implies the existence of equilibrium, since zero force and moment are spanned. In the reverse direction, we prove that a non-marginal equilibrium grasp is also a force-closure grasp if it has at least two point contacts with friction in 2D, or two soft-finger contacts or three hard-finger contacts in 3D. Next, we prove that all force-closure grasps can be made stable by using either active or passive springs at the contacts. The thesis develops a simple relation between the stability and stiffness of the grasp and the spatial configuration of the virtual springs at the contacts. The stiffness of the grasp also depends on whether the points of contact stick, or slide without friction on straight or curved surfaces of the object. The thesis presents fast and simple algorithms for directly constructing stable force-closure grasps based on the shape of the grasped object. The formal framework of force-closure and stable grasps offers a partial explanation of why we grasp objects so easily and stably, and of why our fingers are better soft than hard.
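The definition of force-closure quoted above admits a standard computational test, sketched here in Python (a textbook method, not necessarily the thesis's own algorithm): the contact wrenches positively span wrench space exactly when they span the space and the origin lies strictly inside their convex hull, which a small linear program can certify.

    # Maximize delta subject to: sum_i lam_i * w_i = 0, sum_i lam_i = 1,
    # lam_i >= delta. A strictly positive optimum certifies force closure.
    import numpy as np
    from scipy.optimize import linprog

    def is_force_closure(wrenches):
        """wrenches: (n, d) array; friction cones discretized into edge wrenches."""
        W = np.asarray(wrenches, dtype=float)
        n, d = W.shape
        if np.linalg.matrix_rank(W) < d:
            return False                      # wrenches do not even span wrench space
        c = np.zeros(n + 1); c[-1] = -1.0     # linprog minimizes, so minimize -delta
        A_eq = np.zeros((d + 1, n + 1))
        A_eq[:d, :n] = W.T                    # net wrench must be zero
        A_eq[d, :n] = 1.0                     # coefficients sum to one
        b_eq = np.zeros(d + 1); b_eq[d] = 1.0
        A_ub = np.hstack([-np.eye(n), np.ones((n, 1))])   # delta - lam_i <= 0
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                      bounds=[(None, None)] * (n + 1))
        return bool(res.success and -res.fun > 1e-9)

    # Planar pure-force example: four contact forces boxing in the origin.
    print(is_force_closure([[1, 0], [-1, 0], [0, 1], [0, -1]]))   # True
    print(is_force_closure([[1, 0], [-1, 0]]))                    # False: spans one axis only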
Abstract:
We present the results of an implemented system for learning structural prototypes from grey-scale images. We show how to divide an object into subparts and how to encode the properties of these subparts and the relations between them. We discuss the importance of hierarchy and grouping in representing objects and show how a notion of visual similarities can be embedded in the description language. Finally we exhibit a learning algorithm that forms class models from the descriptions produced and uses these models to recognize new members of the class.
Abstract:
An approach to shape description, based on prototype modification and generalized cylinders, has been developed and applied to two object domains, pottery and polyhedra: (1) A program describes and identifies pottery from vase outlines entered as lists of points. The descriptions have been modeled after descriptions by archeologists, with the result that identifications made by the program are remarkably consistent with those of the archeologists. It has been possible to quantify their shape descriptors, which are everyday terms in our language applied to many sorts of objects besides pottery, so that the resulting descriptions seem very natural. (2) New parsing strategies for polyhedra overcome some limitations of previous work. A special feature is that the processes of parsing and identification are carried out simultaneously.
Abstract:
Methods are presented (1) to partition or decompose a visual scene into the bodies forming it; (2) to position these bodies in three-dimensional space, by combining two scenes that make a stereoscopic pair; (3) to find the regions or zones of a visual scene that belong to its background; (4) to carry out the isolation of objects in (1) when the input has inaccuracies. Running computer programs implement the methods, and many examples illustrate their behavior. The input is a two-dimensional line-drawing of the scene, assumed to contain three-dimensional bodies possessing flat faces (polyhedra); some of them may be partially occluded. Suggestions are made for extending the work to curved objects. Some comparisons are made with human visual perception. The main conclusion is that it is possible to separate a picture or scene into the constituent objects exclusively on the basis of monocular geometric properties (on the basis of pure form); in fact, successful methods are shown.
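In the spirit of such decomposition methods, a toy Python abstraction (not the original programs): junctions of the line drawing deposit "links" of evidence between adjacent regions, and regions joined by enough links merge into a single body.

    # Regions connected by at least `threshold` links are merged (union-find).
    from collections import Counter

    def bodies_from_links(regions, links, threshold=2):
        """regions: iterable of region ids; links: list of (region_a, region_b) evidence pairs."""
        parent = {r: r for r in regions}
        def find(r):
            while parent[r] != r:
                parent[r] = parent[parent[r]]   # path halving
                r = parent[r]
            return r
        votes = Counter(tuple(sorted(l)) for l in links)
        for (a, b), n in votes.items():
            if n >= threshold:
                parent[find(a)] = find(b)       # merge strongly linked regions
        groups = {}
        for r in regions:
            groups.setdefault(find(r), []).append(r)
        return list(groups.values())

    # Three faces of one block linked pairwise twice; the background stays separate.
    print(bodies_from_links(["f1", "f2", "f3", "bg"],
                            [("f1", "f2"), ("f1", "f2"), ("f2", "f3"), ("f2", "f3")]))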
Abstract:
This thesis investigates what knowledge is necessary to solve mechanics problems. A program, NEWTON, is described which understands and solves problems in a mechanics mini-world of objects moving on surfaces. Facts and equations such as those given in mechanics texts need to be represented; however, this is far from sufficient to solve problems. Human problem solvers rely on "common sense" and "qualitative" knowledge which physics texts tacitly assume to be present, and a mechanics problem solver must embody such knowledge. Quantitative knowledge given by equations, and the more qualitative common-sense knowledge, are the major research points examined in this thesis. The major issue in solving problems is planning. Planning involves tentatively outlining a possible path to the solution without actually solving the problem; such a plan needs to be constructed and debugged in the process of solving the problem. Envisionment, or qualitative simulation of the event, plays a central role in this planning process.
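A toy illustration of envisionment in Python, over a hypothetical mini-world rather than NEWTON's own: qualitative simulation enumerates the possible outcome histories, branching wherever qualitative knowledge alone cannot decide, as when a ball approaches a rise of unknown height.

    # Qualitatively simulate a ball over a sequence of track segments.
    def envision(segments, state="moving"):
        """segments: list of qualitative slopes; returns possible outcome histories."""
        if not segments:
            return [[("reaches-end", state)]]
        slope, rest = segments[0], segments[1:]
        if slope == "down":                  # ball speeds up: single successor
            return [[("slides-down", "moving")] + h for h in envision(rest)]
        if slope == "flat":                  # keeps moving (friction ignored)
            return [[("rolls-along", "moving")] + h for h in envision(rest)]
        # "up": without magnitudes the ball may clear the rise or stop and
        # roll back, so the envisionment branches.
        return ([[("clears-rise", "moving")] + h for h in envision(rest)]
                + [[("stops-and-rolls-back", "stopped")]])

    for history in envision(["down", "flat", "up"]):
        print(history)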
Abstract:
The flexibility of the robot is the key to its success as a viable aid to production. The flexibility of a robot can be increased in two directions. The first is to increase the physical generality of the robot, such that it can be easily reconfigured to handle a wide variety of tasks. The second is to increase the ability of the robot to interact with its environment, such that tasks can still be successfully completed in the presence of uncertainties. Articulated hands are capable of adapting to a wide variety of grasp shapes, hence reducing the need for special tooling. The availability of low-mass, high-bandwidth points close to the manipulated object also offers significant improvements in the control of fine motions. This thesis provides a framework for using articulated hands to perform local manipulation of objects. In particular, it addresses the issues in effecting compliant motions of objects in Cartesian space. The Stanford/JPL hand is used as an example to illustrate a number of concepts. The examples provide a unified methodology for controlling articulated hands grasping with point contacts. We also present a high-level hand programming system based on the methodologies developed in this thesis. Compliant motion of grasped objects and dexterous manipulations can be easily described in the LISP-based hand programming language.
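A minimal planar sketch of the point-contact bookkeeping this involves (illustrative, not the thesis's controller): the grasp map G stacks each fingertip's contribution to the object wrench, and a least-squares pseudoinverse distributes a desired wrench over the fingertips.

    # Planar point contacts: the object wrench is [fx, fy, tau].
    import numpy as np

    def grasp_map(contacts):
        """contacts: list of 2D contact points on the object (point contacts)."""
        cols = []
        for (px, py) in contacts:
            # Each contact transmits a 2D force f; its wrench is [f, p x f].
            cols.append([1, 0, -py])
            cols.append([0, 1,  px])
        return np.array(cols, dtype=float).T      # shape (3, 2 * n_contacts)

    contacts = [(-1.0, 0.0), (1.0, 0.0)]          # two fingertips gripping across the object
    G = grasp_map(contacts)
    desired_wrench = np.array([0.0, 2.0, 0.0])    # lift straight up, no net torque
    f = np.linalg.pinv(G) @ desired_wrench        # stacked per-contact forces [f1x, f1y, f2x, f2y]
    print(f.reshape(-1, 2))                       # each fingertip pushes up with force 1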
Abstract:
This thesis addresses the problem of developing automatic grasping capabilities for robotic hands. Using a 2-jointed and a 4-jointed model of the hand, we establish the geometric conditions necessary for achieving form-closure grasps of cylindrical objects. We then define and show how to construct the grasping pre-image for quasi-static (friction-dominated) and zero-G (inertia-dominated) motions, for sensorless and sensor-driven grasps, with and without arm motions. While the approach does not rely on detailed modeling, it is computationally inexpensive, reliable, and easy to implement. Example behaviors were successfully implemented on the Salisbury hand and on a planar 2-fingered, 4-degree-of-freedom hand.
Abstract:
A key problem in object recognition is selection, namely, the problem of identifying regions in an image within which to start the recognition process, ideally by isolating regions that are likely to come from a single object. Such a selection mechanism has been found to be crucial in reducing the combinatorial search involved in the matching stage of object recognition. Even though selection is of help in recognition, it has largely remained unsolved because of the difficulty of isolating regions belonging to objects under complex imaging conditions involving occlusions, changing illumination, and changing object appearance. This thesis presents a novel approach to the selection problem by proposing a computational model of visual attentional selection as a paradigm for selection in recognition. In particular, it proposes two modes of attentional selection, namely, the attracted and pay-attention modes, as appropriate for data-driven and model-driven selection in recognition. An implementation of this model has led to new ways of extracting color, texture, and line-group information in images, and to their subsequent use in isolating areas of the scene likely to contain the model object. Among the specific results in this thesis are: a method of specifying color by perceptual color categories for fast color region segmentation and color-based localization of objects, and a result showing that the recognition of texture patterns on model objects is possible under changes in orientation and under occlusions, without detailed segmentation. The thesis also presents an evaluation of the proposed model by integrating it with a 3D-from-2D object recognition system and recording the improvement in performance. These results indicate that attentional selection can significantly overcome the computational bottleneck in object recognition, both through a reduction in the number of features and through a reduction in the number of matches during recognition, using the information derived during selection. Finally, these studies have revealed a surprising use of selection, namely, in the partial solution of the pose of a 3D object.
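A hedged sketch of category-based color segmentation, with uniform hue bins standing in for the thesis's perceptual color categories: quantize each pixel's hue into a coarse named category, then group same-category pixels into connected regions.

    # Category boundaries below are assumptions, not the thesis's categories.
    import numpy as np

    CATEGORIES = ["red", "yellow", "green", "cyan", "blue", "magenta"]

    def categorize(hue):
        """Map hue in [0, 360) to a coarse color category (bin edges assumed)."""
        return CATEGORIES[int(hue // 60) % 6]

    def regions(hue_image):
        """Connected components of same-category pixels (4-connectivity, flood fill)."""
        h, w = hue_image.shape
        label = -np.ones((h, w), dtype=int)
        next_label = 0
        for sy in range(h):
            for sx in range(w):
                if label[sy, sx] >= 0:
                    continue
                cat, stack = categorize(hue_image[sy, sx]), [(sy, sx)]
                label[sy, sx] = next_label
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y+1, x), (y-1, x), (y, x+1), (y, x-1)):
                        if (0 <= ny < h and 0 <= nx < w and label[ny, nx] < 0
                                and categorize(hue_image[ny, nx]) == cat):
                            label[ny, nx] = next_label
                            stack.append((ny, nx))
                next_label += 1
        return label

    img = np.array([[10, 10, 130], [10, 130, 130]], dtype=float)   # red patch, green patch
    print(regions(img))   # [[0 0 1] [0 1 1]]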
Abstract:
A method is presented for the visual analysis of objects by computer. It is particularly well suited for opaque objects with smoothly curved surfaces. The method extracts information about the object's surface properties, including measures of its specularity, texture, and regularity. It also aids in determining the object's shape. The application of this method to a simple recognition task, the recognition of fruit, is discussed. The results on a more complex smoothly curved object, a human face, are also considered.
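Two illustrative proxies, not the paper's actual measures, for what such surface-property estimates might look like computationally: specularity as the fraction of near-saturated pixels (highlights clip toward white) and texture strength as mean local variance, for a grey-scale patch with intensities assumed in [0, 1].

    import numpy as np

    def surface_measures(patch, highlight=0.95, win=3):
        """patch: 2D array of intensities assumed in [0, 1]."""
        p = np.asarray(patch, dtype=float)
        specularity = float((p > highlight).mean())   # fraction of highlight pixels
        h, w = p.shape
        local_vars = [p[y:y + win, x:x + win].var()
                      for y in range(h - win + 1) for x in range(w - win + 1)]
        texture = float(np.mean(local_vars))
        return specularity, texture

    rng = np.random.default_rng(0)
    matte = rng.uniform(0.4, 0.6, (8, 8))         # dull, lightly textured surface
    shiny = matte.copy(); shiny[3:5, 3:5] = 1.0   # same surface with a specular highlight
    print(surface_measures(matte))                # zero specularity
    print(surface_measures(shiny))                # nonzero specularity, higher local variance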
Abstract:
This thesis presents a perceptual system for a humanoid robot that integrates abilities such as object localization and recognition with the deeper developmental machinery required to forge those competences out of raw physical experiences. It shows that a robotic platform can build up and maintain a system for object localization, segmentation, and recognition, starting from very little. What the robot starts with is a direct solution to achieving figure/ground separation: it simply 'pokes around' in a region of visual ambiguity and watches what happens. If the arm passes through an area, that area is recognized as free space. If the arm collides with an object, causing it to move, the robot can use that motion to segment the object from the background. Once the robot can acquire reliable segmented views of objects, it learns from them, and from then on recognizes and segments those objects without further contact. Both low-level and high-level visual features can also be learned in this way, and examples are presented for both: orientation detection and affordance recognition, respectively. The motivation for this work is simple. Training on large corpora of annotated real-world data has proven crucial for creating robust solutions to perceptual problems such as speech recognition and face detection. But the powerful tools used during training of such systems are typically stripped away at deployment. Ideally they should remain, particularly for unstable tasks such as object detection, where the set of objects needed in a task tomorrow might be different from the set of objects needed today. The key limiting factor is access to training data, but as this thesis shows, that need not be a problem on a robotic platform that can actively probe its environment, and carry out experiments to resolve ambiguity. This work is an instance of a general approach to learning a new perceptual judgment: find special situations in which the perceptual judgment is easy and study these situations to find correlated features that can be observed more generally.
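At its simplest, the poke-and-watch step reduces to differencing frames taken just before and just after contact; the Python sketch below (not the thesis's system) returns the changed pixels, which cover both the vacated and the newly occupied area of the nudged object.

    import numpy as np

    def segment_by_motion(frame_before, frame_after, thresh=0.1):
        """Binary object mask from the image change caused by poking."""
        diff = np.abs(frame_after.astype(float) - frame_before.astype(float))
        return diff > thresh

    before = np.zeros((5, 5)); before[1:3, 1:3] = 0.8    # object at rest
    after = np.zeros((5, 5));  after[1:3, 2:4] = 0.8     # object nudged one pixel right
    print(segment_by_motion(before, after).astype(int))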
Abstract:
We present a new method for rendering novel images of flexible 3D objects from a small number of example images in correspondence. The strength of the method is its ability to synthesize images whose viewing position lies far outside the viewing cone of the example images ("view extrapolation"), yet without ever modeling the 3D structure of the scene. The method relies on synthesizing a chain of "trilinear tensors" that governs the warping function from the example images to the novel image, together with a multi-dimensional interpolation function that synthesizes the non-rigid motions of the viewed object from the virtual camera position. We show that two closely spaced example images alone are sufficient in practice to synthesize a significant viewing cone, demonstrating that an object can be represented by a relatively small number of model images, for the purpose of cheap and fast viewers that can run on standard hardware.
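For orientation, the point-line-line relation that trilinear tensors satisfy in the general literature (stated as background, not transcribed from the paper): with p a point in the first example image, s any line through its match p' in the second, and r any line through the corresponding point p'' in the novel view,

    \[
      p^{i}\, s_{j}\, r_{k}\, \mathcal{T}_{i}^{\;jk} \;=\; 0 ,
    \]

so fixing p and p' and choosing independent lines through each match yields linear equations that locate p'' directly, which is what permits warping to the novel view without recovering 3D structure.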
Abstract:
We discuss a variety of object recognition experiments in which human subjects were presented with realistically rendered images of computer-generated three-dimensional objects, with tight control over stimulus shape, surface properties, illumination, and viewpoint, as well as subjects' prior exposure to the stimulus objects. In all experiments recognition performance was: (1) consistently viewpoint dependent; (2) only partially aided by binocular stereo and other depth information; (3) specific to viewpoints that were familiar; and (4) systematically disrupted by rotation in depth more than by deforming the two-dimensional images of the stimuli. These results are consistent with recently advanced computational theories of recognition based on view interpolation.
Abstract:
Stimuli outside classical receptive fields significantly influence the neurons' activities in primary visual cortex. We propose that such contextual influences are used to segment regions by detecting the breakdown of homogeneity or translation invariance in the input, thus computing global region boundaries using local interactions. This is implemented in a biologically based model of V1, and demonstrated in examples of texture segmentation and figure-ground segregation. By contrast with traditional approaches, segmentation occurs without classification or comparison of features within or between regions and is performed by exactly the same neural circuit responsible for the dual problem of the grouping and enhancement of contours.
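A toy abstraction of the homogeneity-breakdown idea in Python (a caricature, not the paper's V1 circuit): units suppressed by iso-orientation neighbors respond most strongly where the texture changes, so region boundaries emerge from purely local interactions.

    import numpy as np

    def responses(orientations, suppression=0.3):
        """orientations: 1D array of preferred-orientation labels across the image."""
        o = np.asarray(orientations)
        n = len(o)
        r = np.ones(n)
        for i in range(n):
            for j in ((i - 1) % n, (i + 1) % n):   # wraparound avoids array-edge artifacts
                if o[j] == o[i]:
                    r[i] -= suppression            # contextual iso-orientation suppression
        return r

    texture = [0] * 5 + [1] * 5                    # two homogeneous regions on a ring
    print(responses(texture))                      # peaks flank the two texture boundaries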