45 resultados para Robot Vision
Resumo:
The development of increasingly sophisticated and powerful computers in the last few decades has frequently stimulated comparisons between them and the human brain. Such comparisons will become more earnest as computers are applied more and more to tasks formerly associated with essentially human activities and capabilities. The expectation of a coming generation of "intelligent" computers and robots with sensory, motor and even "intellectual" skills comparable in quality to (and quantitatively surpassing) our own is becoming more widespread and is, I believe, leading to a new and potentially productive analytical science of "information processing". In no field has this new approach been so precisely formulated and so thoroughly exemplified as in the field of vision. As the dominant sensory modality of man, vision is one of the major keys to our mastery of the environment, to our understanding and control of the objects which surround us. If we wish to created robots capable of performing complex manipulative tasks in a changing environment, we must surely endow them with (among other things) adequate visual powers. How can we set about designing such flexible and adaptive robots? In designing them, can we make use of our rapidly growing knowledge of the human brain, and if so, how at the same time, can our experiences in designing artificial vision systems help us to understand how the brain analyzes visual information?
Resumo:
While navigating in an environment, a vision system has to be able to recognize where it is and what the main objects in the scene are. In this paper we present a context-based vision system for place and object recognition. The goal is to identify familiar locations (e.g., office 610, conference room 941, Main Street), to categorize new environments (office, corridor, street) and to use that information to provide contextual priors for object recognition (e.g., table, chair, car, computer). We present a low-dimensional global image representation that provides relevant information for place recognition and categorization, and how such contextual information introduces strong priors that simplify object recognition. We have trained the system to recognize over 60 locations (indoors and outdoors) and to suggest the presence and locations of more than 20 different object types. The algorithm has been integrated into a mobile system that provides real-time feedback to the user.
Resumo:
In this report, I discuss the use of vision to support concrete, everyday activity. I will argue that a variety of interesting tasks can be solved using simple and inexpensive vision systems. I will provide a number of working examples in the form of a state-of-the-art mobile robot, Polly, which uses vision to give primitive tours of the seventh floor of the MIT AI Laboratory. By current standards, the robot has a broad behavioral repertoire and is both simple and inexpensive (the complete robot was built for less than $20,000 using commercial board-level components). The approach I will use will be to treat the structure of the agent's activity---its task and environment---as positive resources for the vision system designer. By performing a careful analysis of task and environment, the designer can determine a broad space of mechanisms which can perform the desired activity. My principal thesis is that for a broad range of activities, the space of applicable mechanisms will be broad enough to include a number mechanisms which are simple and economical. The simplest mechanisms that solve a given problem will typically be quite specialized to that problem. One thus worries that building simple vision systems will be require a great deal of {it ad-hoc} engineering that cannot be transferred to other problems. My second thesis is that specialized systems can be analyzed and understood in a principled manner, one that allows general lessons to be extracted from specialized systems. I will present a general approach to analyzing specialization through the use of transformations that provably improve performance. By demonstrating a sequence of transformations that derive a specialized system from a more general one, we can summarize the specialization of the former in a compact form that makes explicit the additional assumptions that it makes about its environment. The summary can be used to predict the performance of the system in novel environments. Individual transformations can be recycled in the design of future systems.
Resumo:
This thesis presents methods for implementing robust hexpod locomotion on an autonomous robot with many sensors and actuators. The controller is based on the Subsumption Architecture and is fully distributed over approximately 1500 simple, concurrent processes. The robot, Hannibal, weighs approximately 6 pounds and is equipped with over 100 physical sensors, 19 degrees of freedom, and 8 on board computers. We investigate the following topics in depth: distributed control of a complex robot, insect-inspired locomotion control for gait generation and rough terrain mobility, and fault tolerance. The controller was implemented, debugged, and tested on Hannibal. Through a series of experiments, we examined Hannibal's gait generation, rough terrain locomotion, and fault tolerance performance. These results demonstrate that Hannibal exhibits robust, flexible, real-time locomotion over a variety of terrain and tolerates a multitude of hardware failures.
Resumo:
We address mid-level vision for the recognition of non-rigid objects. We align model and image using frame curves - which are object or "figure/ground" skeletons. Frame curves are computed, without discontinuities, using Curved Inertia Frames, a provably global scheme implemented on the Connection Machine, based on: non-cartisean networks; a definition of curved axis of inertia; and a ridge detector. I present evidence against frame alignment in human perception. This suggests: frame curves have a role in figure/ground segregation and in fuzzy boundaries; their outside/near/top/ incoming regions are more salient; and that perception begins by setting a reference frame (prior to early vision), and proceeds by processing convex structures.
Resumo:
This thesis describes Sonja, a system which uses instructions in the course of visually-guided activity. The thesis explores an integration of research in vision, activity, and natural language pragmatics. Sonja's visual system demonstrates the use of several intermediate visual processes, particularly visual search and routines, previously proposed on psychophysical grounds. The computations Sonja performs are compatible with the constraints imposed by neuroscientifically plausible hardware. Although Sonja can operate autonomously, it can also make flexible use of instructions provided by a human advisor. The system grounds its understanding of these instructions in perception and action.
Resumo:
The goal of this work is to navigate through an office environmentsusing only visual information gathered from four cameras placed onboard a mobile robot. The method is insensitive to physical changes within the room it is inspecting, such as moving objects. Forward and rotational motion vision are used to find doors and rooms, and these can be used to build topological maps. The map is built without the use of odometry or trajectory integration. The long term goal of the project described here is for the robot to build simple maps of its environment and to localize itself within this framework.
Resumo:
Robots must plan and execute tasks in the presence of uncertainty. Uncertainty arises from sensing errors, control errors, and uncertainty in the geometry of the environment. The last, which is called model error, has received little previous attention. We present a framework for computing motion strategies that are guaranteed to succeed in the presence of all three kinds of uncertainty. The motion strategies comprise sensor-based gross motions, compliant motions, and simple pushing motions.
Resumo:
Automated assembly of mechanical devices is studies by researching methods of operating assembly equipment in a variable manner; that is, systems which may be configured to perform many different assembly operations are studied. The general parts assembly operation involves the removal of alignment errors within some tolerance and without damaging the parts. Two methods for eliminating alignment errors are discussed: a priori suppression and measurement and removal. Both methods are studied with the more novel measurement and removal technique being studied in greater detail. During the study of this technique, a fast and accurate six degree-of-freedom position sensor based on a light-stripe vision technique was developed. Specifications for the sensor were derived from an assembly-system error analysis. Studies on extracting accurate information from the sensor by optimally reducing redundant information, filtering quantization noise, and careful calibration procedures were performed. Prototype assembly systems for both error elimination techniques were implemented and used to assemble several products. The assembly system based on the a priori suppression technique uses a number of mechanical assembly tools and software systems which extend the capabilities of industrial robots. The need for the tools was determined through an assembly task analysis of several consumer and automotive products. The assembly system based on the measurement and removal technique used the six degree-of-freedom position sensor to measure part misalignments. Robot commands for aligning the parts were automatically calculated based on the sensor data and executed.
Resumo:
Redundant sensors are needed on a mobile robot so that the accuracy with which it perceives its surroundings can be increased. Sonar and infrared sensors are used here in tandem, each compensating for deficiencies in the other. The robot combines the data from both sensors to build a representation which is more accurate than if either sensor were used alone. Another representation, the curvature primal sketch, is extracted from this perceived workspace and is used as the input to two path planning programs: one based on configuration space and one based on a generalized cone formulation of free space.
Resumo:
Planner is a formalism for proving theorems and manipulating models in a robot. The formalism is built out of a number of problem-solving primitives together with a hierarchical multiprocess backtrack control structure. Statements can be asserted and perhaps later withdrawn as the state of the world changes. Under BACKTRACK control structure, the hierarchy of activations of functions previously executed is maintained so that it is possible to revert to any previous state. Thus programs can easily manipulate elaborate hypothetical tentative states. In addition PLANNER uses multiprocessing so that there can be multiple loci of changes in state. Goals can be established and dismissed when they are satisfied. The deductive system of PLANNER is subordinate to the hierarchical control structure in order to maintain the desired degree of control. The use of a general-purpose matching language as the basis of the deductive system increases the flexibility of the system. Instead of explicitly naming procedures in calls, procedures can be invoked implicitly by patterns of what the procedure is supposed to accomplish. The language is being applied to solve problems faced by a robot, to write special purpose routines from goal oriented language, to express and prove properties of procedures, to abstract procedures from protocols of their actions, and as a semantic base for English.
Resumo:
The Vision Flashes are informal working papers intended primarily to stimulate internal interaction among participants in the A.I. Laboratory's Vision and Robotics group. Many of them report highly tentative conclusions or incomplete work. Others deal with highly detailed accounts of local equipment and programs that lack general interest. Still others are of great importance, but lack the polish and elaborate attention to proper referencing that characterizes the more formal literature. Nevertheless, the Vision Flashes collectively represent the only documentation of an important fraction of the work done in machine vision and robotics. The purpose of this report is to make the findings more readily available, but since they are not revised as presented here, readers should keep in mind the original purpose of the papers!
Resumo:
This paper describes BUILD, a computer program which generates plans for building specified structures out of simple objects such as toy blocks. A powerful heuristic control structure enables BUILD to use a number of sophisticated construction techniques in its plans. Among these are the incorporation of pre-existing structure into the final design, pre-assembly of movable sub-structures on the table, and use of the extra blocks as temporary supports and counterweights in the course of construction. BUILD does its planning in a modeled 3-space in which blocks of various shapes and sizes can be represented in any orientation and location. The modeling system can maintain several world models at once, and contains modules for displaying states, testing them for inter-object contact and collision, and for checking the stability of complex structures involving frictional forces. Various alternative approaches are discussed, and suggestions are included for the extension of BUILD-like systems to other domains. Also discussed are the merits of BUILD's implementation language, CONNIVER, for this type of problem solving.
Resumo:
Humans can effortlessly manipulate objects in their hands, dexterously sliding and twisting them within their grasp. Robots, however, have none of these capabilities, they simply grasp objects rigidly in their end effectors. To investigate this common form of human manipulation, an analysis of controlled slipping of a grasped object within a robot hand was performed. The Salisbury robot hand demonstrated many of these controlled slipping techniques, illustrating many results of this analysis. First, the possible slipping motions were found as a function of the location, orientation, and types of contact between the hand and object. Second, for a given grasp, the contact types were determined as a function of the grasping force and the external forces on the object. Finally, by changing the grasping force, the robot modified the constraints on the object and affect controlled slipping slipping motions.
Resumo:
Methods are developed for predicting vibration response characteristics of systems which change configuration during operation. A cartesian robot, an example of such a position-dependent system, served as a test case for these methods and was studied in detail. The chosen system model was formulated using the technique of Component Mode Synthesis (CMS). The model assumes that he system is slowly varying, and connects the carriages to each other and to the robot structure at the slowly varying connection points. The modal data required for each component is obtained experimentally in order to get a realistic model. The analysis results in prediction of vibrations that are produced by the inertia forces as well as gravity and friction forces which arise when the robot carriages move with some prescribed motion. Computer simulations and experimental determinations are conducted in order to calculate the vibrations at the robot end-effector. Comparisons are shown to validate the model in two ways: for fixed configuration the mode shapes and natural frequencies are examined, and then for changing configuration the residual vibration at the end of the mode is evaluated. A preliminary study was done on a geometrically nonlinear system which also has position-dependency. The system consisted of a flexible four-bar linkage with elastic input and output shafts. The behavior of the rocker-beam is analyzed for different boundary conditions to show how some limiting cases are obtained. A dimensional analysis leads to an evaluation of the consequences of dynamic similarity on the resulting vibration.