6 resultados para Visual Information

em Massachusetts Institute of Technology


Relevância:

70.00% 70.00%

Publicador:

Resumo:

The goal of this work is to navigate through an office environmentsusing only visual information gathered from four cameras placed onboard a mobile robot. The method is insensitive to physical changes within the room it is inspecting, such as moving objects. Forward and rotational motion vision are used to find doors and rooms, and these can be used to build topological maps. The map is built without the use of odometry or trajectory integration. The long term goal of the project described here is for the robot to build simple maps of its environment and to localize itself within this framework.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Notions of figure-ground, inside-outside are difficult to define in a computational sense, yet seem intuitively meaningful. We propose that "figure" is an attention-directed region of visual information processing, and has a non-discrete boundary. Associated with "figure" is a coordinate frame and a "frame curve" which helps initiate the shape recognition process by selecting and grouping convex image chunks for later matching- to-model. We show that human perception is biased to see chunks outside the frame as more salient than those inside. Specific tasks, however, can reverse this bias. Near/far, top/bottom and expansion/contraction also behave similarly.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The development of increasingly sophisticated and powerful computers in the last few decades has frequently stimulated comparisons between them and the human brain. Such comparisons will become more earnest as computers are applied more and more to tasks formerly associated with essentially human activities and capabilities. The expectation of a coming generation of "intelligent" computers and robots with sensory, motor and even "intellectual" skills comparable in quality to (and quantitatively surpassing) our own is becoming more widespread and is, I believe, leading to a new and potentially productive analytical science of "information processing". In no field has this new approach been so precisely formulated and so thoroughly exemplified as in the field of vision. As the dominant sensory modality of man, vision is one of the major keys to our mastery of the environment, to our understanding and control of the objects which surround us. If we wish to created robots capable of performing complex manipulative tasks in a changing environment, we must surely endow them with (among other things) adequate visual powers. How can we set about designing such flexible and adaptive robots? In designing them, can we make use of our rapidly growing knowledge of the human brain, and if so, how at the same time, can our experiences in designing artificial vision systems help us to understand how the brain analyzes visual information?

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Integration of inputs by cortical neurons provides the basis for the complex information processing performed in the cerebral cortex. Here, we propose a new analytic framework for understanding integration within cortical neuronal receptive fields. Based on the synaptic organization of cortex, we argue that neuronal integration is a systems--level process better studied in terms of local cortical circuitry than at the level of single neurons, and we present a method for constructing self-contained modules which capture (nonlinear) local circuit interactions. In this framework, receptive field elements naturally have dual (rather than the traditional unitary influence since they drive both excitatory and inhibitory cortical neurons. This vector-based analysis, in contrast to scalarsapproaches, greatly simplifies integration by permitting linear summation of inputs from both "classical" and "extraclassical" receptive field regions. We illustrate this by explaining two complex visual cortical phenomena, which are incompatible with scalar notions of neuronal integration.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A method is presented for the visual analysis of objects by computer. It is particularly well suited for opaque objects with smoothly curved surfaces. The method extracts information about the object's surface properties, including measures of its specularity, texture, and regularity. It also aids in determining the object's shape. The application of this method to a simple recognition task ??e recognition of fruit ?? discussed. The results on a more complex smoothly curved object, a human face, are also considered.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present MikeTalk, a text-to-audiovisual speech synthesizer which converts input text into an audiovisual speech stream. MikeTalk is built using visemes, which are a small set of images spanning a large range of mouth shapes. The visemes are acquired from a recorded visual corpus of a human subject which is specifically designed to elicit one instantiation of each viseme. Using optical flow methods, correspondence from every viseme to every other viseme is computed automatically. By morphing along this correspondence, a smooth transition between viseme images may be generated. A complete visual utterance is constructed by concatenating viseme transitions. Finally, phoneme and timing information extracted from a text-to-speech synthesizer is exploited to determine which viseme transitions to use, and the rate at which the morphing process should occur. In this manner, we are able to synchronize the visual speech stream with the audio speech stream, and hence give the impression of a photorealistic talking face.