Abstract:
One important current research question in spatial updating concerns the roles of external (e.g., visual) and internal (e.g., proprioceptive or vestibular) information in the spatial updating of scene recognition. Our study adopts the classic spatial updating paradigm and the experimental design of Burgess (2004). First, we explore the concrete influence of locomotion on scene recognition in the real world. Next, we use virtual reality technology, which allows control over many spatial learning parameters and excludes extraneous irrelevant variables, to explore the influence of pure locomotion without visual cues on scene recognition. Furthermore, we examine whether the ability of spatial updating can be transferred to new situations within a short period of time, and we compare the pattern of results in the real world with that in virtual reality to test the validity of virtual reality technology for research on the spatial updating of scene recognition. The main results can be summarized as follows: 1. In the real world, we found two effects: a spatial updating effect and a viewpoint-dependent effect, indicating that locomotion-based spatial updating does not eliminate the viewpoint-dependent effect during scene recognition in a physical environment. 2. In the virtual reality environment, we again found both effects, showing that locomotion-based spatial updating does not eliminate the viewpoint-dependent effect during scene recognition in virtual reality either. 3. Locomotion-based spatial updating plays a double role in scene recognition: when subjects were tested from a different viewpoint, it promoted scene recognition, whereas when subjects were tested from the same viewpoint, it had a negative influence on scene recognition. These results show that locomotion-based spatial updating is automatic and cannot be ignored. 4. The ability of spatial updating transferred to new situations within a short period of time, and the experiment in the immersive virtual reality environment yielded the same pattern of results as the experiment in the physical environment, suggesting that VR technology is an effective method for research on the spatial updating of scene recognition. 5. This study of scene recognition provides evidence for the dual-system model of spatial updating in an immersive virtual reality environment.
Abstract:
Six experiments tested how the headings of objects in scenes influence the construction of an intrinsic frame of reference under different scene-structure and number-of-viewpoint conditions. In Experiments 1 and 2, participants stood at 0 degrees and separately learned an asymmetrical scene and a symmetrical scene composed of balls with no apparent headings. In Experiments 3, 4, 5, and 6, toys with apparent headings were used, all facing the 315-degree direction of the scene. In Experiments 3 and 4, participants stood at 0 degrees and separately learned an asymmetrical scene and a symmetrical scene composed of the toys. In Experiments 5 and 6, participants stood at 0 and 315 degrees and separately learned an asymmetrical scene and a symmetrical scene composed of the toys. After learning, participants completed triplet recognition tasks in all experiments. The dependent measures were response latency and accuracy. Correct response latencies to the targets were analyzed by ANOVA; accuracy was used to filter the data and, in some experiments, was also analyzed by ANOVA as a reference. The results indicate that the headings of objects in scenes influence the pattern of the intrinsic frame of reference. The structure of the scene affects the mechanism by which heading acts, but the number of viewpoints does not. If the objects in a scene have no apparent headings, a viewpoint-dependent effect emerges, along with an advantage of the symmetry axis as the intrinsic axis in triplet recognition. If the objects have apparent headings, people's spatial memory pattern is affected by the objects' headings. If the objects' heading (315 degrees) is not parallel to the viewpoint (0 degrees) in an asymmetrical scene, people tend to represent the scene from the objects' heading rather than from the viewpoint; as a result, the viewpoint-dependent effect disappears and there is a significant advantage for triplets presented from the objects' heading. If the objects' heading is not parallel to the symmetry axis in a symmetrical scene, people represent the scene not only according to the symmetry axis as the intrinsic axis but also according to the objects' heading; as a result, the significant advantage for the symmetry axis as intrinsic axis in triplet recognition disappears, although a tendency remains. Moreover, the effect of the objects' headings is stronger in asymmetrical scenes than in symmetrical scenes.
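The latency analysis described here follows a standard ANOVA workflow. As a purely hypothetical illustration (the grouping factor, sample sizes, and latency values below are invented, not the experiments' data), a one-way ANOVA over correct response latencies might look like this in Python:

```python
# Hypothetical sketch of a latency analysis: one-way ANOVA over correct
# response latencies grouped by probe direction. All values are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated correct-response latencies (ms) for triplets presented from
# the learning viewpoint (0 deg), the objects' heading (315 deg), and a
# novel direction.
lat_0 = rng.normal(1450, 180, size=24)
lat_315 = rng.normal(1320, 180, size=24)
lat_novel = rng.normal(1580, 180, size=24)

f_stat, p_value = stats.f_oneway(lat_0, lat_315, lat_novel)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```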
Abstract:
Color has an unresolved role in the rapid processing of natural scenes. Temporal changes in the color effect might partly account for the debate; moreover, the distinction between localized and unlocalized information has not been addressed directly in previous color studies. Here we present two experiments that investigate whether color contributes to the categorization of briefly flashed natural images, and whether its contribution is mediated by time and by low-level information. By controlling the interval between target and mask stimuli, Experiment 1 tested the hypothesis that color facilitates the early stage of scene perception and that the effect decays in later processing. Experiment 2 examined how randomizing local phase information influences color's advantage over grayscale. Together, the results suggest that color does enhance natural scene categorization at short exposure times. Furthermore, the effect of color is stable between 12 and 120 ms and is not accounted for by structures organized by localized information. We therefore conclude that color contributes throughout rapid scene categorization and does not depend on localized information. The present study thus attempts to fill a gap in previous research, and its results contribute to a deeper understanding of the role of color in natural scene perception.
Abstract:
The interface has become an increasingly significant element that greatly influences the development of online shopping, yet in practice it has received inadequate attention from society and from research. Under these circumstances, this study focuses on improving our understanding of the engineering-psychology factors that will play a crucial role in the representation of online shopping systems, and of the relations between those factors, through the following experimental research. I hope it can serve as a basic reference for the practical application of online-shopping representation patterns and for continued study. In this thesis, an analysis was made on the basis of engineering-psychology principles from three aspects: the person (users), the task, and the information environment. It was argued that the system overview and the information-behavior model greatly affect users' activities on the web, that the representation pattern of an information system affects the formation of the system overview and the behavior pattern, and that these in turn affect users' performance in the information system. On this basis, a three-dimensional conceptual model was presented that relates the crucial factors: media representation pattern, system hierarchy, and the objects in an information unit. Eight hypotheses about the engineering-psychology factors of virtual reality (VR) representation in online shopping systems were then put forward, and four experiments were conducted to test them.
- Experiment 1 studied how three kinds of single-media representation pattern influence the formation of the system overview and of information behavior, in terms of task performance, operating errors, overall satisfaction, mental workload, and related measures.
- Experiment 2 studied how combined-media representation patterns across the system hierarchy influence users' behavior.
- Experiment 3 studied the hierarchical structure of the VR representation pattern and how its width and depth affect system behavior.
- Experiment 4 studied the spatial relations between different objects in a VR scene (information unit).
The results are as follows.
- Structural dimension: increasing width harmed users' speed more than increasing depth in the VR representation pattern. Although subjects performed quite slowly in the wider environment, their error rate there was the lowest.
- Hierarchy representation pattern: 1. Among the three media representation patterns, no significant differences were found in speed of task completion, error rate, satisfaction, or mental workload, but the figure-aided pattern produced the worst results on all of these measures. 2. In the early stage of the task and at the first level of the hierarchy, subjects performed more slowly in the VR pattern than in the text pattern; as the task progressed to deeper levels of the hierarchy, performance in the VR representation pattern became the fastest. 3. The VR representation pattern outperformed the text pattern at higher levels of the system. The representation pattern at the highest level had the greatest impact on system behavior, whereas using VR representation only in the middle of the hierarchy produced the worst results. 4. Activity errors were more frequent in single-media representation patterns than in combined-media representation patterns. 5. Individual differences among subjects affected the representation pattern of the system: in the VR environment, Type A behavior tendency had a significant negative correlation with the number of errors.
- VR-scene representation: physical distance and flashing strongly influenced subjects' task performance, while psychological distance had no notable impact. Subjects' accuracy increased when objects with the same relation occupied the same structural position, when psychological distance was close, or when the target object flashed (though this last result was not reliable).
Although this thesis limits its topic to existing questions and analysis of online shopping, it can also apply to other relevant purposes on the web. The study addresses only search tasks with a definite goal, without considering other task conditions or their relations with other navigation tools. I hope it lays a good foundation for continued research in this area.
Abstract:
We present a model for recovering the direction of heading of an observer who is moving relative to a scene that may contain self-moving objects. The model builds upon an algorithm proposed by Rieger and Lawton (1985), which is based on earlier work by Longuet-Higgins and Prazdny (1981). The algorithm uses velocity differences computed in regions of high depth variation to estimate the location of the focus of expansion, which indicates the observer's heading direction. We relate the behavior of the proposed model to psychophysical observations regarding the ability of human observers to judge their heading direction, and show how the model can cope with self-moving objects in the environment. We also discuss this model in the broader context of a navigational system that performs tasks requiring rapid sensing and response through the interaction of simple task-specific routines.
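To make the Rieger-Lawton step concrete: flow differences taken across depth discontinuities are, ideally, parallel to the purely translational field, so each difference vector defines a line through its image location that passes through the focus of expansion (FOE). A minimal sketch, assuming a synthetic flow field and illustrative names (not the paper's implementation), recovers the FOE as the least-squares intersection of those lines:

```python
# Sketch: each flow difference d_i at image location p_i defines the line
# through p_i with direction d_i; the FOE is their least-squares intersection.
import numpy as np

def estimate_foe(points, flow_diffs):
    """points: (N,2) image locations; flow_diffs: (N,2) local flow differences.
    Returns the point minimizing squared distance to all constraint lines."""
    # Normal of each constraint line (perpendicular to the difference vector).
    n = np.stack([-flow_diffs[:, 1], flow_diffs[:, 0]], axis=1)
    n /= np.linalg.norm(n, axis=1, keepdims=True)
    # Each line: n_i . x = n_i . p_i  ->  stack into A x = b and solve.
    A, b = n, np.sum(n * points, axis=1)
    foe, *_ = np.linalg.lstsq(A, b, rcond=None)
    return foe

# Synthetic check: radial field about a known FOE, plus small noise.
rng = np.random.default_rng(1)
true_foe = np.array([40.0, -25.0])
pts = rng.uniform(-100, 100, size=(50, 2))
diffs = (pts - true_foe) + rng.normal(0, 0.5, size=(50, 2))
print(estimate_foe(pts, diffs))  # close to [40, -25]
```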
Abstract:
A method for localization and positioning in an indoor environment is presented. The method is based on representing the scene as a set of 2D views and predicting the appearance of novel views by linear combinations of the model views. The method is accurate under weak perspective projection. Analysis of this projection, together with experimental results, demonstrates that in many cases this approximation is sufficient to describe the scene accurately. When the weak perspective approximation is invalid, an iterative solution can be employed to account for the perspective distortions. A simple algorithm for repositioning, the task of returning to a previously visited position defined by a single view, is derived from this method.
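As a rough illustration of the view-combination idea (a sketch under the weak perspective assumption, with invented variable names, not the paper's algorithm): given matched point coordinates in two stored model views and a few anchor matches in a novel view, the combination coefficients can be fit by least squares and then used to predict every remaining point.

```python
# Sketch: under weak perspective, each coordinate in a novel view is a
# linear function of matched coordinates in two model views (plus a bias).
import numpy as np

def fit_view_combination(view1, view2, novel_anchor, anchor_idx):
    """view1, view2: (N,2) point coordinates in the two model views.
    novel_anchor: (K,2) coordinates of K anchor points in the novel view
    (K >= 5), indexed into the model views by anchor_idx."""
    # Basis per point: [x1, y1, x2, y2, 1].
    basis = np.column_stack([view1, view2, np.ones(len(view1))])
    coeffs, *_ = np.linalg.lstsq(basis[anchor_idx], novel_anchor, rcond=None)
    return basis @ coeffs  # predicted (N,2) novel-view coordinates
```

Comparing the predicted coordinates with what the camera actually observes then scores how well a candidate position explains the current view.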
Abstract:
In the general case, a trilinear relationship between three perspective views is shown to exist. The trilinearity result is shown to be of much practical use in visual recognition by alignment --- yielding a direct method that cuts through the computations of camera transformation, scene structure and epipolar geometry. The proof of the central result may be of further interest as it demonstrates certain regularities across homographies of the plane and introduces new view invariants. Experiments on simulated and real image data were conducted, including a comparative analysis with epipolar intersection and the linear combination methods, with results indicating a greater degree of robustness in practice and a higher level of performance in re-projection tasks.
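One standard way to make the trilinearity operational (a generic tensor-fitting sketch, not necessarily the paper's direct method) uses the incidence relation skew(x2) (Σᵢ x1ᵢ Tᵢ) skew(x3) = 0 for each point correspondence across the three views: every correspondence contributes linear constraints on the 27 trilinear coefficients, and the tensor is recovered as the null vector of the stacked system. All names and the probing construction below are illustrative.

```python
# Sketch: fit the 27 trilinear coefficients from >= 7 three-view
# point correspondences, given in homogeneous coordinates.
import numpy as np

def skew(v):
    """Cross-product matrix: skew(v) @ u == np.cross(v, u)."""
    return np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0.0]])

def fit_trifocal(x1, x2, x3):
    """x1, x2, x3: (N,3) homogeneous points in views 1-3, N >= 7."""
    rows = []
    for p, q, r in zip(x1, x2, x3):
        Sq, Sr = skew(q), skew(r)
        # The 3x3 constraint matrix is linear in the tensor entries:
        # probe it with each of the 27 basis tensors to build one column.
        cols = []
        for k in range(27):
            t = np.zeros(27); t[k] = 1.0
            M = np.tensordot(p, t.reshape(3, 3, 3), axes=1)  # sum_i p[i]*T[i]
            cols.append((Sq @ M @ Sr).ravel())
        rows.append(np.array(cols).T)  # (9, 27) block per correspondence
    _, _, Vt = np.linalg.svd(np.vstack(rows))
    return Vt[-1].reshape(3, 3, 3)  # null vector = tensor, up to scale
```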
Abstract:
Visibility constraints can aid the segmentation of foreground objects observed with multiple range images. In our approach, points are defined as foreground if they can be determined to occlude some empty space in the scene. We present an efficient algorithm to estimate foreground points in each range view using explicit epipolar search. In cases where the background pattern is stationary, we show how visibility constraints from other views can generate virtual background values at points with no valid depth in the primary view. We demonstrate the performance of both algorithms for detecting people in indoor office environments.
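A simplified sketch of the occlusion test (assuming fully calibrated range cameras, and replacing the paper's explicit epipolar search with direct projection; the names and depth margin are illustrative assumptions): a 3D point from the primary view is foreground when another range view saw past it, i.e. the depth recorded along that ray is farther than the point itself.

```python
# Sketch: mark points as foreground if they occlude space another
# range view observed to be empty (deeper than the point).
import numpy as np

def foreground_mask(points, K, R, t, other_depth, margin=0.05):
    """points: (N,3) world points from the primary view.
    K, R, t: intrinsics and pose of another range camera.
    other_depth: (H,W) depth image from that camera."""
    cam = R @ points.T + t[:, None]      # points in the other camera frame
    z = cam[2]
    uv = (K @ cam)[:2] / z               # pixel coordinates
    u, v = np.round(uv).astype(int)
    H, W = other_depth.shape
    ok = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    fg = np.zeros(len(points), dtype=bool)
    # Foreground: the other view recorded geometry clearly behind the point.
    fg[ok] = other_depth[v[ok], u[ok]] > z[ok] + margin
    return fg
```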
Abstract:
We present a unifying framework in which "object-independent" modes of variation are learned from continuous-time data such as video sequences. These modes of variation can be used as "generators" to produce a manifold of images of a new object from a single example of that object. We develop the framework in the context of a well-known example: analyzing the modes of spatial deformations of a scene under camera movement. Our method learns a close approximation to the standard affine deformations that are expected from the geometry of the situation, and does so in a completely unsupervised (i.e. ignorant of the geometry of the situation) fashion. We stress that it is learning a "parameterization", not just the parameter values, of the data. We then demonstrate how we have used the same framework to derive a novel data-driven model of joint color change in images due to common lighting variations. The model is superior to previous models of color change in describing non-linear color changes due to lighting.
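As a toy stand-in for the framework (PCA over successive-frame differences rather than the paper's actual learning procedure; all names are assumptions), one can extract dominant modes of variation from a video and then move a single new image along those modes to generate nearby images on its manifold:

```python
# Sketch: learn "modes of variation" from video as the principal
# components of successive-frame differences, then apply them as
# generators to a new single image.
import numpy as np

def learn_modes(frames, n_modes=6):
    """frames: (T,H,W) video; returns (n_modes, H*W) difference modes."""
    X = frames.reshape(len(frames), -1).astype(float)
    D = np.diff(X, axis=0)               # successive-frame changes
    D -= D.mean(axis=0)
    _, _, Vt = np.linalg.svd(D, full_matrices=False)
    return Vt[:n_modes]

def generate(image, modes, coeffs):
    """Move a single new image along the learned modes."""
    return image.ravel() + coeffs @ modes
```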
Abstract:
Passive monitoring of large sites typically requires coordination between multiple cameras, which in turn requires methods for automatically relating events between distributed cameras. This paper tackles the problem of self-calibration of multiple cameras which are very far apart, using feature correspondences to determine the camera geometry. The key problem is finding such correspondences. Since the camera geometry and photometric characteristics vary considerably between images, one cannot use brightness and/or proximity constraints. Instead we apply planar geometric constraints to moving objects in the scene in order to align the scene's ground plane across multiple views. We do not assume synchronized cameras, and we show that enforcing geometric constraints enables us to align the tracking data in time. Once we have recovered the homography which aligns the planar structure in the scene, we can compute from the homography matrix the 3D position of the plane and the relative camera positions. This in turn enables us to recover a homography matrix which maps the images to an overhead view. We demonstrate this technique in two settings: a controlled lab setting where we test the effects of errors in internal camera calibration, and an uncontrolled, outdoor setting in which the full procedure is applied to external camera calibration and ground plane recovery. In spite of noise in the internal camera parameters and image data, the system successfully recovers both planar structure and relative camera positions in both settings.
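The homography-recovery step can be sketched with the standard direct linear transform (DLT); this is a generic estimator over ground-plane correspondences accumulated from tracked objects, with input normalization and robust RANSAC fitting omitted, not the paper's full pipeline:

```python
# Sketch: estimate the 3x3 homography aligning the ground plane in two
# views from matched plane points, via the standard DLT.
import numpy as np

def homography_dlt(src, dst):
    """src, dst: (N,2) matched ground-plane points, N >= 4."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(rows, float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]
```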
Abstract:
In low-level vision, the representations of scene properties such as shape and albedo are very high dimensional, as they have to describe complicated structures. The approach proposed here is to let the image itself bear as much of the representational burden as possible. In many situations, scene and image are closely related, and it is possible to find a functional relationship between them. The scene information can then be represented in reference to the image, where the functional specifies how to translate the image into the associated scene. We illustrate the use of this representation for encoding shape information, and show that it has appealing properties such as locality and slow variation across space and scale. These properties provide a way of improving shape estimates coming from other sources of information, such as stereo.
Abstract:
While navigating in an environment, a vision system has to be able to recognize where it is and what the main objects in the scene are. In this paper we present a context-based vision system for place and object recognition. The goal is to identify familiar locations (e.g., office 610, conference room 941, Main Street), to categorize new environments (office, corridor, street) and to use that information to provide contextual priors for object recognition (e.g., table, chair, car, computer). We present a low-dimensional global image representation that provides relevant information for place recognition and categorization, and how such contextual information introduces strong priors that simplify object recognition. We have trained the system to recognize over 60 locations (indoors and outdoors) and to suggest the presence and locations of more than 20 different object types. The algorithm has been integrated into a mobile system that provides real-time feedback to the user.
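As a loose illustration of a low-dimensional global descriptor (a "gist"-like sketch with an assumed grid size, two crude orientation bands, and a nearest-neighbor classifier; the system's actual features differ):

```python
# Sketch: a global image descriptor from coarse-grid gradient-energy
# statistics, plus nearest-neighbor place classification.
import numpy as np

def global_descriptor(img, grid=4):
    """img: (H,W) grayscale array. Returns a grid*grid*2 feature vector."""
    gy, gx = np.gradient(img.astype(float))
    H, W = img.shape
    feats = []
    for channel in (np.abs(gx), np.abs(gy)):   # two orientation bands
        for i in range(grid):
            for j in range(grid):
                cell = channel[i*H//grid:(i+1)*H//grid,
                               j*W//grid:(j+1)*W//grid]
                feats.append(cell.mean())
    return np.array(feats)

def classify_place(img, train_feats, train_labels):
    """Nearest neighbor over stored (M,F) descriptors of labeled places."""
    d = np.linalg.norm(train_feats - global_descriptor(img), axis=1)
    return train_labels[int(np.argmin(d))]
```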
Abstract:
This article describes a model for including scene/context priors in attention guidance. In the proposed scheme, visual context information is available early in the visual processing chain, where it modulates the saliency of image regions and provides an efficient shortcut for object detection and recognition. The scene is represented by means of a low-dimensional global description obtained from low-level features. The global scene features are then used to predict the probability of presence of the target object in the scene, as well as its location and scale, before the image is explored.
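A toy sketch of the contextual modulation (the Gaussian height prior and its parameters are invented for illustration; in the model described above the prior is predicted from global scene features):

```python
# Sketch: combine a bottom-up saliency map with a location prior so
# attention concentrates where the target is contextually likely.
import numpy as np

def contextual_saliency(saliency, prior_mean_y, prior_sigma):
    """saliency: (H,W) non-negative bottom-up map. The scene context is
    assumed to predict a preferred image height (e.g. cars near the road)."""
    H, W = saliency.shape
    ys = np.arange(H)[:, None]
    prior = np.exp(-0.5 * ((ys - prior_mean_y) / prior_sigma) ** 2)
    out = saliency * prior               # broadcast prior over columns
    return out / out.max()
```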
Abstract:
Three-dimensional models which contain both geometry and texture have numerous applications such as urban planning, physical simulation, and virtual environments. A major focus of computer vision (and recently graphics) research is the automatic recovery of three-dimensional models from two-dimensional images; after many years of research, this goal is yet to be achieved. Most practical modeling systems require substantial human input and, unlike automatic systems, are not scalable. This thesis presents a novel method for automatically recovering dense surface patches using large sets (thousands) of calibrated images taken from arbitrary positions within the scene. Physical instruments, such as the Global Positioning System (GPS), inertial sensors, and inclinometers, are used to estimate the position and orientation of each image. Essentially, the problem is to find corresponding points in each of the images. Once a correspondence has been established, calculating its three-dimensional position is simply a matter of geometry. Long-baseline images improve the accuracy, while short-baseline images and the large number of images greatly simplify the correspondence problem. The initial stage of the algorithm is completely local and scales linearly with the number of images. Subsequent stages are global in nature, exploit geometric constraints, and scale quadratically with the complexity of the underlying scene. We describe techniques for: 1) detecting and localizing surface patches; 2) refining camera calibration estimates and rejecting false-positive surfels; and 3) grouping surface patches into surfaces and growing the surface along a two-dimensional manifold. We also discuss a method for producing high-quality, textured three-dimensional models from these surfaces. Some of the most important characteristics of this approach are that it: 1) uses and refines noisy calibration estimates; 2) compensates for large variations in illumination; 3) tolerates significant soft occlusion (e.g. tree branches); and 4) associates, at a fundamental level, an estimated normal (i.e. no frontal-planar assumption) and texture with each surface patch.
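The geometric core noted above ("once a correspondence has been established, calculating its three-dimensional position is simply a matter of geometry") reduces to linear triangulation. A minimal two-view sketch, assuming 3x4 projection matrices P1 and P2 (the thesis's full multi-view pipeline is not reproduced here):

```python
# Sketch: standard DLT triangulation of one correspondence from two
# calibrated views with projection matrices P1, P2.
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """uv1, uv2: pixel coordinates of one correspondence in two images."""
    A = np.array([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]                  # inhomogeneous 3D point
```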
Abstract:
This research project is a study of the role of fixation and visual attention in object recognition. In this project, we build an active vision system which can recognize a target object in a cluttered scene efficiently and reliably. Our system integrates visual cues like color and stereo to perform figure/ground separation, yielding candidate regions on which to focus attention. Within each image region, we use stereo to extract features that lie within a narrow disparity range about the fixation position. These selected features are then used as input to an alignment-style recognition system. We show that visual attention and fixation significantly reduce the complexity and the false identifications in model-based recognition using Alignment methods. We also demonstrate that stereo can be used effectively as a figure/ground separator without the need for accurate camera calibration.
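The disparity-band selection step can be sketched in a few lines (the names and band width are assumptions, not the system's parameters): after fixating a candidate region, only features whose stereo disparity lies in a narrow band about the fixation disparity are passed to the recognizer, so it sees mostly figure rather than ground.

```python
# Sketch: keep only features near the fixation disparity before
# handing them to an alignment-style recognizer.
import numpy as np

def select_fixated_features(features, disparities, fixation_disparity,
                            band=2.0):
    """features: (N,D) feature array; disparities: (N,) per-feature
    disparity in pixels; band: half-width of the accepted range."""
    keep = np.abs(disparities - fixation_disparity) <= band
    return features[keep]

# Usage: selected = select_fixated_features(feats, disps, d_fix, band=2.0)
```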