501 resultados para PASCAL Visual Object Classes (VOC)


Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper is concerned with the unsupervised learning of object representations by fusing visual and motor information. The problem is posed for a mobile robot that develops its representations as it incrementally gathers data. The scenario is problematic as the robot only has limited information at each time step with which it must generate and update its representations. Object representations are refined as multiple instances of sensory data are presented; however, it is uncertain whether two data instances are synonymous with the same object. This process can easily diverge from stability. The premise of the presented work is that a robot's motor information instigates successful generation of visual representations. An understanding of self-motion enables a prediction to be made before performing an action, resulting in a stronger belief of data association. The system is implemented as a data-driven partially observable semi-Markov decision process. Object representations are formed as the process's hidden states and are coordinated with motor commands through state transitions. Experiments show the prediction process is essential in enabling the unsupervised learning method to converge to a solution - improving precision and recall over using sensory data alone.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we present an unsupervised graph cut based object segmentation method using 3D information provided by Structure from Motion (SFM), called Grab- CutSFM. Rather than focusing on the segmentation problem using a trained model or human intervention, our approach aims to achieve meaningful segmentation autonomously with direct application to vision based robotics. Generally, object (foreground) and background have certain discriminative geometric information in 3D space. By exploring the 3D information from multiple views, our proposed method can segment potential objects correctly and automatically compared to conventional unsupervised segmentation using only 2D visual cues. Experiments with real video data collected from indoor and outdoor environments verify the proposed approach.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents practical vision-based collision avoidance for objects approximating a single point feature. Using a spherical camera model, a visual predictive control scheme guides the aircraft around the object along a conical spiral trajectory. Visibility, state and control constraints are considered explicitly in the controller design by combining image and vehicle dynamics in the process model, and solving the nonlinear optimization problem over the resulting state space. Importantly, range is not required. Instead, the principles of conical spiral motion are used to design an objective function that simultaneously guides the aircraft along the avoidance trajectory, whilst providing an indication of the appropriate point to stop the spiral behaviour. Our approach is aimed at providing a potential solution to the See and Avoid problem for unmanned aircraft and is demonstrated through a series.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The integration of separate, yet complimentary, cortical pathways appears to play a role in visual perception and action when intercepting objects. The ventral system is responsible for object recognition and identification, while the dorsal system facilitates continuous regulation of action. This dual-system model implies that empirically manipulating different visual information sources during performance of an interceptive action might lead to the emergence of distinct gaze and movement pattern profiles. To test this idea, we recorded hand kinematics and eye movements of participants as they attempted to catch balls projected from a novel apparatus that synchronised or de-synchronised accompanying video images of a throwing action and ball trajectory. Results revealed that ball catching performance was less successful when patterns of hand movements and gaze behaviours were constrained by the absence of advanced perceptual information from the thrower's actions. Under these task constraints, participants began tracking the ball later, followed less of its trajectory, and adapted their actions by initiating movements later and moving the hand faster. There were no performance differences when the throwing action image and ball speed were synchronised or de-synchronised since hand movements were closely linked to information from ball trajectory. Results are interpreted relative to the two-visual system hypothesis, demonstrating that accurate interception requires integration of advanced visual information from kinematics of the throwing action and from ball flight trajectory.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A robust visual tracking system requires an object appearance model that is able to handle occlusion, pose, and illumination variations in the video stream. This can be difficult to accomplish when the model is trained using only a single image. In this paper, we first propose a tracking approach based on affine subspaces (constructed from several images) which are able to accommodate the abovementioned variations. We use affine subspaces not only to represent the object, but also the candidate areas that the object may occupy. We furthermore propose a novel approach to measure affine subspace-to-subspace distance via the use of non-Euclidean geometry of Grassmann manifolds. The tracking problem is then considered as an inference task in a Markov Chain Monte Carlo framework via particle filtering. Quantitative evaluation on challenging video sequences indicates that the proposed approach obtains considerably better performance than several recent state-of-the-art methods such as Tracking-Learning-Detection and MILtrack.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Objects in an environment are often encountered sequentially during spatial learning, forming a path along which object locations are experienced. The present study investigated the effect of spatial information conveyed through the path in visual and proprioceptive learning of a room-sized spatial layout, exploring whether different modalities differentially depend on the integrity of the path. Learning object locations along a coherent path was compared with learning them in a spatially random manner. Path integrity had little effect on visual learning, whereas learning with the coherent path produced better memory performance than random order learning for proprioceptive learning. These results suggest that path information has differential effects in visual and proprioceptive spatial learning, perhaps due to a difference in the way one establishes a reference frame for representing relative locations of objects.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The present study was conducted to investigate whether ob- servers are equally prone to overlook any kinds of visual events in change blindness. Capitalizing on the finding from visual search studies that abrupt appearance of an object effectively captures observers' attention, the onset of a new object and the offset of an existing object were contrasted regarding their detectability when they occurred in a naturalistic scene. In an experiment, participants viewed a series of photograph pairs in which layouts of seven or eight objects were depicted. One object either appeared in or disappeared from the layout, and participants tried to detect this change. Results showed that onsets were detected more quickly than offsets, while they were detected with equivalent ac- curacy. This suggests that the primacy of onset over offset is a robust phenomenon that likely makes onsets more resistant to change blindness under natural viewing conditions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The present study investigated how object locations learned separately are integrated and represented as a single spatial layout in memory. Two experiments were conducted in which participants learned a room-sized spatial layout that was divided into two sets of five objects. Results suggested that integration across sets was performed efficiently when it was done during initial encoding of the environment but entailed cost in accuracy when it was attempted at the time of memory retrieval. These findings suggest that, once formed, spatial representations in memory generally remain independent and integrating them into a single representation requires additional cognitive processes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This fMRI study investigates how audiovisual integration differs for verbal stimuli that can be matched at a phonological level and nonverbal stimuli that can be matched at a semantic level. Subjects were presented simultaneously with one visual and one auditory stimulus and were instructed to decide whether these stimuli referred to the same object or not. Verbal stimuli were simultaneously presented spoken and written object names, and nonverbal stimuli were photographs of objects simultaneously presented with naturally occurring object sounds. Stimulus differences were controlled by including two further conditions that paired photographs of objects with spoken words and object sounds with written words. Verbal matching, relative to all other conditions, increased activation in a region of the left superior temporal sulcus that has previously been associated with phonological processing. Nonverbal matching, relative to all other conditions, increased activation in a right fusiform region that has previously been associated with structural and conceptual object processing. Thus, we demonstrate how brain activation for audiovisual integration depends on the verbal content of the stimuli, even when stimulus and task processing differences are controlled.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

By virtue of its widespread afferent projections, perirhinal cortex is thought to bind polymodal information into abstract object-level representations. Consistent with this proposal, deficits in cross-modal integration have been reported after perirhinal lesions in nonhuman primates. It is therefore surprising that imaging studies of humans have not observed perirhinal activation during visual-tactile object matching. Critically, however, these studies did not differentiate between congruent and incongruent trials. This is important because successful integration can only occur when polymodal information indicates a single object (congruent) rather than different objects (incongruent). We scanned neurologically intact individuals using functional magnetic resonance imaging (fMRI) while they matched shapes. We found higher perirhinal activation bilaterally for cross-modal (visual-tactile) than unimodal (visual-visual or tactile-tactile) matching, but only when visual and tactile attributes were congruent. Our results demonstrate that the human perirhinal cortex is involved in cross-modal, visual-tactile, integration and, thus, indicate a functional homology between human and monkey perirhinal cortices.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

To identify and categorize complex stimuli such as familiar objects or speech, the human brain integrates information that is abstracted at multiple levels from its sensory inputs. Using cross-modal priming for spoken words and sounds, this functional magnetic resonance imaging study identified 3 distinct classes of visuoauditory incongruency effects: visuoauditory incongruency effects were selective for 1) spoken words in the left superior temporal sulcus (STS), 2) environmental sounds in the left angular gyrus (AG), and 3) both words and sounds in the lateral and medial prefrontal cortices (IFS/mPFC). From a cognitive perspective, these incongruency effects suggest that prior visual information influences the neural processes underlying speech and sound recognition at multiple levels, with the STS being involved in phonological, AG in semantic, and mPFC/IFS in higher conceptual processing. In terms of neural mechanisms, effective connectivity analyses (dynamic causal modeling) suggest that these incongruency effects may emerge via greater bottom-up effects from early auditory regions to intermediate multisensory integration areas (i.e., STS and AG). This is consistent with a predictive coding perspective on hierarchical Bayesian inference in the cortex where the domain of the prediction error (phonological vs. semantic) determines its regional expression (middle temporal gyrus/STS vs. AG/intraparietal sulcus).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper provides a preliminary analysis of an autonomous uncooperative collision avoidance strategy for unmanned aircraft using image-based visual control. Assuming target detection, the approach consists of three parts. First, a novel decision strategy is used to determine appropriate reference image features to track for safe avoidance. This is achieved by considering the current rules of the air (regulations), the properties of spiral motion and the expected visual tracking errors. Second, a spherical visual predictive control (VPC) scheme is used to guide the aircraft along a safe spiral-like trajectory about the object. Lastly, a stopping decision based on thresholding a cost function is used to determine when to stop the avoidance behaviour. The approach does not require estimation of range or time to collision, and instead relies on tuning two mutually exclusive decision thresholds to ensure satisfactory performance.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a 100 Hz monocular position based visual servoing system to control a quadrotor flying in close proximity to vertical structures approximating a narrow, locally linear shape. Assuming the object boundaries are represented by parallel vertical lines in the image, detection and tracking is achieved using Plücker line representation and a line tracker. The visual information is fused with IMU data in an EKF framework to provide fast and accurate state estimation. A nested control design provides position and velocity control with respect to the object. Our approach is aimed at high performance on-board control for applications allowing only small error margins and without a motion capture system, as required for real world infrastructure inspection. Simulated and ground-truthed experimental results are presented.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In design studio, sketching or visual thinking is part of processes that assist students to achieve final design solutions. At QUT’s First and Third Year industrial design studio classes we engage in a variety of teaching pedagogies from which we identify ‘Concept Bombs’ as an instrumental in the development of students’ visual thinking and reflective design process, and also as a vehicle to foster positive student engagement. Our ‘formula’: Concept Bombs are 20 minute design tasks focusing on rapid development of initial concept designs and free-hand sketching. Our experience and surveys tell us that students value intensive studio activities especially when combined with timely assessment and feedback. While conventional longer-duration design projects are essential for allowing students to engage with the full depth and complexity of the design process, short and intensive design activities introduce variety to the learning experience and enhance student engagement. This paper presents a comparative analysis of First and Third Year students’ Concept Bomb sketches to describe the types of design knowledge embedded in them, a discussion of limitations and opportunities of this pedagogical technique, as well as considerations for future development of studio based tasks of this kind as design pedagogies in the midst of current university education trends.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper deals with constrained image-based visual servoing of circular and conical spiral motion about an unknown object approximating a single image point feature. Effective visual control of such trajectories has many applications for small unmanned aerial vehicles, including surveillance and inspection, forced landing (homing), and collision avoidance. A spherical camera model is used to derive a novel visual-predictive controller (VPC) using stability-based design methods for general nonlinear model-predictive control. In particular, a quasi-infinite horizon visual-predictive control scheme is derived. A terminal region, which is used as a constraint in the controller structure, can be used to guide appropriate reference image features for spiral tracking with respect to nominal stability and feasibility. Robustness properties are also discussed with respect to parameter uncertainty and additive noise. A comparison with competing visual-predictive control schemes is made, and some experimental results using a small quad rotor platform are given.