985 results for Object vision


Relevance:

30.00%

Publisher:

Abstract:

In this paper, an active stereo vision-based learning approach is proposed for a robot to track, fixate and grasp an object in unknown environments. First, the functional mapping relationships between the joint angles of the active stereo vision system and the spatial representations of the object are derived and expressed in a three-dimensional workspace frame. Next, the self-adaptive resonance theory-based neural networks and the feedforward neural networks are used to learn the mapping relationships in a self-organized way. Then, the approach is verified by simulation using the models of an active stereo vision system which is installed in the end-effector of a robot. Finally, the simulation results confirm the effectiveness of the present approach.

Relevance:

30.00%

Publisher:

Abstract:

Differential optical flow methods are widely used within the computer vision community. They are classified as either local, as in the Lucas-Kanade method, or global, as in the Horn-Schunck technique. Because the physical dynamics of an object are inherently coupled to the behavior of its image in the video stream, in this paper we use such dynamic parameter information when calculating optical flow to track a moving object in a video stream. Specifically, we use a modified error function in the minimization that contains physical parameter information. Further, the refined estimates of optical flow are used to better estimate the physical parameters of the object in the simultaneous estimation of optical flow and object state (SEOS).

Relevance:

30.00%

Publisher:

Abstract:

This thesis focuses on the enhancement of differential optical flow techniques. The framework of differential optical flow is built upon to improve object motion estimation within a video stream or image sequence. This augmentation comes in the form of a combined optical flow and object state estimation method (SEOS).

Relevance:

30.00%

Publisher:

Abstract:

Investigates the visual information that enables humans to effectively guide their movement through the environment. This problem is fundamental to the study of human behaviour, since survival is contingent upon the acquisition of resources that lie in different locations throughout the environment.

Relevance:

30.00%

Publisher:

Abstract:

Vision-based tracking of an object using perspective projection inherently results in non-linear measurement equations in Cartesian coordinates. The underlying object kinematics can be modelled by a linear system. In this paper we introduce a measurement conversion technique that analytically transforms the non-linear measurement equations obtained from a stereo-vision system into a system of linear measurement equations. We then design a robust linear filter around the converted measurement system. The state estimation error of the proposed filter is bounded, and we provide a rigorous theoretical analysis of this result. The performance of the robust filter developed in this paper is demonstrated via computer simulation and via practical experimentation using a robotic manipulator as a target. The proposed filter is shown to outperform the extended Kalman filter (EKF).
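
As a rough illustration of why stereo measurements are non-linear in Cartesian coordinates, and of how they can be converted back to a position that is linear in the state, the sketch below assumes an idealised rectified, parallel-axis stereo pair. The camera model, the function names and the parameters `focal` and `baseline` are assumptions made for this example; the abstract does not specify the paper's actual conversion, and the noise handling a robust filter would need is omitted.

```python
def project_stereo(point, focal, baseline):
    """Pinhole projection of a 3D point into an idealised rectified stereo pair.

    Assumed model (not taken from the paper): the left camera sits at the
    origin, the right camera is shifted by `baseline` along the x axis.
    The outputs are non-linear in (X, Y, Z) because of the division by Z.
    """
    X, Y, Z = point
    u_left = focal * X / Z
    u_right = focal * (X - baseline) / Z
    v = focal * Y / Z
    return u_left, u_right, v


def convert_measurement(u_left, u_right, v, focal, baseline):
    """Invert the projection analytically to recover (X, Y, Z).

    The converted measurement is the Cartesian position itself, so it is
    linear in a state vector that contains position.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    Z = focal * baseline / disparity
    return u_left * Z / focal, v * Z / focal, Z
```

Projecting a point with `project_stereo` and passing the result to `convert_measurement` with the same `focal` and `baseline` recovers the original coordinates up to floating-point rounding.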

Relevance:

30.00%

Publisher:

Abstract:

This paper describes an investigation into the use of parametric 2D models describing the movement of edges to determine the possible 3D shape, and hence function, of an object. An assumption of this research is that the camera can foveate and track particular features. It is argued that simple 2D analytic descriptions of the movement of edges can be used to infer 3D shape while the camera is moved. This exploits an advantage of foveation, namely that the problem becomes object centred. The problem of correspondence for numerous edge points is overcome by the use of a tree-based representation for the competing hypotheses. Numerous hypotheses are maintained simultaneously, and the approach does not rely on a single kinematic model that assumes constant velocity or acceleration. The numerous advantages of this strategy are described.

Relevance:

30.00%

Publisher:

Abstract:

Recently, a simple yet powerful branch-and-bound method called Efficient Subwindow Search (ESS) was developed to speed up sliding window search in object detection. A major drawback of ESS is that its computational complexity varies widely from O(n^2) to O(n^4) for n × n matrices. Our experimental experience shows that the performance of ESS is highly related to the optimal confidence levels, which indicate the probability of the object's presence. In particular, when the object is not in the image, the optimal subwindow scores low and ESS may take a large number of iterations to converge to the optimal solution, and so performs very slowly. Addressing this problem, we present two significantly faster methods based on the linear-time Kadane's Algorithm for 1D maximum subarray search. The first algorithm is a novel, computationally superior branch-and-bound method where the worst-case complexity is reduced to O(n^3). Experiments on the PASCAL VOC 2006 data set demonstrate that this method is significantly and consistently faster (approximately 30 times faster on average) than the original ESS. Our second algorithm is an approximate algorithm based on alternating search, whose computational complexity is typically O(n^2). Experiments show that (on average) it is 30 times faster again than our first algorithm, or 900 times faster than ESS. It is thus well suited for real-time object detection.
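
For reference, the linear-time 1D maximum subarray search that the abstract builds on can be sketched as follows. This is only the classic Kadane's Algorithm on a list of scores, not the paper's 2D subwindow methods; the function name `max_subarray` is chosen for this example.

```python
def max_subarray(scores):
    """Kadane's Algorithm: find the contiguous run with the largest sum.

    Returns (best_sum, start, end), where the run is scores[start:end + 1].
    Runs in O(n) time and O(1) extra space.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    best_sum = cur_sum = scores[0]
    best_start = best_end = cur_start = 0
    for i in range(1, len(scores)):
        if cur_sum < 0:
            # A negative running sum can only hurt, so restart the run here.
            cur_sum = scores[i]
            cur_start = i
        else:
            cur_sum += scores[i]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i
    return best_sum, best_start, best_end
```

For example, `max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4])` returns `(6, 3, 6)`, the run `[4, -1, 2, 1]`.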

Relevance:

30.00%

Publisher:

Abstract:

Traditional methods of object recognition are reliant on shape and so are very difficult to apply in cluttered, wide-angle and low-detail views such as surveillance scenes. To address this, a method of indirect object recognition is proposed, where human activity is used to infer both the location and identity of objects. No shape analysis is necessary. The concept is dubbed 'interaction signatures', since the premise is that a human will interact with objects in ways characteristic of the function of that object: for example, a person sits in a chair and drinks from a cup. The human-centred approach means that recognition is possible in low-detail views and is largely invariant to the shape of objects within the same functional class. This paper implements a Bayesian network for classifying region patches with object labels, building upon our previous work in automatically segmenting and recognising a human's interactions with the objects. Experiments show that interaction signatures can successfully find and label objects in low-detail views and are equally effective at recognising test objects that differ markedly in appearance from the training objects.

Relevance:

30.00%

Publisher:

Abstract:

Without the ability to foveate on a target and maintain foveation, active vision systems for applications such as surveillance, object recognition and object tracking are difficult to build. Although foveation in Cartesian coordinates is being actively pursued by many, multi-resolution, high-accuracy foveation in log-polar space has not been given much attention. This paper addresses the use of foveation to track a single object as well as multiple objects for a simulated space-variant active vision system. Complex logarithmic mapping is chosen firstly because it provides high resolution and wide-angle viewing. Secondly, the spatially variant structure of log-polar space leads to an object increasing in size as it moves towards the fovea. This is important because it tells us which object is closer to the fovea at any instant in time.
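
A minimal sketch of the complex logarithmic (log-polar) mapping may help show why an object appears larger as it nears the fovea. The function names and the choice of the fovea as the mapping centre `(cx, cy)` are assumptions for this example; the abstract does not give the paper's exact formulation.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map a Cartesian point to (rho, theta) = (log r, angle) about the fovea (cx, cy)."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0.0:
        raise ValueError("the fovea centre itself has no finite log-polar image")
    return math.log(r), math.atan2(dy, dx)


def from_log_polar(rho, theta, cx=0.0, cy=0.0):
    """Inverse mapping from (rho, theta) back to Cartesian coordinates."""
    r = math.exp(rho)
    return cx + r * math.cos(theta), cy + r * math.sin(theta)


def radial_extent(r, size):
    """Extent along the rho axis of a radial segment of length `size` starting at radius r."""
    return math.log((r + size) / r)
```

Under this mapping a segment of fixed Cartesian length covers a larger interval along the rho axis when it lies closer to the centre: `radial_extent(2.0, 1.0)` is greater than `radial_extent(20.0, 1.0)`.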

Relevance:

30.00%

Publisher:

Abstract:

In this paper, a fuzzy linear regression (FLR) model integrated with a genetic algorithm (GA) is proposed. The proposed GA-FLR model is applied to the modeling of a stereo vision system. A set of empirical data from stereo vision object measurement is collected based on the full factorial design technique. Three regression models, namely ordinary least-squares regression (OLS), FLR, and GA-FLR, are developed and their performances compared. The results show that the proposed GA-FLR model performs better than OLS and FLR in modeling a stereo vision system.

Relevance:

30.00%

Publisher:

Abstract:

This paper proposes a vision-based autonomous move-to-grasp approach for a compact mobile manipulator operating in low and confined environments. The visual information of a specified object, marked with a radial symbol and an overhead colour block, is extracted from two CMOS cameras in an embedded way. Furthermore, the mobile platform and the postures of the manipulator are adjusted continuously by vision-based control, which drives the mobile manipulator towards the object. When the mobile manipulator is sufficiently close to the object, only the manipulator moves to grasp the object, using incremental movements in which the centre of the head end of the end-effector follows a Bezier curve. The effectiveness of the proposed approach is verified by experiments.
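
To make the Bezier-curve path concrete, the sketch below evaluates a Bezier curve with de Casteljau's algorithm. The control points and the function name `bezier_point` are illustrative assumptions; the abstract does not state the degree of the curve or how the paper chooses its control points.

```python
def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] using de Casteljau's algorithm.

    `control_points` is a sequence of equal-length coordinate tuples; the
    curve's degree is len(control_points) - 1.
    """
    pts = [tuple(p) for p in control_points]
    if not pts:
        raise ValueError("at least one control point is required")
    while len(pts) > 1:
        # Repeatedly interpolate between neighbouring points until one remains.
        pts = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])
        ]
    return pts[0]


# Hypothetical waypoints for an end-effector approach path (arbitrary units).
path = [(0, 0, 0), (1, 2, 0), (3, 2, 0), (4, 0, 0)]
samples = [bezier_point(path, i / 10) for i in range(11)]
```

Sampling `t` in small steps, as in `samples`, yields successive targets that start at the first control point and end at the last, which suits the incremental movement the abstract describes.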

Relevance:

30.00%

Publisher:

Abstract:

Sparse representation has been introduced to address many recognition problems in computer vision. In this paper, we propose a new framework for object categorization based on sparse representation of local features. Unlike most previous sparse-coding-based methods in object classification, which use sparse coding only to extract high-level features, the proposed method incorporates sparse representation and classification into a unified framework. Therefore, it does not need a further classifier. Experimental results show that the proposed method achieves accuracy better than or comparable to that of the well-known bag-of-features representation with various classifiers.

Relevance:

30.00%

Publisher:

Abstract:

The purpose of this study is to prove the convergence of the simultaneous estimation of optical flow and object state (SEOS) method. The SEOS method utilizes dynamic object parameter information when calculating optical flow in tracking a moving object within a video stream. Optical flow estimation for the SEOS method requires the minimization of an error function containing the object's physical parameter data. When this function is discretized, the Euler-Lagrange equations form a system of linear equations. The system is arranged such that its property matrix is symmetric positive definite, which proves the convergence of the Gauss-Seidel iterative method. The system of linear equations produced by SEOS can alternatively be solved by Jacobi iterative schemes. The symmetric positive definite property alone is not sufficient for Jacobi convergence. The convergence of SEOS for a block diagonal Jacobi scheme is proved by analysing the Euclidean norm of the Jacobi matrix. In this paper, we also investigate the use of SEOS for tracking individual objects within a video sequence. The illustrations provided show the effectiveness of SEOS for localizing objects within a video sequence and generating optical flow results.
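
The convergence property the abstract relies on can be illustrated with a generic Gauss-Seidel solver applied to a small symmetric positive definite system. This is a textbook sketch, not the SEOS system itself; the matrix `A`, the vector `b` and the stopping tolerance are made up for the example.

```python
def gauss_seidel(A, b, iterations=100, tol=1e-10):
    """Solve A x = b by Gauss-Seidel iteration, starting from x = 0.

    Convergence is guaranteed when A is symmetric positive definite.
    A is a list of rows; b is a list of the same length.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        max_change = 0.0
        for i in range(n):
            # Use the newest available values of x (the defining feature of Gauss-Seidel).
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new_xi = (b[i] - s) / A[i][i]
            max_change = max(max_change, abs(new_xi - x[i]))
            x[i] = new_xi
        if max_change < tol:
            break
    return x


# Illustrative symmetric positive definite system; exact solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = gauss_seidel(A, b)
```

A Jacobi scheme would differ only in computing every `new_xi` from the previous sweep's values, which is why its convergence needs a separate argument.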

Relevance:

30.00%

Publisher:

Abstract:

Object segmentation is widely recognized as one of the most challenging problems in computer vision. One major problem of existing methods is that most of them are vulnerable to cluttered backgrounds. Moreover, human intervention is often required to specify foreground/background priors, which restricts the use of object segmentation in real-world scenarios. To address these problems, we propose a novel approach that learns complementary saliency priors for foreground object segmentation in complex scenes. Unlike existing saliency-based segmentation approaches, we propose to learn two complementary saliency maps that reveal the most reliable foreground and background regions. Given such priors, foreground object segmentation is formulated as a binary pixel labelling problem that can be efficiently solved using graph cuts. As such, the confident saliency priors can be utilized to extract the most salient objects and reduce the distraction of cluttered backgrounds. Extensive experiments show that our approach substantially outperforms 16 state-of-the-art methods on three public image benchmarks.