990 results for 3D localization
Abstract:
This paper presents a calibration method for the vision system of a planetary exploration rover. A new transformation from the vision coordinate frame to the rover body frame is introduced first, followed by the camera model. During optimization of the camera parameters, the 3D reprojection error is used as the evaluation function, and a genetic algorithm performs the search to ensure that the estimated camera parameters are globally optimal. Experimental results in a real environment show that the method achieves high spatial localization accuracy.
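To make the optimization loop concrete, here is a minimal sketch, not the paper's implementation, of a genetic-algorithm search over camera parameters with reprojection error as the fitness function. The pinhole parameterization, parameter layout, and GA settings are all assumptions.

```python
# Sketch: GA search over camera parameters, fitness = -mean reprojection error.
# Assumed parameter vector: [f, cx, cy, rx, ry, rz, tx, ty, tz] (pinhole model,
# Rodrigues rotation); points are assumed to lie in front of the camera.
import numpy as np

rng = np.random.default_rng(0)

def project(params, pts3d):
    """Project 3D points with a pinhole camera parameterized by `params`."""
    f, cx, cy = params[:3]
    rvec, t = params[3:6], params[6:9]
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        R = np.eye(3)
    else:  # Rodrigues formula
        k = rvec / theta
        K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
        R = np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)
    cam = pts3d @ R.T + t
    return np.column_stack((f * cam[:, 0] / cam[:, 2] + cx,
                            f * cam[:, 1] / cam[:, 2] + cy))

def fitness(params, pts3d, pts2d):
    """Negative mean reprojection error (higher is better)."""
    return -np.mean(np.linalg.norm(project(params, pts3d) - pts2d, axis=1))

def ga_search(pts3d, pts2d, pop=80, gens=200, sigma=0.05):
    # Initialize around plausible values (assumed: f~500, image center 320x240).
    population = rng.normal(size=(pop, 9)) + np.array(
        [500, 320, 240, 0, 0, 0, 0, 0, 5.0])
    for _ in range(gens):
        scores = np.array([fitness(p, pts3d, pts2d) for p in population])
        elite = population[np.argsort(scores)[-pop // 4:]]          # selection
        parents = elite[rng.integers(len(elite), size=(pop, 2))]
        alpha = rng.random((pop, 1))
        children = alpha * parents[:, 0] + (1 - alpha) * parents[:, 1]  # crossover
        children += rng.normal(scale=sigma, size=children.shape)       # mutation
        population = children
        population[0] = elite[-1]                                    # elitism
    scores = np.array([fitness(p, pts3d, pts2d) for p in population])
    return population[np.argmax(scores)]
```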
Abstract:
Computer-assisted surgical navigation applies computer graphics and image-processing techniques to radiological imaging data to reconstruct 2D or 3D medical image models and, combined with various spatial localization techniques, establishes a real-time loop between the surgeon's eyes, the surgical tools, and the patient's head, so that instrument positions can be displayed in real time or near real time during the operation. We review the history and current state of research on computer-assisted surgical navigation systems, with emphasis on their system architecture and key technologies, including spatial localization, image processing and display, system registration, and head localization, and conclude with the development trends of surgical navigation systems.
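Of the key technologies listed, system registration admits a compact illustration. Below is a minimal sketch, assuming paired fiducial points measured in image space and in patient (tracker) space, of standard Kabsch/Horn paired-point rigid registration; this is one common realization of the registration step, not the method of any specific system surveyed.

```python
# Sketch: paired-point rigid registration via SVD (Kabsch/Horn).
# Finds R, t minimizing sum ||R @ a_i + t - b_i||^2 over matched fiducials.
import numpy as np

def register_rigid(image_pts, patient_pts):
    """image_pts, patient_pts: (n, 3) arrays of matched fiducial positions."""
    ci, cp = image_pts.mean(axis=0), patient_pts.mean(axis=0)
    H = (image_pts - ci).T @ (patient_pts - cp)     # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflection
    R = Vt.T @ D @ U.T
    t = cp - R @ ci
    return R, t
```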
Abstract:
Observation and manipulation at the microscopic scale are essential technical means for micro/nano science and technology research, for discovering and exploiting micro/nano-scale properties, and for micro-fabrication. The key technical problems of micro/nano manipulation therefore fall into two areas: imaging for observation, so that objects at the micro/nano scale can be perceived by the observer; and using that perceptual information to guide mechanical manipulation control at the same scale. Depth information occupies an important place in computer vision research, because it lets us better understand the 3D relationships among objects in the real world, and 3D measurement based on depth information is consequently being applied to imaging for micro/nano manipulation. The microscope image of the working area is the only feedback that reflects the motion and position of the manipulated target, so the depth information of the objects can be obtained only from it. Although depth is difficult to recover automatically from such a planar image, the optical characteristics of the microscope's point-spread-function model make it possible to construct a reasonable blur criterion and recover object depth. Taking 3D visual observation and visualization at the microscopic scale as the application background, the author analyzes the imaging laws of geometric optics, establishes a blur criterion for images, and uses this criterion to accomplish microscopic 3D visual observation and visualization. The main work includes: (1) analyzing the basic principles of optical imaging, the conditions under which focused and defocused imaging occur and how they are described, and the relationship between image sharpness/blur and scene depth, which provides the theoretical basis for measuring microscopic scene depth from optical image information; (2) analyzing, from the way sharpness/blur varies across an image sequence, the sensitivity and suitability of different measure operators, and proposing a method for constructing a suitable blur-measure operator; (3) based on the blur-measure operator and a blur-measure distribution model, proposing an experimental method for obtaining a blur-depth relationship model between microscopic images and the actual scene, and designing the experimental system and procedure to carry out microscopic 3D visual observation; (4) building on blur-measure-based acquisition of microscopic scene depth, proposing a 3D reconstruction method for microscopic scenes and realizing 3D reconstruction and visualization at the microscopic scale, with experimental verification. The thesis analyzes the defocus phenomenon in microscopic imaging for micro/nano technology research, presents a method for relating defocus degree to scene depth at the micrometer scale through blur measures, analyzes the differing blur-measure responses of different gradient operators, and verifies experimentally that such blur measures can be used to acquire depth information of microscopic scenes and to perform 3D reconstruction from it.
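The blur-measure idea can be illustrated with a depth-from-focus sketch: assuming a through-focus image stack with known stage depths, a gradient-based sharpness measure (a Tenengrad-style operator, one possible choice of measure operator) is evaluated per pixel, and each pixel takes the depth of its sharpest slice. Array names, shapes, and the operator choice are assumptions, not the thesis's exact procedure.

```python
# Sketch: depth from focus over a through-focus stack.
# stack: (n_slices, H, W) images; zs: stage depth of each slice.
import numpy as np
from scipy.ndimage import uniform_filter

def tenengrad(img):
    """Squared gradient magnitude (central differences) as a sharpness map."""
    img = np.asarray(img, dtype=float)
    gx = np.zeros_like(img); gy = np.zeros_like(img)
    gx[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0
    gy[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0
    return gx * gx + gy * gy

def depth_from_focus(stack, zs, win=9):
    """Per-pixel depth = depth of the locally sharpest slice."""
    # Window-average each sharpness map so the measure is locally supported.
    sharp = np.stack([uniform_filter(tenengrad(img), size=win) for img in stack])
    best = sharp.argmax(axis=0)          # index of sharpest slice per pixel
    return np.asarray(zs)[best]          # (H, W) depth map
```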
Abstract:
Structure from motion often refers to the computation of 3D structure from a matched sequence of images. However, a depth map of a surface is difficult to compute and may not be a good representation for storage and recognition. Given matched images, I will first show that the sign of the normal curvature in a given direction at a given point in the image can be computed from a simple difference of slopes of line-segments in one image. Using this result, local surface patches can be classified as convex, concave, parabolic (cylindrical), hyperbolic (saddle point) or planar. At the same time the translational component of the optical flow is obtained, from which the focus of expansion can be computed.
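The final step, recovering the focus of expansion from the translational flow component, admits a compact generic sketch (not the paper's construction): under pure translation each flow vector at an image point lies along the ray from the FOE through that point, giving one linear constraint per point.

```python
# Sketch: least-squares focus of expansion from a translational flow field.
# Constraint per point: the flow v is parallel to (p - e), i.e.
# v_y * (x - e_x) - v_x * (y - e_y) = 0, which is linear in e = (e_x, e_y).
import numpy as np

def focus_of_expansion(pts, flows):
    """pts, flows: (n, 2) arrays of image points and their flow vectors."""
    A = np.column_stack((-flows[:, 1], flows[:, 0]))        # coeffs of (e_x, e_y)
    b = flows[:, 0] * pts[:, 1] - flows[:, 1] * pts[:, 0]
    e, *_ = np.linalg.lstsq(A, b, rcond=None)
    return e
```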
Abstract:
We consider the problem of matching model and sensory data features in the presence of geometric uncertainty, for the purpose of object localization and identification. The problem is to construct sets of model feature and sensory data feature pairs that are geometrically consistent given that there is uncertainty in the geometry of the sensory data features. If there is no geometric uncertainty, polynomial-time algorithms are possible for feature matching, yet these approaches can fail when there is uncertainty in the geometry of data features. Existing matching and recognition techniques which account for the geometric uncertainty in features either cannot guarantee finding a correct solution, or can construct geometrically consistent sets of feature pairs yet have worst case exponential complexity in terms of the number of features. The major new contribution of this work is to demonstrate a polynomial-time algorithm for constructing sets of geometrically consistent feature pairs given uncertainty in the geometry of the data features. We show that under a certain model of geometric uncertainty the feature matching problem in the presence of uncertainty is of polynomial complexity. This has important theoretical implications by demonstrating an upper bound on the complexity of the matching problem, and by offering insight into the nature of the matching problem itself. These insights prove useful in the solution to the matching problem in higher dimensional cases as well, such as matching three-dimensional models to either two- or three-dimensional sensory data. The approach is based on an analysis of the space of feasible transformation parameters. This paper outlines the mathematical basis for the method, and describes the implementation of an algorithm for the procedure. Experiments demonstrating the method are reported.
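The transformation-space idea can be illustrated in a deliberately simplified setting (2D translation only, box-shaped uncertainty, all names assumed): each model/data pairing constrains the feasible translation to an axis-aligned box, and a set of pairings is geometrically consistent exactly when those boxes have a common intersection. This is a sketch of the principle, not the paper's uncertainty model.

```python
# Sketch: consistency of feature pairs via intersection of feasible
# translation boxes (translation-only transformations, per-axis tolerance eps).
import numpy as np

def feasible_box(model_pt, data_pt, eps):
    """Translations t with |model_pt + t - data_pt| <= eps on each axis."""
    lo = data_pt - model_pt - eps
    hi = data_pt - model_pt + eps
    return lo, hi

def consistent(pairs, eps):
    """pairs: list of (model_pt, data_pt). True if one translation fits all."""
    los, his = zip(*(feasible_box(m, d, eps) for m, d in pairs))
    lo = np.maximum.reduce(los)   # intersection of the boxes
    hi = np.minimum.reduce(his)
    return bool(np.all(lo <= hi))
```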
Abstract:
We explore a representation of 3D objects in which several distinct 2D views are stored for each object. We demonstrate the ability of a two-layer network of thresholded summation units to support such representations. Using unsupervised Hebbian relaxation, we trained the network to recognise ten objects from different viewpoints. The training process led to the emergence of compact representations of the specific input views. When tested on novel views of the same objects, the network exhibited a substantial generalisation capability. In simulated psychophysical experiments, the network's behaviour was qualitatively similar to that of human subjects.
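A minimal sketch of such a network follows; the threshold value, weight normalization, and winner-take-all bootstrap are assumptions added to make a bare Hebbian rule behave, not details from the paper.

```python
# Sketch: two-layer network of thresholded summation units with a simple
# Hebbian update. Units active for an input view strengthen their weights,
# so repeated views converge onto stable representation units.
import numpy as np

rng = np.random.default_rng(0)

class HebbNet:
    def __init__(self, n_in, n_out, theta=0.5, lr=0.1):
        self.W = rng.random((n_out, n_in)) * 0.01
        self.theta, self.lr = theta, lr

    def forward(self, x):
        a = self.W @ x
        y = (a > self.theta).astype(float)      # thresholded summation units
        if not y.any():
            y[np.argmax(a)] = 1.0               # winner-take-all bootstrap
        return y

    def train_step(self, x):
        y = self.forward(x)
        self.W += self.lr * np.outer(y, x)      # Hebbian strengthening
        # Row normalization keeps weights bounded (Oja-style assumption).
        self.W /= np.maximum(np.linalg.norm(self.W, axis=1, keepdims=True), 1e-9)
```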
Abstract:
A polynomial time algorithm (pruned correspondence search, PCS) with good average case performance for solving a wide class of geometric maximal matching problems, including the problem of recognizing 3D objects from a single 2D image, is presented. Efficient verification algorithms, based on a linear representation of location constraints, are given for the case of affine transformations among vector spaces and for the case of rigid 2D and 3D transformations with scale. Some preliminary experiments suggest that PCS is a practical algorithm. Its similarity to existing correspondence-based algorithms means that a number of existing techniques for speedup can be incorporated into PCS to improve its performance.
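The verification step for the affine case admits a short sketch: fit the affine transform to a hypothesized set of model/image pairs by linear least squares and accept only if every residual satisfies a location-constraint tolerance. The tolerance and names are assumptions, and the pruning strategy of PCS itself is not shown.

```python
# Sketch: verify a hypothesized correspondence set under an affine model.
# Fits x' = A x + t by least squares and checks per-point residuals.
import numpy as np

def verify_affine(model_pts, image_pts, tol=2.0):
    """model_pts, image_pts: (n, 2) matched points. Returns (ok, 2x3 [A|t])."""
    n = len(model_pts)
    M = np.hstack((model_pts, np.ones((n, 1))))              # rows [x y 1]
    params, *_ = np.linalg.lstsq(M, image_pts, rcond=None)   # (3, 2)
    residuals = np.linalg.norm(M @ params - image_pts, axis=1)
    return bool(np.all(residuals <= tol)), params.T          # [A | t]
```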
Abstract:
Similarity measurements between 3D objects and 2D images are useful for the tasks of object recognition and classification. We distinguish between two types of similarity metrics: metrics computed in image-space (image metrics) and metrics computed in transformation-space (transformation metrics). Existing methods typically use image metrics, which measure the distance between the observed image and the nearest view of the object. An example of such a measure is the Euclidean distance between feature points in the image and corresponding points in the nearest view. (Computing this measure is equivalent to solving the exterior orientation calibration problem.) In this paper we introduce a different type of metric: transformation metrics. These metrics penalize the deformations applied to the object to produce the observed image. We present a transformation metric that optimally penalizes "affine deformations" under weak-perspective. A closed-form solution, together with the nearest view according to this metric, is derived. The metric is shown to be equivalent to the Euclidean image metric, in the sense that they bound each other from both above and below. For the Euclidean image metric we offer a sub-optimal closed-form solution and an iterative scheme to compute the exact solution.
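As a flattened 2D analogue of the image-metric idea (a simplification for illustration, not the paper's 3D-to-2D setting), the Euclidean distance between corresponding feature points can be minimized in closed form over similarity transforms via orthogonal Procrustes:

```python
# Sketch: 2D image metric = RMS point distance after the best similarity
# alignment (scaled rotation + translation), computed in closed form.
import numpy as np

def image_metric_2d(view_pts, image_pts):
    """view_pts, image_pts: (n, 2) matched feature points."""
    cv, ci = view_pts.mean(axis=0), image_pts.mean(axis=0)
    A, B = view_pts - cv, image_pts - ci
    U, S, Vt = np.linalg.svd(A.T @ B)
    D = np.diag([1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflection
    R = Vt.T @ D @ U.T
    s = np.trace(D @ np.diag(S)) / (A * A).sum()             # optimal scale
    aligned = s * A @ R.T + ci
    return np.sqrt(np.mean(np.sum((aligned - image_pts) ** 2, axis=1)))
```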
Abstract:
A method for localization and positioning in an indoor environment is presented. The method is based on representing the scene as a set of 2D views and predicting the appearances of novel views by linear combinations of the model views. The method is accurate under weak-perspective projection. Analysis of this projection, as well as experimental results, demonstrates that in many cases it is sufficient to describe the scene accurately. When the weak-perspective approximation is invalid, an iterative solution that accounts for the perspective distortions can be employed. A simple algorithm for repositioning, the task of returning to a previously visited position defined by a single view, is derived from this method.
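The prediction step follows the classic linear-combination-of-views result: under weak perspective, each coordinate of a novel view is a linear combination of the coordinates of two stored model views plus a constant. A minimal sketch under that assumption (array shapes and names are illustrative):

```python
# Sketch: recover linear-combination coefficients from matched points in a
# novel view, then predict the image positions of the remaining scene points.
import numpy as np

def view_coefficients(view1, view2, novel):
    """view1, view2, novel: (n, 2) matched point arrays from the same scene."""
    n = len(novel)
    B = np.column_stack((view1, view2, np.ones(n)))   # basis: x1 y1 x2 y2 1
    ax, *_ = np.linalg.lstsq(B, novel[:, 0], rcond=None)
    ay, *_ = np.linalg.lstsq(B, novel[:, 1], rcond=None)
    return ax, ay

def predict_view(view1, view2, ax, ay):
    B = np.column_stack((view1, view2, np.ones(len(view1))))
    return np.column_stack((B @ ax, B @ ay))          # predicted novel view
```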
Abstract:
Model-based object recognition commonly involves using a minimal set of matched model and image points to compute the pose of the model in image coordinates. Furthermore, recognition systems often rely on the "weak-perspective" imaging model in place of the perspective imaging model. This paper discusses computing the pose of a model from three corresponding points under weak-perspective projection. A new solution to the problem is proposed which, like previous solutions, involves solving a biquadratic equation. Here the biquadratic is motivated geometrically and its solutions, comprising an actual and a false solution, are interpreted graphically. The final equations take a new form, which leads to a simple expression for the image position of any unmatched model point.
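The algebraic core reduces to a biquadratic, a quartic with only even powers, which is solved by the standard substitution u = x². A minimal sketch of that step follows; the coefficients a, b, c stand in for the paper's geometric expressions, which are not reproduced here.

```python
# Sketch: real roots of a biquadratic a*x^4 + b*x^2 + c = 0 via u = x^2.
import numpy as np

def solve_biquadratic(a, b, c):
    us = np.roots([a, b, c])                      # roots of a*u^2 + b*u + c
    roots = []
    for u in us:
        if abs(u.imag) < 1e-12 and u.real >= 0:   # x^2 must be real and >= 0
            r = float(np.sqrt(u.real))
            roots.extend([r, -r])
    return sorted(set(roots))
```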
Abstract:
This paper describes the main features of a view-based model of object recognition. The model tries to capture general properties to be expected in a biological architecture for object recognition. The basic module is a regularization network in which each of the hidden units is broadly tuned to a specific view of the object to be recognized.
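A minimal sketch of such a module, with stored views, tuning widths, and output weights all assumed for illustration: each hidden unit is a Gaussian radial basis function centered on an example view, and the network output is a weighted sum of the unit activations.

```python
# Sketch: regularization (RBF) network with view-tuned hidden units.
import numpy as np

class ViewTunedRBF:
    def __init__(self, stored_views, weights, sigma=1.0):
        self.centers = np.asarray(stored_views)   # one hidden unit per view
        self.w = np.asarray(weights)              # output weights
        self.sigma = sigma                        # breadth of view tuning

    def respond(self, view):
        d2 = np.sum((self.centers - view) ** 2, axis=1)
        h = np.exp(-d2 / (2 * self.sigma ** 2))   # broadly tuned activations
        return float(self.w @ h)                  # recognition score
```

A large sigma makes each unit respond to a broad range of views, which is what gives the module its generalization across viewpoints.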