987 results for Head pose estimation
Abstract:
Visual sea-floor mapping is a rapidly growing application for Autonomous Underwater Vehicles (AUVs). AUVs are well-suited to the task as they remove humans from a potentially dangerous environment, can reach depths human divers cannot, and are capable of long-term operation in adverse conditions. Sea-floor maps generated by AUVs have a number of applications in scientific monitoring: from classifying coral in high biological value sites to surveying sea sponges to evaluate marine environment health.
Abstract:
Stereo visual odometry has received little investigation in high altitude applications due to the generally poor performance of rigid stereo rigs at extremely small baseline-to-depth ratios. Without additional sensing, metric scale is considered lost and the odometry is treated as effectively monocular. This paper presents a novel modification to stereo-based visual odometry that allows accurate, metric pose estimation from high altitudes, even in the presence of poor calibration and without additional sensor inputs. By relaxing the (typically fixed) stereo transform during bundle adjustment and reducing the dependence on the fixed geometry for triangulation, metrically scaled visual odometry can be obtained in situations where high altitude and structural deformation from vibration would cause traditional algorithms to fail. This is achieved through the use of a novel constrained bundle adjustment routine and an accurately scaled pose initializer. We present visual odometry results demonstrating the technique on a short-baseline stereo pair inside a fixed-wing UAV flying at significant height (~30-100 m).
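The relaxed stereo transform described above can be sketched as a residual function in which the stereo baseline is a free optimization variable, softly anchored to its calibrated value rather than held rigid. This is a minimal illustration under our own simplifying assumptions (translation-only relaxation, fixed inter-camera rotation, made-up function names), not the paper's actual constrained bundle adjustment:

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of 3D point X into a camera with pose (R, t)."""
    x = K @ (R @ X + t)
    return x[:2] / x[2]

def relaxed_stereo_residuals(params, K, obs_l, obs_r, t_prior, w_prior):
    """Residuals for one landmark seen by a stereo pair whose baseline
    translation is a free variable, softly anchored to its calibrated
    value (the inter-camera rotation is held fixed here for brevity).

    params = [X (3), t_rl (3)]: landmark position and right-camera
    translation relative to the left camera; t_prior is the calibrated
    baseline translation.
    """
    X, t_rl = params[:3], params[3:6]
    r = []
    # Reprojection errors in both cameras.
    r.extend(project(K, np.eye(3), np.zeros(3), X) - obs_l)
    r.extend(project(K, np.eye(3), t_rl, X) - obs_r)
    # Soft prior: penalize drift of the stereo transform away from its
    # calibrated value instead of fixing it rigidly.
    r.extend(w_prior * (t_rl - t_prior))
    return np.array(r)
```

Feeding such residuals to a nonlinear least-squares solver lets the optimizer absorb calibration error and structural flexing while the prior term keeps the metric scale observable.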
Abstract:
Next-generation autonomous underwater vehicles (AUVs) will be required to robustly identify underwater targets for tasks such as inspection, localization, and docking. Given their often unstructured operating environments, vision offers enormous potential in underwater navigation over more traditional methods; however, unreliable target segmentation often plagues these systems. This paper addresses robust vision-based target recognition by presenting a novel scale and rotationally invariant target design and recognition routine based on self-similar landmarks that enables robust target pose estimation with respect to a single camera. These algorithms are applied to an AUV with controllers developed for vision-based docking with the target. Experimental results show that the system performs exceptionally on limited processing power and demonstrates how the combined vision and controller system enables robust target identification and docking in a variety of operating conditions.
Abstract:
This paper presents a pose estimation approach that is resilient to typical sensor failure and suitable for low-cost agricultural robots. Guiding large agricultural machinery with highly accurate GPS/INS systems has become standard practice; however, these systems are inappropriate for smaller, lower-cost robots. Our positioning system estimates pose by fusing data from a low-cost global positioning sensor, low-cost inertial sensors and a new technique for vision-based row tracking. The results first demonstrate that our positioning system will accurately guide a robot to perform a coverage task across a 6 hectare field. The results then demonstrate that our vision-based row tracking algorithm improves the performance of the positioning system despite long periods of precision correction signal dropout and intermittent dropouts of the entire GPS sensor.
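At its core, fusing a low-rate GPS fix with higher-rate inertial and vision-based measurements reduces to repeatedly blending an estimate with a reading, each weighted by its uncertainty. The scalar Kalman update below is a deliberately minimal one-dimensional sketch of that idea, not the paper's actual estimator:

```python
def fuse(estimate, var_est, measurement, var_meas):
    """One scalar Kalman update: blend a predicted state with a sensor
    reading, weighting each by its inverse variance. A measurement with
    large variance (e.g. GPS during correction-signal dropout) barely
    moves the estimate; a confident one (e.g. vision-based row tracking)
    pulls it strongly.
    """
    k = var_est / (var_est + var_meas)       # Kalman gain in [0, 1]
    fused = estimate + k * (measurement - estimate)
    var = (1.0 - k) * var_est                # fused variance always shrinks
    return fused, var
```

For example, fusing an estimate of 0.0 m (variance 1.0) with a reading of 2.0 m (variance 1.0) yields 1.0 m with variance 0.5: equal trust splits the difference and halves the uncertainty.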
Abstract:
The ability to measure surface temperature and represent it on a metrically accurate 3D model has proven applications in many areas such as medical imaging, building energy auditing, and search and rescue. A system is proposed that enables this task to be performed with a handheld sensor, and for the first time with results able to be visualized and analyzed in real-time. A device comprising a thermal-infrared camera and range sensor is calibrated geometrically and used for data capture. The device is localized using a combination of ICP and video-based pose estimation from the thermal-infrared video footage which is shown to reduce the occurrence of failure modes. Furthermore, the problem of misregistration which can introduce severe distortions in assigned surface temperatures is avoided through the use of a risk-averse neighborhood weighting mechanism. Results demonstrate that the system is more stable and accurate than previous approaches, and can be used to accurately model complex objects and environments for practical tasks.
Abstract:
Sensing the mental, physical and emotional demand of a driving task is of primary importance in road safety research and for effectively designing in-vehicle information systems (IVIS). In particular, the need for cars capable of sensing and reacting to the emotional state of the driver has been repeatedly advocated in the literature. Algorithms and sensors to identify patterns of human behavior, such as gestures, speech, eye gaze and facial expression, are becoming available using low-cost hardware. This paper presents a new system that uses surrogate measures such as facial expression (emotion) and head pose and movements (intention) to infer task difficulty in a driving situation. Eleven drivers were recruited and observed in a simulated driving task that involved several pre-programmed events aimed at eliciting emotive reactions, such as being stuck behind slower vehicles, intersections and roundabouts, and potentially dangerous situations. The resulting system, combining facial expression and head pose classification, is capable of recognizing dangerous events (such as crashes and near misses) and stressful situations (e.g. intersections and giving way) that occur during the simulated drive.
Abstract:
We propose and evaluate a novel methodology to identify the rolling shutter parameters of a real camera. We also present a model for the geometric distortion introduced when a moving camera with a rolling shutter views a scene. Unlike previous work this model allows for arbitrary camera motion, including accelerations, is exact rather than a linearization and allows for arbitrary camera projection models, for example fisheye or panoramic. We show the significance of the errors introduced by a rolling shutter for typical robot vision problems such as structure from motion, visual odometry and pose estimation.
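The geometric effect at the heart of the rolling shutter model is that each image row is exposed at a slightly different time, so camera motion during readout shifts rows relative to one another. The sketch below illustrates the simplest constant-velocity, lateral-translation case (all names and numbers are ours; the paper's model handles arbitrary motion and projection models):

```python
def rowwise_offset(row, n_rows, t_readout, velocity):
    """Camera displacement (meters) at the moment a given row is exposed.

    A rolling shutter exposes rows sequentially, so row i is captured at
    time (i / n_rows) * t_readout after the frame start; a camera moving
    at constant `velocity` has moved by velocity * that time.
    """
    t = (row / n_rows) * t_readout
    return velocity * t

def rolling_shutter_shift_px(row, n_rows, t_readout, velocity, depth, focal_px):
    """Horizontal pixel shift of a point at `depth` (m) caused by lateral
    camera motion during readout, relative to a global-shutter image."""
    dx = rowwise_offset(row, n_rows, t_readout, velocity)
    return focal_px * dx / depth
```

Even modest numbers produce shifts that matter for structure from motion: with a 30 ms readout, 1 m/s lateral motion, a 500 px focal length and a point 10 m away, the bottom row of a 480-row image is displaced by 1.5 px relative to the top row.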
Abstract:
We contribute an empirically derived noise model for the Kinect sensor. We systematically measure both lateral and axial noise distributions, as a function of both distance and angle of the Kinect to an observed surface. The derived noise model can be used to filter Kinect depth maps for a variety of applications. Our second contribution applies our derived noise model to the KinectFusion system to extend filtering, volumetric fusion, and pose estimation within the pipeline. Qualitative results show our method allows reconstruction of finer details and the ability to reconstruct smaller objects and thinner surfaces. Quantitative results also show our method improves pose estimation accuracy. © 2012 IEEE.
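A distance-dependent noise model of the kind described above can be used directly to filter a depth map: a pixel is an outlier if it deviates from its neighborhood by more than a few predicted standard deviations. The sketch below assumes a quadratic axial-noise fit of the form reported for the Kinect; the coefficients and the median-replacement strategy are illustrative and should be re-measured and tuned for a specific sensor:

```python
import numpy as np

def axial_sigma(z):
    """Axial (depth) noise std-dev in meters as a function of distance z
    (meters), using a quadratic fit of the form reported for the Kinect
    at near-perpendicular viewing angles (illustrative coefficients)."""
    return 0.0012 + 0.0019 * (z - 0.4) ** 2

def filter_depth(depth, k=3.0):
    """Suppress depth pixels deviating from their 3x3 neighborhood median
    by more than k times the distance-dependent axial noise."""
    h, w = depth.shape
    out = depth.copy()
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            window = depth[i - 1:i + 2, j - 1:j + 2]
            med = np.median(window)
            if abs(depth[i, j] - med) > k * axial_sigma(med):
                out[i, j] = med  # replace the outlier with the local median
    return out
```

Because the threshold grows quadratically with distance, the filter is strict on nearby surfaces (where the sensor is precise) and tolerant far away, which is what lets fine detail survive near the camera.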
Abstract:
This thesis explored the utility of long-range stereo visual odometry for application on Unmanned Aerial Vehicles. Novel parameterisations and initialisation routines were developed for the long-range case of stereo visual odometry and new optimisation techniques were implemented to improve the robustness of visual odometry in this difficult scenario. In doing so, the applications of stereo visual odometry were expanded and shown to perform adequately in situations that were previously unworkable.
Abstract:
This paper presents a new online multi-classifier boosting algorithm for learning object appearance models. In many cases the appearance model is multi-modal, which we capture by training and updating multiple strong classifiers. The proposed algorithm jointly learns the classifiers and a soft partitioning of the input space, defining an area of expertise for each classifier. We show how this formulation improves the specificity of the strong classifiers, allowing simultaneous location and pose estimation in a tracking task. The proposed online scheme iteratively adapts the classifiers during tracking. Experiments show that the algorithm successfully learns multi-modal appearance models during a short initial training phase, subsequently updating them for tracking an object under rapid appearance changes. © 2010 IEEE.
Abstract:
Opengazer is an open source application that uses an ordinary webcam to estimate head pose, facial gestures, or the direction of your gaze. This information can then be passed to other applications. For example, used in conjunction with Dasher, opengazer allows you to write with your eyes. Opengazer aims to be a low-cost software alternative to commercial hardware-based eye trackers. The first version of Opengazer was developed by Piotr Zieliński, supported by Samsung and the Gatsby Charitable Foundation. Research and development for Opengazer has been continued by Emli-Mari Nel, and was supported until 2012 by the European Commission in the context of the AEGIS project, and also by the Gatsby Charitable Foundation.
Abstract:
Vision trackers have been proposed as a promising alternative for tracking at large-scale, congested construction sites. They provide the location of a large number of entities in a camera view across frames. However, vision trackers provide only two-dimensional (2D) pixel coordinates, which are not adequate for construction applications. This paper proposes and validates a method that overcomes this limitation by employing stereo cameras and converting 2D pixel coordinates to three-dimensional (3D) metric coordinates. The proposed method consists of four steps: camera calibration, camera pose estimation, 2D tracking, and triangulation. Given that the method employs fixed, calibrated stereo cameras with a long baseline, appropriate algorithms are selected for each step. Once the first two steps reveal camera system parameters, the third step determines 2D pixel coordinates of entities in subsequent frames. The 2D coordinates are triangulated on the basis of the camera system parameters to obtain 3D coordinates. The methodology presented in this paper has been implemented and tested with data collected from a construction site. The results demonstrate the suitability of this method for on-site tracking purposes.
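The fourth step above, triangulation, is the textbook operation of intersecting two viewing rays given the camera system parameters. A common way to do it is linear (DLT) triangulation, sketched below with numpy; the abstract does not say which triangulation algorithm the authors selected, so this is one standard choice, not necessarily theirs:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: recover the 3D point whose projections
    through the 3x4 camera matrices P1, P2 are the pixel coordinates
    x1, x2. Each observation contributes two rows of A, and A X = 0 is
    solved by SVD before dehomogenizing."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```

With the long, fixed baseline the method assumes, the two rays intersect at a well-conditioned angle, which is what makes metric 3D coordinates of tracked entities recoverable from 2D pixel tracks.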
Abstract:
In many fields such as industry, aerospace, and medicine, it is often necessary to measure the relative pose between the coordinate frames of two objects in space. Pose measurement methods include sonar or laser ranging, GPS, and vision-based methods, among others; vision-based methods play an increasingly important role owing to their rich information content and fast processing. In model-based monocular vision pose measurement, the accuracy of the measurement system depends on many factors, including camera calibration error, image coordinate detection error, and target model measurement error; camera calibration error is a principal source of error affecting pose measurement accuracy and is an unavoidable, intrinsic error of the measurement system.

In practical applications, improving the measurement accuracy of the whole system is an important task in 3D visual measurement, and improving camera calibration accuracy will effectively improve pose measurement accuracy. Addressing the problems that traditional camera calibration methods present in engineering practice, and aiming to improve the accuracy of model-based monocular vision pose measurement, this work combines theoretical derivation with simulation experiments. First, the intrinsic and extrinsic camera parameters are solved from their geometric meaning, providing a deep understanding of the physical meaning of each parameter. On this basis, the work then focuses on the choice of the calibration space and on analyzing the relationships between calibration parameter errors and pose measurement errors, and between calibration parameter errors and the measurement space. Finally, building on the theoretical results, camera calibration strategies are proposed for engineering applications to improve the accuracy of the whole system.

First, in the study of solving camera intrinsic and extrinsic parameters from their geometric meaning, the parameters are derived geometrically starting from the camera projection matrix; in the process, the constraints that the matrix describing the perspective projection transformation must satisfy are also derived geometrically. This helps to understand deeply the physical meaning and geometric relationships of the camera parameters and to analyze, from an intuitive geometric viewpoint, their influence on pose accuracy.

Second, regarding the influence of the calibration space on pose measurement accuracy, the relationship between the calibration space and calibration parameter errors is derived first; on this basis, the relationship between calibration parameter errors and pose measurement errors is then given. Experimental results show that, regardless of the imaging extent of the test target, calibrating over the full field of view yields the smallest calibration error and thus higher pose measurement accuracy. These results provide a theoretical basis for camera calibration in practical measurement systems and guidance for the arrangement and placement of calibration points.

Finally, in the study of how camera calibration parameter errors affect pose measurement accuracy, a mathematical model of how pose measurement errors depend on calibration parameter errors is established using error propagation, combining theoretical derivation with simulation experiments and geometric interpretation. The analysis yields the following conclusions: position accuracy along the measurement distance direction is mainly affected by errors in the focal ratio and in the extrinsic translation along the optical axis; attitude-angle accuracy is mainly affected by principal-point errors and extrinsic rotation-angle errors. Regarding the relationship between calibration parameter errors and the measurement space, intrinsic and extrinsic parameter errors have different influences at different measurement distances: at close range, the error in the extrinsic translation along the optical axis dominates, while at long range, the intrinsic focal-ratio error dominates. These conclusions can guide the choice of camera calibration method, with different algorithms adopted in different situations to meet practical accuracy requirements, thereby optimizing the system design and improving the performance of the whole positioning system; they are of practical significance for the engineering application of visual pose measurement systems.
Abstract:
Pose estimation is an important research topic in computer vision, with wide application in target localization, camera calibration, hand-eye systems, mobile robots, and 3D reconstruction. Common vision-based methods for determining target pose include model-based monocular methods and binocular (stereo) methods. At present, monocular methods mostly use point and line features, because point and line features are relatively easy to extract and mathematically simple. When measuring target pose, image-coordinate noise on feature points caused by pixel quantization is unavoidable and necessarily introduces errors into the computed pose. At the same time, to make full use of the point and line features extracted during image processing, it is desirable to perform pose estimation with both kinds of features under a single unified mathematical model. Addressing these problems, this work makes the following contributions.

First, dual quaternions are used to represent rotation and translation simultaneously, so that the point-feature and line-feature constraints are unified into quadratic-form constraints. This form lends itself to simplification by pseudo-linearization, after which the pose parameters are solved by an iterative algorithm.

Then, to account for the influence of quantization error in pose estimation, an errors-in-variables (EIV) model is introduced to describe the quantization error affecting the projection of point and line features. An optimization objective is given, and an iterative algorithm based on singular value decomposition is proposed to estimate the pose parameters; a convergence rule added to the iteration resolves the oscillation encountered in practice and guarantees convergence. Compared with traditional iterative algorithms, the proposed algorithm has two features: first, it is insensitive to the initial value, converging even from random initializations in the experiments; second, the iteration does not rely on traditional step-length search and converges quickly. In addition, the statistical properties of the pose parameter estimates are studied, and the estimates are shown to be unbiased.

Finally, considering several factors that affect pose estimation results, such as quantization error, the number of point and line features, and the distance of the target from the camera, simulation and real experiments are designed to test pose estimation with point features and with line features. The results show that the algorithm applies to any configuration of three or more point and line features, is insensitive to the initial value, converges quickly, and improves the accuracy and robustness of the pose estimation results.
Abstract:
Camera localization based on bodies of revolution is a relatively little-studied and difficult problem in monocular cooperative-target localization; traditional localization methods based on point, line, and curve primitives all run into problems when applied to bodies of revolution. This paper designs a geometric model consisting of four mutually tangent ellipses wrapped around the surface of a cylinder. Using the property that the projection of a conic is still a conic, together with the corresponding properties of the ellipse, three coordinate points that uniquely determine the position of the model can be obtained, reducing localization of the body of revolution to a P3P problem. After analyzing the solution-mode regions of P3P, a criterion is derived, with proof, for determining the solution mode of the P3P problem from the bending of the visible curves on the model. Simulation experiments show the effectiveness of this model-based localization method. Finally, the model is used to guide a manipulator through a target localization experiment.