885 results for SIFT, Computer Vision, Python, Object Recognition, Feature Detection, Descriptor Computation
Abstract:
There are over 600,000 bridges in the US, and not all of them can be inspected and maintained within the specified time frame. Manually inspecting bridges is a time-consuming and costly task, and some state Departments of Transportation (DOT) cannot afford the necessary costs and manpower. In this paper, a novel method that can detect large-scale bridge concrete columns is proposed, with the ultimate goal of creating an automated bridge condition assessment system. The method employs image stitching techniques (feature detection and matching, image affine transformation and blending) to combine images containing different segments of one column into a single image. Bridge columns are then detected by locating their boundaries and classifying the material within each boundary in the stitched image. Preliminary test results on 114 concrete bridge columns stitched from 373 close-up, partial images of the columns indicate that the method correctly detects 89.7% of these elements, demonstrating the viability of this research.
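The stitching step described above can be illustrated with a short OpenCV sketch, assuming two overlapping photographs of the same column; the file names and the simple averaging blend are illustrative choices, not the authors' exact pipeline.

```python
# Minimal sketch: SIFT matching + robust affine warp + naive blending.
import cv2
import numpy as np

img1 = cv2.imread("lower.jpg")   # lower segment of the column (hypothetical file)
img2 = cv2.imread("upper.jpg")   # upper, partially overlapping segment (hypothetical file)

sift = cv2.SIFT_create()
k1, d1 = sift.detectAndCompute(cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY), None)
k2, d2 = sift.detectAndCompute(cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY), None)

# Match descriptors and keep only distinctive matches (Lowe's ratio test).
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(d2, d1, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

src = np.float32([k2[m.queryIdx].pt for m in good])
dst = np.float32([k1[m.trainIdx].pt for m in good])

# Robust affine estimate mapping img2 into img1's coordinate frame.
A, inliers = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)

# Warp img2 onto a taller canvas and blend by averaging in the overlap.
h, w = img1.shape[:2]
canvas = cv2.warpAffine(img2, A, (w, h * 2))
canvas[:h] = np.where(canvas[:h] > 0, canvas[:h] // 2 + img1 // 2, img1)
cv2.imwrite("stitched_column.jpg", canvas)
```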
Abstract:
Only very few constructed facilities today have a complete record of as-built information. Despite the growing use of Building Information Modelling and the improvement in as-built records, several more years will be required before guidelines that require as-built data modelling will be implemented for the majority of constructed facilities, and this will still not address the stock of existing buildings. A technical solution for scanning buildings and compiling Building Information Models is needed. However, this is a multidisciplinary problem, requiring expertise in scanning, computer vision and videogrammetry, machine learning, and parametric object modelling. This paper outlines the technical approach proposed by a consortium of researchers that has gathered to tackle the ambitious goal of automating as-built modelling as far as possible. The top level framework of the proposed solution is presented, and each process, input and output is explained, along with the steps needed to validate them. Preliminary experiments on the earlier stages (i.e. processes) of the framework proposed are conducted and results are shown; the work toward implementation of the remainder is ongoing.
Abstract:
Among other concerns, the on-site inspection process is mainly concerned with finding the right design and specifications information needed to inspect each newly constructed segment or element. While inspecting steel erection, for example, inspectors need to locate the right drawings for each member and the corresponding specification sections that describe, among other things, the allowable deviations in placement. These information-seeking tasks are highly monotonous, time-consuming and error-prone, owing to the high similarity of drawings and constructed elements and the abundance of information involved, which can confuse the inspector. To address this problem, this paper presents the first steps of research investigating the requirements of an automated computer vision-based approach to identify “as-built” information and use it to retrieve “as-designed” project information for field construction, inspection, and maintenance tasks. Under this approach, a visual pattern recognition model was developed that aims to allow automatic identification of construction entities and materials visible in the camera’s field of view at a given time and location, and automatic retrieval of relevant design and specifications information.
Abstract:
This paper presents a novel way to speed up the evaluation time of a boosting classifier. We make a shallow (flat) network deep (hierarchical) by growing a tree from the decision regions of a given boosting classifier. The tree provides many short paths for speeding up evaluation while preserving the reasonably smooth decision regions of the boosting classifier for good generalisation. To convert a boosting classifier into a decision tree, we formulate a Boolean optimisation problem, which has previously been studied for circuit design but limited to a small number of binary variables. In this work, a novel optimisation method is proposed, first for several tens of variables, i.e. the weak learners of a boosting classifier, and then for any larger number of weak learners by using a two-stage cascade. Experiments on synthetic and face image data sets show that the obtained tree achieves a significant speed-up over both a standard boosting classifier and Fast-exit, a previously described method for speeding up boosting classification, at the same accuracy. The proposed method, as a general meta-algorithm, is also useful for a boosting cascade, where it speeds up individual stage classifiers by different gains. The proposed method is further demonstrated for fast-moving object tracking and segmentation problems. © 2011 Springer Science+Business Media, LLC.
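The speed-up argument can be made concrete with a small sketch of the fast-exit baseline mentioned above: weak learners are evaluated in order and evaluation stops as soon as the remaining weighted votes can no longer change the sign of the score. The stump representation and data below are illustrative, not taken from the paper.

```python
# Early-exit evaluation of a boosted ensemble of decision stumps.
import numpy as np

def fast_exit_score(x, stumps, alphas):
    """stumps: list of (feature_index, threshold) decision stumps;
    alphas: positive weak-learner weights from boosting."""
    remaining = np.sum(alphas)          # maximum weight still to be added
    score = 0.0
    for (j, t), a in zip(stumps, alphas):
        h = 1.0 if x[j] > t else -1.0   # weak learner output in {-1, +1}
        score += a * h
        remaining -= a
        if abs(score) > remaining:      # sign can no longer change: exit early
            break
    return np.sign(score)

# Tiny usage example with three stumps on a 2-D point.
stumps = [(0, 0.5), (1, -0.2), (0, 1.5)]
alphas = np.array([1.2, 0.7, 0.3])
print(fast_exit_score(np.array([0.9, 0.1]), stumps, alphas))
```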
Abstract:
Advances in the development of computer vision, miniature Micro-Electro-Mechanical Systems (MEMS) and Wireless Sensor Networks (WSN) offer intriguing possibilities that could radically alter the paradigms underlying existing methods of condition assessment and monitoring of ageing civil engineering infrastructure. This paper describes some of the outcomes of the European Science Foundation project "Micro-Measurement and Monitoring System for Ageing Underground Infrastructures (Underground M3)". The main aim of the project was to develop a system that uses a tiered approach to monitor the degree and rate of tunnel deterioration. The system comprises (1) Tier 1: Micro-detection using advances in computer vision and (2) Tier 2: Micro-monitoring and communication using advances in MEMS and WSN. These potentially low-cost technologies will be able to reduce the costs associated with end-of-life structures, which is essential to the viability of rehabilitation, repair and reuse. The paper describes the actual deployment and testing of these innovative monitoring tools in tunnels of London Underground, Prague Metro and Barcelona Metro. © 2012 Taylor & Francis Group.
Abstract:
The visual system must learn to infer the presence of objects and features in the world from the images it encounters, and as such it must, either implicitly or explicitly, model the way these elements interact to create the image. Do the response properties of cells in the mammalian visual system reflect this constraint? To address this question, we constructed a probabilistic model in which the identity and attributes of simple visual elements were represented explicitly and learnt the parameters of this model from unparsed, natural video sequences. After learning, the behaviour and grouping of variables in the probabilistic model corresponded closely to functional and anatomical properties of simple and complex cells in the primary visual cortex (V1). In particular, feature identity variables were activated in a way that resembled the activity of complex cells, while feature attribute variables responded much like simple cells. Furthermore, the grouping of the attributes within the model closely parallelled the reported anatomical grouping of simple cells in cat V1. Thus, this generative model makes explicit an interpretation of complex and simple cells as elements in the segmentation of a visual scene into basic independent features, along with a parametrisation of their moment-by-moment appearances. We speculate that such a segmentation may form the initial stage of a hierarchical system that progressively separates the identity and appearance of more articulated visual elements, culminating in view-invariant object recognition.
Abstract:
The brain extracts useful features from a maelstrom of sensory information, and a fundamental goal of theoretical neuroscience is to work out how it does so. One proposed feature extraction strategy is motivated by the observation that the meaning of sensory data, such as the identity of a moving visual object, is often more persistent than the activation of any single sensory receptor. This notion is embodied in the slow feature analysis (SFA) algorithm, which uses “slowness” as an heuristic by which to extract semantic information from multi-dimensional time-series. Here, we develop a probabilistic interpretation of this algorithm showing that inference and learning in the limiting case of a suitable probabilistic model yield exactly the results of SFA. Similar equivalences have proved useful in interpreting and extending comparable algorithms such as independent component analysis. For SFA, we use the equivalent probabilistic model as a conceptual spring-board, with which to motivate several novel extensions to the algorithm.
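As background for the probabilistic interpretation, the standard linear SFA recipe the paper starts from can be sketched in a few lines of NumPy: whiten the input, then keep the directions in which the whitened signal's temporal derivative has least variance. The toy data and component count are illustrative.

```python
import numpy as np

def linear_sfa(X, n_components=2):
    """X: array of shape (T, D), one multi-dimensional sample per time step."""
    X = X - X.mean(axis=0)
    # Whitening via the covariance eigendecomposition.
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    W_white = evecs / np.sqrt(evals)           # divide each eigenvector by sqrt(eigenvalue)
    Z = X @ W_white
    # Slowness: minimise the variance of the finite-difference derivative.
    dZ = np.diff(Z, axis=0)
    d_evals, d_evecs = np.linalg.eigh(np.cov(dZ, rowvar=False))
    # Eigenvectors with the smallest eigenvalues give the slowest features.
    W_slow = d_evecs[:, :n_components]
    return Z @ W_slow

# Usage: a slow sine mixed with a fast oscillation; the slow feature
# recovers the sine up to sign and scale.
t = np.linspace(0, 10, 2000)
sources = np.column_stack([np.sin(t), np.cos(17 * t)])
X = sources @ np.random.randn(2, 2)            # random full-rank mixing
slow = linear_sfa(X, n_components=1)
```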
Abstract:
Temporal synchronization of multiple video recordings of the same dynamic event is a critical task in many computer vision applications e.g. novel view synthesis and 3D reconstruction. Typically this information is implied, since recordings are made using the same timebase, or time-stamp information is embedded in the video streams. Recordings using consumer grade equipment do not contain this information; hence, there is a need to temporally synchronize signals using the visual information itself. Previous work in this area has either assumed good quality data with relatively simple dynamic content or the availability of precise camera geometry. In this paper, we propose a technique which exploits feature trajectories across views in a novel way, and specifically targets the kind of complex content found in consumer generated sports recordings, without assuming precise knowledge of fundamental matrices or homographies. Our method automatically selects the moving feature points in the two unsynchronized videos whose 2D trajectories can be best related, thereby helping to infer the synchronization index. We evaluate performance using a number of real recordings and show that synchronization can be achieved to within 1 sec, which is better than previous approaches. Copyright 2013 ACM.
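The core idea of relating trajectories across views can be illustrated with a simplified sketch: reduce each video to a one-dimensional motion signal (e.g. mean feature speed per frame, an assumption made here for brevity) and pick the frame offset with the highest normalised correlation. The paper's trajectory selection is more sophisticated than this.

```python
import numpy as np

def estimate_offset(sig_a, sig_b, max_lag):
    """Return the frame lag at which sig_a (shifted) best correlates with sig_b."""
    a = (sig_a - sig_a.mean()) / sig_a.std()
    b = (sig_b - sig_b.mean()) / sig_b.std()
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = a[lag:], b[:len(a) - lag]
        else:
            x, y = a[:lag], b[-lag:]
        n = min(len(x), len(y))
        if n == 0:
            continue
        score = float(np.dot(x[:n], y[:n])) / n   # normalised correlation at this lag
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Usage: the first recording is the second delayed by 12 frames.
rng = np.random.default_rng(0)
s = rng.random(300)
print(estimate_offset(np.roll(s, 12), s, max_lag=30))   # expect 12
```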
Abstract:
To address the sensitivity of the classical shape context algorithm to changes in the relative positions of object joints, an object recognition algorithm based on the local shape fill rate of silhouettes is proposed. Taking different contour control points of the object as circle centres, the algorithm computes, for each radius, the proportion of silhouette pixels among all pixels within the circle; this proportion is the local shape fill rate of that control point. The fill-rate values computed for different control points and different radii form a feature matrix that reflects the statistical properties of the entire silhouette. Experimental results on several databases show that the proposed algorithm describes object detail well, is insensitive to the relative positions of object joints, and is only weakly affected by the number of contour control points.
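A minimal sketch of the fill-rate descriptor follows: for each contour control point and each radius, the fraction of pixels inside the circle that belong to the silhouette is recorded in a feature matrix. The contour extraction, control-point sampling and radii below are illustrative assumptions, not the paper's exact choices.

```python
import cv2
import numpy as np

def fill_rate_matrix(silhouette, n_points=32, radii=(5, 10, 20, 40)):
    """silhouette: 2-D uint8 array, 255 where the object is, 0 elsewhere."""
    contours, _ = cv2.findContours(silhouette, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea).reshape(-1, 2)   # (x, y) points
    # Evenly spaced control points along the outer contour.
    idx = np.linspace(0, len(contour) - 1, n_points).astype(int)
    H, W = silhouette.shape
    yy, xx = np.mgrid[0:H, 0:W]
    fg = silhouette > 0
    F = np.zeros((n_points, len(radii)))
    for i, (cx, cy) in enumerate(contour[idx]):
        dist2 = (yy - cy) ** 2 + (xx - cx) ** 2
        for j, r in enumerate(radii):
            disk = dist2 <= r * r
            F[i, j] = fg[disk].mean()    # fraction of silhouette pixels in the disk
    return F
```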
Abstract:
Watermarking aims to hide particular information in a carrier without changing the visual perception of the carrier itself. Local features are good candidates for addressing the watermark synchronization error caused by geometric distortions and have attracted great attention for content-based image watermarking. This paper presents a novel feature point-based image watermarking scheme robust against geometric distortions. The scale invariant feature transform (SIFT) is first adopted to extract feature points and to generate, for each feature point, a disk that is invariant to translation and scaling. For each disk, orientation alignment is then performed to achieve rotation invariance. Finally, the watermark is embedded in middle-frequency discrete Fourier transform (DFT) coefficients of each disk to improve robustness against common image processing operations. Extensive experimental results and comparisons with representative image watermarking methods confirm the excellent performance of the proposed method in robustness against various geometric distortions as well as common image processing operations.
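The embedding side of such a scheme can be sketched roughly as follows: SIFT keypoints define scale-proportional disks, and watermark bits perturb mid-frequency DFT magnitudes of each disk. Orientation alignment, perceptual masking and the detector are omitted, and all parameter values and file names are illustrative rather than those of the paper.

```python
import cv2
import numpy as np

def embed_in_disk(gray, kp, bits, strength=4.0):
    """Embed a few bits in mid-frequency DFT magnitudes of one keypoint's disk."""
    r = int(3 * kp.size)                       # disk radius from keypoint scale
    x, y = int(kp.pt[0]), int(kp.pt[1])
    if r < 4 or x - r < 0 or y - r < 0 or x + r >= gray.shape[1] or y + r >= gray.shape[0]:
        return gray                            # skip keypoints too close to the border
    patch = gray[y - r:y + r, x - r:x + r].astype(np.float64)
    F = np.fft.fftshift(np.fft.fft2(patch))
    cy, cx = r, r                              # zero frequency after fftshift
    for k, bit in enumerate(bits):
        u, v = cy + r // 2, cx - len(bits) // 2 + k        # a mid-frequency band
        mag = max(np.abs(F[u, v]) + strength * (1 if bit else -1), 0.0)
        F[u, v] = mag * np.exp(1j * np.angle(F[u, v]))
        F[2 * cy - u, 2 * cx - v] = np.conj(F[u, v])       # keep the patch real-valued
    patch = np.real(np.fft.ifft2(np.fft.ifftshift(F)))
    gray[y - r:y + r, x - r:x + r] = np.clip(patch, 0, 255).astype(np.uint8)
    return gray

img = cv2.imread("host.png", cv2.IMREAD_GRAYSCALE)         # hypothetical host image
keypoints = cv2.SIFT_create().detect(img, None)
for kp in sorted(keypoints, key=lambda k: -k.response)[:10]:
    img = embed_in_disk(img, kp, bits=[1, 0, 1, 1, 0])
cv2.imwrite("watermarked.png", img)
```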
Abstract:
Corner detection is widely used and underlies many computer vision tasks. This paper proposes a fast, high-precision corner detection algorithm that is simple and novel, with a distinctive corner condition and corner response function. Unlike previous work, the algorithm is designed around the local geometric features of corners, which greatly reduces the amount of data to be processed while maintaining detection accuracy and other performance measures. A comprehensive comparison with the widely used SUSAN and Harris algorithms in terms of detection rate, missed detections, localisation accuracy, noise robustness and computational complexity shows that the algorithm performs well on both synthetic and natural images.
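For reference, the Harris baseline used in the comparison is available directly in OpenCV; a minimal sketch is shown below. The paper's own geometric corner condition and response function are not reproduced here, and the test image name is hypothetical.

```python
import cv2
import numpy as np

def harris_corners(gray, k=0.04, block=2, ksize=3, thresh_ratio=0.01):
    """Return (x, y) corner locations from the Harris response map."""
    R = cv2.cornerHarris(np.float32(gray), blockSize=block, ksize=ksize, k=k)
    # Keep responses above a fraction of the maximum response.
    ys, xs = np.nonzero(R > thresh_ratio * R.max())
    return list(zip(xs, ys))

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)   # hypothetical test image
corners = harris_corners(img)
```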
Abstract:
As the range of mobile robot applications expands, improving autonomous navigation in dynamic, unstructured environments has become a pressing problem in mobile robotics research. Among the key technologies for autonomous navigation, recognition is both the hardest and the most urgently needed. Vision is an important navigation sensor; compared with other sensors it offers rich information, light weight and low power consumption, so vision-based recognition is widely regarded as the most promising research direction. Supported by a national defence basic research project and a Chinese Academy of Sciences open laboratory fund, and using the wheel-leg hybrid robot and the unmanned aerial vehicle (UAV) developed at the Shenyang Institute of Automation as experimental platforms, this thesis targets the application problems most urgently in need of solution in the autonomous navigation of ground robots and UAVs, aiming to improve the adaptability of mobile robots in dynamic, unstructured environments. The main contributions are as follows.

First, to improve the autonomy of ground mobile robots in complex environments, a stereo-vision obstacle detection algorithm for outdoor unstructured environments is proposed. A method is given for effectively estimating the Main Ground Disparity (MGD) from the V-disparity image; candidate and final obstacles are then identified and localised in a coarse-to-fine manner. The method has been deployed on an autonomous ground platform, and experiments in a variety of scenes confirm its accuracy and speed.

Second, for UAV skyline recognition, an accurate, real-time skyline detection algorithm is proposed and used to estimate attitude angles. An energy-functional model of the skyline is built and the corresponding partial differential equation is derived from the variational principle. For real-time use the model is simplified with a piecewise-linear constraint and the skyline is detected coarse-to-fine: the image is pre-processed and split into vertical strips, a simplified horizontal-line model yields a coarse skyline by fitting, and the skyline is then refined under an open-curve model combining gradient and region terms, from which the UAV roll and pitch angles are estimated.

Third, based on an analysis of the target characteristics of infrared airport runways, a new parallel infrared image segmentation algorithm based on the 1D Haar wavelet is designed; features are extracted from the segmented regions, and two common recognition methods, support vector machines (SVM) and voting, are used to classify candidate target regions. Tests on real video and simulated infrared images confirm the speed, reliability and real-time performance of the algorithm, which processes a frame in 30 ms on average.

Finally, to address automatic crowd monitoring during UAV aerial patrols, the task is simplified to crowd density monitoring from a fixed viewpoint, and a new algorithm based on velocity-field estimation is proposed for counting people crossing a line and estimating crowd density within a region. The crossing crowd is treated as a moving flow field and a motion estimation model is given for effectively estimating a 1D velocity field; by estimating and integrating the crowd's velocity, the people crossing the line are assembled into dynamic regions; area and edge information is then extracted from each dynamic region and regression analysis is used to estimate crowd density. Unlike previous methods based on scene-specific learning, this is an angle-based learning approach, which makes it easier to apply in practice.
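The V-disparity idea behind the first contribution can be sketched simply: accumulate a histogram of disparities per image row, then fit a line through the dominant band, which corresponds to the ground plane; pixels deviating from that line become obstacle candidates. The parameters and the plain least-squares line fit below are illustrative, not the thesis's exact MGD estimator.

```python
import numpy as np

def v_disparity(disparity, max_disp=64):
    """disparity: 2-D array of disparities; returns an (H, max_disp) vote map."""
    H = disparity.shape[0]
    V = np.zeros((H, max_disp), dtype=np.int32)
    for row in range(H):
        d = disparity[row]
        d = d[(d > 0) & (d < max_disp)]
        V[row] = np.bincount(d.astype(int), minlength=max_disp)
    return V

def ground_line(V, min_votes=20):
    """Fit row = a * disparity + b through the strongest bin of each row."""
    rows, disps = [], []
    for row in range(V.shape[0]):
        d = int(np.argmax(V[row]))
        if d > 0 and V[row, d] >= min_votes:
            rows.append(row)
            disps.append(d)
    a, b = np.polyfit(disps, rows, 1)
    return a, b   # the expected ground disparity of row r satisfies r = a * d + b
```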
Abstract:
This paper discusses a modelling method based on stereo vision. Local 3D geometric models of a scene are obtained by observing it from different viewpoints with a stereo vision system; the local models are then fused through spatial feature point matching and coordinate transformation to build a complete description of the scene. The paper focuses on a method for solving the coordinate transformation based on spatial vectors.
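A common way to solve the coordinate transformation from matched spatial feature points is the closed-form SVD (least-squares) solution sketched below; it is given as a stand-in for the spatial-vector method the paper describes, which is not reproduced here.

```python
import numpy as np

def rigid_transform(P, Q):
    """P, Q: (N, 3) arrays of matched points; returns R, t with Q ≈ P @ R.T + t."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)               # cross-covariance of centred points
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                # fix a possible reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cq - R @ cp
    return R, t

# Usage: recover a known rotation/translation from noiseless correspondences.
rng = np.random.default_rng(1)
P = rng.random((10, 3))
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                   [np.sin(angle),  np.cos(angle), 0],
                   [0, 0, 1]])
Q = P @ R_true.T + np.array([0.5, -0.2, 1.0])
R_est, t_est = rigid_transform(P, Q)
```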
Abstract:
Many problems in early vision are ill posed. Edge detection is a typical example. This paper applies regularization techniques to the problem of edge detection. We derive an optimal filter for edge detection with a size controlled by the regularization parameter $\lambda$ and compare it to the Gaussian filter. A formula relating the signal-to-noise ratio to the parameter $\lambda$ is derived from regularization analysis for the case of small values of $\lambda$. We also discuss the method of Generalized Cross Validation for obtaining the optimal filter scale. Finally, we use our framework to explain two perceptual phenomena: coarsely quantized images becoming recognizable by either blurring or adding noise.
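To connect the discussion to practice, a one-dimensional sketch is given below in which the smoothing scale plays the role of the regularization parameter; a derivative-of-Gaussian kernel is used as the comparison filter mentioned in the abstract, not the paper's derived optimal filter.

```python
import numpy as np

def gaussian_derivative_edges(signal, scale, thresh=0.1):
    """1-D edge detection: convolve with the derivative of a Gaussian whose
    standard deviation stands in for the regularization scale."""
    x = np.arange(-4 * scale, 4 * scale + 1)
    g = np.exp(-x**2 / (2 * scale**2))
    dg = -x / scale**2 * g / g.sum()          # derivative-of-Gaussian kernel
    response = np.convolve(signal, dg, mode="same")
    return np.nonzero(np.abs(response) > thresh * np.abs(response).max())[0]

# Usage: a noisy step; a larger scale suppresses noise but blurs localisation.
rng = np.random.default_rng(0)
step = np.concatenate([np.zeros(100), np.ones(100)]) + 0.1 * rng.standard_normal(200)
print(gaussian_derivative_edges(step, scale=3.0))
```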