17 results for Visual Object Recognition

in Chinese Academy of Sciences Institutional Repositories Grid Portal


Relevance:

100.00%

Publisher:

Abstract:

A visual pattern recognition network and its training algorithm are proposed. The network is constructed of a one-layer morphology network and a two-layer modified Hamming net. This visual network can implement invariant pattern recognition with respect to image translation and size projection. After supervised learning takes place, the visual network extracts image features and classifies patterns in much the same way as living beings do. Moreover, we set up its optoelectronic architecture for real-time pattern recognition. (C) 1996 Optical Society of America

Relevance:

100.00%

Publisher:

Abstract:

Whether mice perceive the depth of space depending on the visual size of object targets was explored when visual cues such as perspective and partial occlusion were excluded. A mouse was placed on a platform of adjustable height. The platform was located inside a box in which all walls were dark except the bottom, through which light was projected as the sole visual cue. The visual object cue was composed of 4x4 grids to allow the mouse to estimate the distance of the platform relative to the grids. Three sizes of grids, reduced in a proportion of 2/3, and seven distances at equal intervals between the platform and the grids at the bottom were applied in the experiments. The duration for which a mouse stayed on the platform at each height was recorded while the different grid sizes were presented randomly, to test whether the mouse's judgment of the depth of the platform from the bottom was affected by the size information of the visual target. The results from all conditions of the three object sizes show that the time mice stayed on the platform became longer as the height increased. At distances of 20 to 30 cm, the mice did not use the size information of a target to judge depth, but mainly used binocular disparity information. At distances of less than 20 cm or more than 30 cm, however, and especially at the much greater distances of 50 cm, 60 cm and 70 cm, the mice were able to use the size information to compensate for the lack of binocular disparity information, because only 1/3 of the mouse visual field is binocular. The behavioral paradigm established in the current study is a useful model and can be applied to experiments that use transgenic mice as an animal model to investigate the relationships between behaviors and gene functions.

Relevance:

100.00%

Publisher:

Abstract:

On the basis of the DBF nets proposed by Wang Shoujue, the model and properties of the DBF neural network are discussed in this paper. For its application to pattern recognition, the algorithm and its hardware implementation are presented. We carried out experiments on the recognition of omnidirectionally oriented rigid objects on the same level using direction basis function neural networks, which act by covering the high-dimensional geometrical distribution of the sample set in the feature space. Many animal and vehicle models (even with rather similar shapes) were recognized omnidirectionally thousands of times. For a total of 8800 tests, the correct recognition rate is 98.75%, and the error rate and the rejection rate are 0.5% and 1.25% respectively. (C) 2003 Elsevier Inc. All rights reserved.

Relevance:

100.00%

Publisher:

Abstract:

In this paper, we propose a new scheme for omnidirectional object recognition in free space. The proposed scheme divides the above problem into several omnidirectional object-recognition problems with different depression angles. An omnidirectional object-recognition system with oblique observation directions, based on a new recognition theory, Biomimetic Pattern Recognition (BPR), is discussed in detail. Based on it, we can obtain the size of the training sample set needed by the omnidirectional object-recognition system in free space. Omnidirectional recognition tests were done on various kinds of animal models of rather similar shapes. For the total of 8400 tests, the correct recognition rate is 99.89% and the rejection rate is 0.11%, under the condition of a zero error rate. Experimental results are presented to show that the proposed approach outperforms three types of SVMs with either a third-degree polynomial kernel or a radial basis function kernel.

Relevance:

90.00%

Publisher:

Abstract:

In this paper we introduce a weighted complex network model to investigate and recognize the structures of patterns. The usual approach in pattern recognition models is to describe each pattern as a high-dimensional vector, which is insufficient to express structural information. A number of methods have therefore been developed to extract structural information, such as the feature extraction algorithms used in pre-processing steps or the local receptive fields in convolutional networks. In our model, each pattern is mapped to a weighted complex network whose topology represents the structure of that pattern. From the training samples, we obtain several prototypal complex networks that stand for the general structural characteristics of patterns in different categories. We use these prototypal networks to recognize unknown patterns. This is an attempt to use complex networks in pattern recognition, and our results show their potential for real-world pattern recognition. A spatial parameter is introduced to obtain the optimal recognition accuracy, and it remains constant, insensitive to the number of training samples. We discuss the interesting properties of the prototypal networks. An approximately linear relation is found between the strength and the color of vertices, through which we can compare the structural differences between categories. We visualize these prototypal networks to show that their topology indeed represents the common characteristics of patterns. We also show that the asymmetric strength distribution in these prototypal networks makes recognition highly robust. Our study may also shed light on the mechanism of biological neuronal systems in object recognition.

Relevance:

90.00%

Publisher:

Abstract:

Global information is considered the primitive of visual perception in Gestalt psychology. Further, L. Chen (2005) proposed a new theory of topological visual perception. According to this theory, the perception of topological difference is faster than o

Relevance:

90.00%

Publisher:

Abstract:

According to the research results reported in the past decades, it is well acknowledged that face recognition is not a trivial task. With the development of electronic devices, we are gradually revealing the secret of object recognition in the primate's visual cortex. Therefore, it is time to reconsider face recognition by using biologically inspired features. In this paper, we represent face images by utilizing the C1 units, which correspond to complex cells in the visual cortex and pool over S1 units by using a maximum operation to retain only the maximum response of each local area of S1 units. The new representation is termed C1 Face. Because C1 Face is naturally a third-order tensor (or a three-dimensional array), we propose three-way discriminative locality alignment (TWDLA), an extension of discriminative locality alignment, which is a top-level discriminative manifold learning-based subspace learning algorithm. TWDLA has the following advantages: (1) it takes third-order tensors as input directly, so the structure information can be well preserved; (2) it models the local geometry over every modality of the input tensors, so the spatial relations of input tensors within a class can be preserved; (3) it maximizes the margin between a tensor and tensors from other classes over each modality, so it performs well for recognition tasks; and (4) it has no undersampling problem. Extensive experiments on the YALE and FERET datasets show that (1) the proposed C1 Face representation can better represent face images than raw pixels and (2) TWDLA can duly preserve both the local geometry and the discriminative information over every modality for recognition.
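
The pooling step described in this abstract, in which each C1 unit keeps only the maximum response of a local area of S1 units, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `c1_max_pool`, the non-overlapping window, the window size, and the toy S1 map are all assumptions.

```python
def c1_max_pool(s1_map, pool=2):
    """Max-pool a 2-D S1 response map over non-overlapping pool x pool areas.

    Hypothetical helper: the window size and the lack of overlap are
    illustrative assumptions, not details taken from the paper.
    """
    rows, cols = len(s1_map), len(s1_map[0])
    pooled = []
    for r in range(0, rows - pool + 1, pool):
        pooled_row = []
        for c in range(0, cols - pool + 1, pool):
            # Collect every S1 response inside the current local area.
            window = [s1_map[r + dr][c + dc]
                      for dr in range(pool) for dc in range(pool)]
            # Keep only the maximum response of that area.
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Toy 4x4 S1 response map (made-up values).
s1 = [
    [0.1, 0.4, 0.2, 0.0],
    [0.3, 0.2, 0.9, 0.1],
    [0.5, 0.1, 0.0, 0.2],
    [0.0, 0.6, 0.3, 0.7],
]
c1 = c1_max_pool(s1, pool=2)
print(c1)  # [[0.4, 0.9], [0.6, 0.7]]
```

Stacking several such pooled maps (for example, one per filter response) would give a three-dimensional array of the kind the abstract calls a third-order tensor; how the paper actually organizes the third mode is not stated here.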

Relevance:

90.00%

Publisher:

Abstract:

Eye detection plays an important role in many practical applications. This paper presents a novel two-step scheme for eye detection. The first step models an eye by a newly defined visual-context pattern (VCP), and the second step applies semisupervised boosting for precise detection. A VCP describes both the spatial and the appearance relations between an eye region (region of eye) and a reference region (region of reference). The context feature of a VCP is extracted by using the integral image. To reduce human labeling effort, we apply semisupervised boosting, which integrates the context feature and Haar-like features for precise eye detection. Experimental results on several standard face data sets demonstrate that the proposed approach is effective, robust, and efficient. Finally, we show that this approach is ready for practical applications.
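
The integral image mentioned in this abstract is a standard summed-area table: once built, the sum over any rectangular region costs four lookups. The sketch below shows only that generic technique; the function names and the toy image are assumptions, and the VCP context feature itself is not reproduced because the abstract does not define it.

```python
def integral_image(img):
    """Return the summed-area table of img, padded with a zero row and column."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            # Sum of everything above plus the running sum of this row.
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def region_sum(ii, top, left, bottom, right):
    """Sum of img rows top..bottom-1 and columns left..right-1 in four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Toy 3x3 image (made-up values).
img = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(img)
print(region_sum(ii, 1, 1, 3, 3))  # 5 + 6 + 8 + 9 = 28
```

Haar-like features of the kind the abstract combines with the context feature are typically differences of such region sums, which is why the integral image makes them cheap to evaluate.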

Relevance:

90.00%

Publisher:

Abstract:

Crowding, generally defined as the deleterious influence of nearby contours on visual discrimination, is ubiquitous in spatial vision. Specifically, long-range effects of non-overlapping distractors can alter the appearance of an object, making it unrecognizable. Theories in many domains, including vision computation and high-level attention, have been proposed to account for crowding. However, neither the compulsory averaging model nor the insufficient spatial resolution of attention provides an adequate explanation for crowding. The present study examined the effects of perceptual organization on crowding. We hypothesize that target-distractor segmentation in crowding is analogous to figure-ground segregation in Gestalt. When distractors can be grouped as a whole, or when they are similar to each other but different from the target, the target can be distinguished from the distractors. However, grouping target and distractors together by Gestalt principles may interfere with target-distractor separation. Six experiments were carried out to assess our theory. In experiments 1, 2, and 3, we manipulated the similarity between target and distractor as well as the configuration of distractors to investigate the effects of stimulus-driven grouping on target-distractor segmentation. In experiments 4, 5, and 6, we focused on the interaction between bottom-up and top-down processes of grouping, and their influences on target-distractor segmentation. Our results demonstrated that: (a) when distractors were similar to each other but different from the target, crowding was eased; (b) when distractors formed a subjective contour or were placed regularly, crowding was also reduced; (c) both bottom-up and top-down processes could influence target-distractor grouping, mediating the effects of crowding. These results support our hypothesis that figure-ground segregation and target-distractor segmentation in crowding may share similar processes.
The present study not only provides a novel explanation for crowding, but also examines the processing bottleneck in object recognition. These findings have significant implications for computer vision and interface design as well as for clinical practice in amblyopia and dyslexia.

Relevance:

80.00%

Publisher:

Abstract:

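Object recognition technology is widely used in many real-world fields, but it still faces major challenges owing to factors such as occlusion and viewpoint change. Local features have attracted attention because of their inherent locality. Combined with spatial distribution constraints, local features can carry high-level semantic information and improve the robustness of object recognition algorithms to occlusion and viewpoint change. This work analyzes and compares currently popular local feature detection methods, description methods, and spatial distribution constraint methods, and proposes a "center-feature" structural model together with a corresponding object recognition method. First, local feature detection methods are introduced and local feature description methods are studied in depth, with a comparative analysis in terms of principle, invariance, matching speed, and applicable conditions. Weighing the advantages and disadvantages of explicit and implicit models, a "center-feature" structural model is proposed. The model uses the object center as the reference point for measuring the positional relationships among all local features; it retains the accuracy of models such as the star model while removing the special node, which avoids the adverse effects of a missing special node and improves the stability of the algorithm. Based on this spatial distribution constraint model, a corresponding object recognition algorithm is proposed. The algorithm considers the degree of matching of both appearance features and spatial positions. A spatial distribution constraint model is constructed from the appearance features and shape of the object in the template; hypotheses are formed from the appearance feature information of the object to be detected; hypothesis testing is used to quantify the location and likelihood of the object's presence; and an accelerated algorithm for searching for the object center position is proposed. Experiments verify the effectiveness of the algorithm under similarity and affine transformations and show that it has some tolerance to missing features.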

Relevance:

80.00%

Publisher:

Abstract:

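Illumination is one of the key factors affecting imaging. When illumination conditions change, the differences between images of the same object can be very large, sometimes even larger than the differences between images of different objects. In many object recognition scenarios, illumination is often outside human control, which makes object recognition under varying illumination a common and challenging problem. This work analyzes in depth how changes in illumination properties such as intensity, direction, and color affect object imaging; it studies currently popular illumination-robust object recognition methods, introducing their algorithmic principles and analyzing why they are robust to illumination and under what conditions they apply. An object recognition method based on image frequency-domain features under low illumination is proposed. By analyzing the relationship between affine transformations in the spatial and frequency domains, the method extracts features by pseudo-logarithmic sampling of the Fourier spectrum of the gradient image, which extracts low- and mid-frequency features well, suppresses high-frequency noise, and avoids the adverse effects of illumination change; a neural network is used for recognition, effectively extracting affine-invariant features of the object with fast recognition speed. An illumination-robust nonlinear correlation object recognition method is also proposed. The method adopts an information decomposition strategy, decomposing the gray-level information into two descriptive components, one describing the regions where change occurs and one describing the degree of change within those regions, and selects the more discriminative pixels to take part in matching. Using the angle between vectors as the similarity measure, it works directly on the gray-level information of the image and considers image similarity in a high-dimensional vector space, overcoming the difficulty of extracting features such as edges, corners, and shapes from low-illumination, low signal-to-noise-ratio images. This similarity measure is unaffected by vector magnitude (multiplicative illumination change) and by vector translation (additive illumination change), and is therefore invariant to linear illumination change.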
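
The invariance claimed at the end of this abstract can be checked with a small numerical sketch. It assumes that insensitivity to additive change is obtained by subtracting each vector's mean before measuring the angle; the abstract does not say how the thesis achieves it, so `angle_similarity`, the mean-centering step, and the pixel values below are illustrative assumptions only.

```python
import math


def angle_similarity(a, b):
    """Cosine of the angle between the mean-centred gray-level vectors a and b.

    Mean-centring is an assumed way of removing an additive offset; scaling
    by a positive gain does not change the angle between vectors.
    Constant vectors (zero variance) are not handled in this sketch.
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    ca = [x - mean_a for x in a]
    cb = [x - mean_b for x in b]
    dot = sum(x * y for x, y in zip(ca, cb))
    norm = math.sqrt(sum(x * x for x in ca)) * math.sqrt(sum(y * y for y in cb))
    return dot / norm


# Made-up gray levels for a template patch.
template = [10.0, 40.0, 20.0, 80.0]
# Same patch under a linear illumination change: gain 2.5, offset 30.
relit = [2.5 * v + 30.0 for v in template]
# A different patch.
other = [80.0, 10.0, 60.0, 20.0]

print(angle_similarity(template, relit))
print(angle_similarity(template, other))
```

The first value is 1 up to floating-point rounding, since a positive gain and an offset leave the centred direction unchanged, while the second is negative for this unrelated patch.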

Relevance:

80.00%

Publisher:

Abstract:

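To address the sensitivity of the classical shape context algorithm to changes in the relative positions of an object's joints, an object recognition algorithm based on the local shape fill rate of silhouettes is proposed. Taking different contour control points of the object as circle centers, the algorithm computes, for different radii, the proportion of the total pixels that are silhouette pixels; this proportion is the local shape fill rate of the control point. The fill-rate values computed for different control points and different radii form a feature matrix, which reflects the statistical properties of the object's whole silhouette. Experimental results on different databases show that the proposed algorithm describes object detail well, is insensitive to the relative positions of object joints, and is only slightly affected by the number of silhouette contour control points.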
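
The feature described in this abstract (for each contour control point and each radius, the share of pixels inside the circle that belong to the silhouette) can be sketched as below. The function names, the choice to count disk pixels that fall outside the image as background, and the toy mask are assumptions made for illustration; the paper's exact conventions are not given here.

```python
def fill_rate(mask, cy, cx, radius):
    """Fraction of pixels in the disk centred at (cy, cx) that are silhouette.

    mask is a 2-D list of 0/1 values. Disk pixels outside the image are
    counted as background (an assumption of this sketch).
    """
    filled = 0
    total = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx * dx + dy * dy > radius * radius:
                continue  # outside the circle
            total += 1
            y, x = cy + dy, cx + dx
            if 0 <= y < len(mask) and 0 <= x < len(mask[0]) and mask[y][x]:
                filled += 1
    return filled / total


def fill_rate_matrix(mask, control_points, radii):
    """One row per control point, one column per radius."""
    return [[fill_rate(mask, cy, cx, r) for r in radii]
            for (cy, cx) in control_points]


# Toy silhouette: a 3x3 filled block inside a 5x5 image.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
# Hypothetical control points: a corner of the block and its centre.
features = fill_rate_matrix(mask, [(1, 1), (2, 2)], radii=[1, 2])
print(features)
```

A corner point yields lower fill rates than an interior point at the same radius, which is the kind of local shape statistic the feature matrix collects.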

Relevance:

80.00%

Publisher:

Abstract:

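As the range of applications of mobile robots expands, improving their autonomous navigation capability in dynamic, unstructured environments has become an urgent problem in mobile robotics research. Among the key technologies for autonomous robot navigation, recognition is the hardest to solve and the most urgently needed. As an important navigation sensor, vision offers advantages over other sensors, including rich information, light weight, and low power consumption, so vision-based recognition is widely regarded as the most promising research direction. Supported by a national defense basic research project and a Chinese Academy of Sciences open laboratory fund project, and using the "wheel-leg hybrid robot" and the "unmanned aerial vehicle (UAV)" developed in-house at the Shenyang Institute of Automation as experimental platforms, this thesis studies application problems that urgently need solving in the autonomous navigation of ground robots and UAVs, with the aim of improving the adaptability of mobile robots in dynamic, unstructured environments. The main contents of the thesis are as follows. First, to improve the autonomy of ground mobile robots in complex environments, a stereo-vision-based obstacle detection algorithm for outdoor unstructured environments is proposed. A method is first given for effectively estimating the Main Ground Disparity (MGD) from the V-disparity image. A coarse-to-fine, step-by-step decision procedure is then used to identify suspected obstacles and final obstacles and to localize them. The method has been applied in practice on a ground autonomous mobile platform, and experiments in various scenes verify its accuracy and speed. Second, with UAV skyline recognition as the background, an accurate, real-time skyline recognition algorithm is proposed, from which the attitude angles are estimated. An energy functional model is built for the skyline, and the corresponding partial differential equation is derived using the variational principle. For real-time operation in practice, a piecewise-linear constraint is introduced to simplify the model, and the skyline is then recognized in a coarse-to-fine manner. Specifically, the image is first preprocessed and divided vertically; a simplified horizontal-line model is then used for coarse recognition of the skyline, with the coarse result obtained by fitting; finally, the skyline is recognized precisely under the constraint of an open-curve model that combines gradient and region information, and the UAV's roll and pitch angles are estimated from it. Third, by analyzing the target characteristics of airport runways in infrared images, a new parallel infrared image segmentation algorithm based on the 1D Haar wavelet is designed; features are then extracted from the segmented regions in a targeted way; finally, two common recognition methods, the support vector machine (SVM) and voting, are used to classify and recognize the suspected target regions. Tests on real video and simulated infrared images verify the speed, reliability, and real-time performance of the algorithm, whose average processing time is 30 ms per frame. Finally, to address the problems encountered in automatic crowd monitoring during UAV aerial patrol, the problem is simplified to pedestrian flow density monitoring from a fixed viewpoint, and a new algorithm based on velocity field estimation is proposed for counting people crossing a line and estimating crowd density within a region. The algorithm first treats the flow of people crossing the line as a moving flow field and gives a motion estimation model that effectively estimates the 1D velocity field; then, by estimating and integrating the velocity of the dynamic flow, it stitches the line-crossing flow into dynamic regions; finally, area and edge information is extracted from each dynamic region, and regression analysis is used to estimate crowd density. Unlike previous methods based on scene learning, this method learns on the basis of viewing angle, which makes it convenient for practical application.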
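
The V-disparity image used in the first contribution of this thesis is, in its common form, a per-row histogram of a disparity map: row v of the V-disparity image counts how often each disparity value occurs in image row v. The sketch below builds that histogram and reads off the most frequent disparity per row. It assumes integer disparities, and the per-row mode is only a naive stand-in; the thesis's actual MGD estimation method is not described in the abstract. All names and values are hypothetical.

```python
def v_disparity(disparity_map, max_disparity):
    """Row v of the result is a histogram of the disparities in image row v."""
    v_disp = []
    for row in disparity_map:
        hist = [0] * (max_disparity + 1)
        for d in row:
            if 0 <= d <= max_disparity:  # skip invalid disparities
                hist[d] += 1
        v_disp.append(hist)
    return v_disp


def dominant_disparity_per_row(v_disp):
    """Most frequent disparity in each row (naive stand-in for ground disparity)."""
    return [hist.index(max(hist)) for hist in v_disp]


# Toy disparity map: ground disparity grows towards the bottom of the image,
# while a two-pixel-wide obstacle keeps a constant disparity of 4.
dmap = [
    [1, 1, 1, 1, 1],
    [2, 2, 4, 4, 2],
    [3, 3, 4, 4, 3],
    [4, 4, 4, 4, 4],
]
vd = v_disparity(dmap, max_disparity=5)
print(dominant_disparity_per_row(vd))  # [1, 2, 3, 4]
```

In this toy map the ground traces a slanted line in the V-disparity image (disparity increasing with row index), whereas the obstacle shows up as a vertical run of counts at disparity 4.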

Relevance:

40.00%

Publisher:

Abstract:

The color change induced by triple hydrogen-bonding recognition between melamine and a cyanuric acid derivative grafted on the surface of gold nanoparticles can be used for reliable detection of melamine. Since such a color change can be readily seen by the naked eye, the method enables on-site and real-time detection of melamine in raw milk and infant formula even at a concentration as low as 2.5 ppb without the aid of any advanced instruments.

Relevance:

30.00%

Publisher:

Abstract:

The distinction between the object appearance and the background is a useful cue for visual tracking, in which discriminant analysis is widely applied. However, due to the diversity of background observations, there are not adequate negative samples from the background, which usually leads the discriminant method to tracking failure. Thus, a natural solution is to construct an object-background pair constrained by the spatial structure, which could not only reduce the number of negative samples but also make full use of the background information surrounding the object. However, this idea is threatened by variation in both the object appearance and the spatially constrained background observation, especially when the background shifts as the object moves. Thus, an incremental pairwise discriminant subspace is constructed in this paper to delineate the variation of this distinction. In order to maintain the ability to describe the subspace correctly, we enforce two novel constraints for the optimal adaptation: (1) a pairwise data discriminant constraint and (2) subspace smoothness. The experimental results demonstrate that the proposed approach can alleviate adaptation drift and achieve better visual tracking results for a large variety of nonstationary scenes. (C) 2010 Elsevier B.V. All rights reserved.