971 results for Appearance-based localisation


Relevance:

80.00%

Publisher:

Abstract:

This paper presents a mapping and navigation system for a mobile robot that uses vision as its sole sensor modality. The system enables the robot to navigate autonomously, plan paths and avoid obstacles using a vision-based topometric map of its environment. The map consists of a globally consistent pose graph with a local 3D point cloud attached to each of its nodes. These point clouds are used for direction-independent loop closure and to dynamically generate 2D metric maps for locally optimal path planning. Using this locally semi-continuous metric space, the robot performs shortest-path planning instead of following the nodes of the graph, as is done with most other vision-only navigation approaches. The system exploits the local accuracy of visual odometry in creating local metric maps, and uses pose-graph SLAM, visual appearance-based place recognition and point cloud registration to create the topometric map. The ability of the framework to sustain vision-only navigation is validated experimentally, and the system is provided as open-source software.
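As a rough illustration of shortest-path planning on a local 2D metric map, the sketch below runs a breadth-first search over a small occupancy grid. The grid encoding (1 for obstacle, 0 for free), the 4-connected neighbourhood and the function name `shortest_path` are assumptions made for this example; the abstract does not say which planner or map representation the system actually uses.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected occupancy grid.

    grid[r][c] == 1 marks an obstacle, 0 marks free space (assumed encoding).
    Returns the list of (row, col) cells from start to goal, or None.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # goal is unreachable
```

Because every move has the same cost here, breadth-first search returns a path with the fewest cells; a weighted planner such as Dijkstra or A* would be the natural replacement if cell costs differed.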

Relevance:

80.00%

Publisher:

Abstract:

In this paper, we explore the effectiveness of patch-based gradient feature extraction methods when applied to appearance-based gait recognition. Extending existing popular feature extraction methods such as HOG and LDP, we propose a novel technique which we term the Histogram of Weighted Local Directions (HWLD). These three methods are applied to gait recognition using the GEI feature, with classification performed using SRC. Evaluations on the CASIA and OULP datasets show significant improvements using these patch-based methods over existing implementations, with the proposed method achieving the highest recognition rate on each dataset. In addition, the HWLD can easily be extended to 3D, which we demonstrate using the GEV feature on the DGD dataset, observing improvements in performance.

Relevance:

80.00%

Publisher:

Abstract:

After first observing a person, the task of person re-identification involves recognising that individual at different locations across a network of cameras at a later time. Traditionally, this task has been performed by first extracting appearance features of an individual and then matching these features to the previous observation. However, identifying an individual based solely on appearance can be ambiguous, particularly when people wear similar clothing (e.g. people dressed in uniforms in sporting and school settings). The task is made more difficult when the resolution of the input image is low, as is typically the case in multi-camera networks. To circumvent these issues, we need to use other contextual cues. In this paper, we use "group" information as our contextual feature to aid in the re-identification of a person, motivated by the fact that people generally move together as a collective group. To encode group context, we learn a linear mapping function to assign each person to a "role" or position within the group structure. We then combine the appearance and group context cues using a weighted summation. We demonstrate how this improves the performance of person re-identification in a sports environment over appearance-based features alone.
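The weighted summation of the two cues can be sketched as follows. This is only a minimal illustration: the function names, the convention that higher scores mean better matches, and the default weight of 0.5 are assumptions for the example, not values taken from the paper.

```python
def fuse_scores(appearance, group, weight=0.5):
    """Weighted sum of per-candidate appearance and group-context scores.

    `weight` is the share given to appearance (hypothetical default);
    the remainder goes to the group-context cue.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(appearance) != len(group):
        raise ValueError("score lists must have the same length")
    return [weight * a + (1.0 - weight) * g for a, g in zip(appearance, group)]


def best_match(appearance, group, weight=0.5):
    """Index of the gallery candidate with the highest fused score."""
    fused = fuse_scores(appearance, group, weight)
    return max(range(len(fused)), key=fused.__getitem__)
```

With two candidates in near-identical uniforms, appearance scores such as `[0.80, 0.82]` are almost tied, so a group-context score of `[0.9, 0.1]` decides the match in favour of the first candidate once the cues are fused.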

Relevance:

80.00%

Publisher:

Abstract:

Person re-identification is particularly challenging due to significant appearance changes across separate camera views. In order to re-identify people, a representative human signature should effectively handle differences in illumination, pose and camera parameters. While general appearance-based methods are modelled in Euclidean spaces, it has been argued that some applications in image and video analysis are better modelled via non-Euclidean manifold geometry. To this end, recent approaches represent images as covariance matrices, and interpret such matrices as points on Riemannian manifolds. As direct classification on such manifolds can be difficult, in this paper we propose to represent each manifold point as a vector of similarities to class representers, via a recently introduced form of Bregman matrix divergence known as the Stein divergence. This is followed by using a discriminative mapping of similarity vectors for final classification. The use of similarity vectors is in contrast to the traditional approach of embedding manifolds into tangent spaces, which can suffer from representing the manifold structure inaccurately. Comparative evaluations on benchmark ETHZ and iLIDS datasets for the person re-identification task show that the proposed approach obtains better performance than recent techniques such as Histogram Plus Epitome, Partial Least Squares, and Symmetry-Driven Accumulation of Local Features.
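For reference, the Stein divergence between two symmetric positive-definite matrices X and Y is commonly defined as S(X, Y) = log det((X + Y)/2) - (1/2) log det(XY). The sketch below computes it in plain Python and builds a vector of divergences to a set of class representers. The helper names are hypothetical, and the abstract does not state how divergences are converted into similarities, so the raw divergences (lower means more similar) are returned here.

```python
import math


def logdet_spd(m):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    # det(M) is the squared product of the Cholesky diagonal.
    return 2.0 * sum(math.log(low[i][i]) for i in range(n))


def stein_divergence(x, y):
    """S(X, Y) = log det((X + Y) / 2) - 0.5 * (log det X + log det Y)."""
    n = len(x)
    mid = [[(x[i][j] + y[i][j]) / 2.0 for j in range(n)] for i in range(n)]
    return logdet_spd(mid) - 0.5 * (logdet_spd(x) + logdet_spd(y))


def divergence_vector(x, representers):
    """Divergence of x to each class representer (hypothetical helper)."""
    return [stein_divergence(x, r) for r in representers]
```

The divergence is symmetric and is zero when both arguments are the same matrix, which makes the resulting vector a convenient fixed-length description of a covariance matrix for a standard classifier.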

Relevance:

80.00%

Publisher:

Abstract:

Facial expression recognition (FER) has developed dramatically in recent years, thanks to advances in related fields, especially machine learning, image processing and human recognition. Accordingly, the impact and potential usage of automatic FER have been growing in a wide range of applications, including human-computer interaction, robot control and driver state surveillance. However, to date, robust recognition of facial expressions from images and videos is still a challenging task due to the difficulty of accurately extracting the useful emotional features. These features are often represented in different forms, such as static, dynamic, point-based geometric or region-based appearance. Facial movement features, which include feature position and shape changes, are generally caused by the movements of facial elements and muscles during the course of emotional expression. The facial elements, especially key elements, constantly change their positions when subjects are expressing emotions. As a consequence, the same feature in different images usually has different positions. In some cases, the shape of the feature may also be distorted due to subtle facial muscle movements. Therefore, for any feature representing a certain emotion, the geometric-based position and appearance-based shape normally change from one image to another in image databases, as well as in videos. These movement features represent a rich pool of both static and dynamic characteristics of expressions, which play a critical role in FER. The vast majority of past work on FER does not take the dynamics of facial expressions into account. Some efforts have been made to capture and utilise facial movement features, and almost all of them are static-based. These efforts try to adopt either geometric features of the tracked facial points, appearance differences between holistic facial regions in consecutive frames, or texture and motion changes in local facial regions. Although they have achieved promising results, these approaches often require accurate location and tracking of facial points, which remains problematic.

Relevance:

80.00%

Publisher:

Abstract:

The location of previously unseen and unregistered individuals in complex camera networks from semantic descriptions is a time-consuming and often inaccurate process carried out by human operators or security staff on the ground. To promote the development and evaluation of automated semantic-description-based localisation systems, we present a new, publicly available, unconstrained 110-sequence database, collected from 6 stationary cameras. Each sequence contains detailed semantic information for a single search subject who appears in the clip (gender, age, height, build, hair and skin colour, clothing type, texture and colour), and between 21 and 290 frames of each clip are annotated with the target subject location (over 11,000 frames are annotated in total). A novel approach for localising a person given a semantic query is also proposed and demonstrated on this database. The proposed approach incorporates clothing colour and type (for clothing worn below the waist), as well as height and build, to detect people. A method to assess the quality of candidate regions, as well as a symmetry-driven approach to aid in modelling clothing on the lower half of the body, is proposed within this approach. An evaluation on the proposed dataset shows that a relative improvement in localisation accuracy of up to 21% is achieved over the baseline technique.

Relevance:

80.00%

Publisher:

Abstract:

This paper presents an online, unsupervised training algorithm enabling vision-based place recognition across a wide range of changing environmental conditions such as those caused by weather, seasons, and day-night cycles. The technique applies principal component analysis to distinguish between aspects of a location’s appearance that are condition-dependent and those that are condition-invariant. Removing the dimensions associated with environmental conditions produces condition-invariant images that can be used by appearance-based place recognition methods. This approach has a unique benefit – it requires training images from only one type of environmental condition, unlike existing data-driven methods that require training images with labelled frame correspondences from two or more environmental conditions. The method is applied to two benchmark variable condition datasets. Performance is equivalent or superior to the current state of the art despite the lesser training requirements, and is demonstrated to generalise to previously unseen locations.
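The idea of removing condition-dependent dimensions with principal component analysis can be sketched on toy data. The example below assumes, purely for illustration, that the single leading principal component of the training images captures the environmental condition (here, global brightness); the abstract does not say how many or which components the method removes. Images are flattened to lists of pixel values, and all function names are hypothetical.

```python
import math


def mean_vector(rows):
    """Per-pixel mean over a list of flattened images."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def first_component(centred, iterations=200):
    """Leading principal direction by power iteration on X^T X.

    The start vector is an arbitrary non-uniform choice; power iteration
    fails if it happens to be orthogonal to the leading direction.
    """
    dim = len(centred[0])
    v = [float(j + 1) for j in range(dim)]
    for _ in range(iterations):
        proj = [sum(a * b for a, b in zip(row, v)) for row in centred]
        w = [sum(p * row[j] for p, row in zip(proj, centred)) for j in range(dim)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return v


def fit_condition_direction(training):
    """Mean image and the direction assumed to encode the condition."""
    mean = mean_vector(training)
    centred = [[a - m for a, m in zip(row, mean)] for row in training]
    return mean, first_component(centred)


def remove_component(image, mean, component):
    """Centre the image and project out the condition direction."""
    centred = [a - m for a, m in zip(image, mean)]
    coeff = sum(a * b for a, b in zip(centred, component))
    return [a - coeff * c for a, c in zip(centred, component)]
```

If the training images differ only in overall brightness, the fitted direction is the uniform vector, so a dark and a bright view of the same place map to the same output after `remove_component`. Real imagery would need more components and a more robust eigen-solver than this sketch provides.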

Relevance:

80.00%

Publisher:

Abstract:

We propose an algorithm for semantic segmentation based on 3D point clouds derived from ego-motion. We motivate five simple cues designed to model specific patterns of motion and 3D world structure that vary with object category. We introduce features that project the 3D cues back to the 2D image plane while modeling spatial layout and context. A randomized decision forest combines many such features to achieve a coherent 2D segmentation and recognize the object categories present. Our main contribution is to show how semantic segmentation is possible based solely on motion-derived 3D world structure. Our method works well on sparse, noisy point clouds, and unlike existing approaches, does not need appearance-based descriptors. Experiments were performed on a challenging new video database containing sequences filmed from a moving car in daylight and at dusk. The results confirm that indeed, accurate segmentation and recognition are possible using only motion and 3D world structure. Further, we show that the motion-derived information complements an existing state-of-the-art appearance-based method, improving both qualitative and quantitative performance. © 2008 Springer Berlin Heidelberg.

Relevance:

80.00%

Publisher:

Abstract:

Due to its importance, video segmentation has regained interest recently. However, there is no common agreement about the necessary ingredients for best performance. This work contributes a thorough analysis of various within- and between-frame affinities suitable for video segmentation. Our results show that a frame-based superpixel segmentation combined with a few motion and appearance-based affinities is sufficient to obtain good video segmentation performance. A second contribution of the paper is the extension of [1] to include motion cues, which makes the algorithm globally aware of motion, thus improving its performance for video sequences. Finally, we contribute an extension of an established image segmentation benchmark [1] to videos, allowing coarse-to-fine video segmentations and multiple human annotations. Our results are tested on BMDS [2], and compared to existing methods. © 2013 Springer-Verlag.

Relevance:

80.00%

Publisher:

Abstract:

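This paper designs and implements an appearance quality inspection system for cigarette packs based on the TMS320DM642, and describes in detail the system's hardware architecture, software flow, inspection algorithm and the optimisations made for the DSP processor. The hardware platform is built around the TMS320DM642 processor. A camera captures images of the cigarette packs, an improved template matching algorithm inspects the quality of the current image, and the inspection result is displayed on a monitor and sent to the actuator unit for processing. Experimental results show that the TMS320DM642-based cigarette pack inspection system is fast, accurate and effective, and has broad application prospects.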
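The template-matching inspection step can be illustrated with a plain sliding-window search. This sketch uses the sum of absolute differences (SAD) as the match score, which is a standard baseline and not the improved algorithm the abstract refers to; the function names and the defect threshold are hypothetical.

```python
def match_template(image, template):
    """Find the window with the lowest sum of absolute differences (SAD).

    Both arguments are 2D lists of grey levels. Returns (row, col, score)
    for the best-matching top-left corner.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    if th > ih or tw > iw:
        raise ValueError("template is larger than the image")
    best = None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = sum(
                abs(image[r + i][c + j] - template[i][j])
                for i in range(th)
                for j in range(tw)
            )
            if best is None or score < best[2]:
                best = (r, c, score)
    return best


def has_defect(image, template, threshold):
    """Flag the pack when even the best match differs too much (assumed rule)."""
    _, _, score = match_template(image, template)
    return score > threshold
```

On an embedded DSP the inner loops would normally be restricted to a small search region and vectorised; this version favours readability over speed.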

Relevance:

80.00%

Publisher:

Abstract:

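An online real-time inspection system for cigarette packs is a piece of product-packaging inspection equipment for the tobacco industry, and it has broad application prospects. In modern production, line speeds keep increasing and the demands on product quality keep rising. In cigarette manufacturing, tobacco companies have already automated every stage from processing leaf into cut tobacco through to rolling, filter attachment and packing. The appearance quality of a cigarette pack reflects the factory's level of technical equipment and bears on the company's image and reputation; in addition, defective packs returned from the market to the company increase its costs. With the development of computer software and hardware and the maturing of machine vision theory, machine vision methods have begun to be applied to inspecting the appearance quality of cigarette packs. Machine vision has inherent advantages for inspection: high inspection speed, high resolving power, a high degree of standardisation and good repeatability. Using modern machine vision technology to inspect the appearance quality of cigarette packs can greatly reduce the workload of inspection staff, improve product quality, reduce the factory's labour and management costs, and improve the company's image. This work analyses many methods used in China and abroad for inspecting cigarette pack quality and designs a DSP-based appearance quality inspection system that can inspect cigarette packs in real time, so that packs with packaging defects are rejected as they occur. From a machine vision perspective, this work describes the working principle, overall structure and workflow of the visual inspection system; compares hardware characteristics to select the light source, sensor, camera, lens and DSP chip; and focuses on the image processing algorithms, describing in detail the image registration, template matching and defect recognition used in the system, and giving methods for optimising the program on the DSP. A series of experiments and tests on the designed system shows that it is fast, gives good overall inspection results and is stable, and that it can meet the requirements for real-time inspection of cigarette packs.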

Relevance:

80.00%

Publisher:

Abstract:

Doctoral thesis, Informatics (Informatics Engineering), Universidade de Lisboa, Faculdade de Ciências, 2015

Relevance:

80.00%

Publisher:

Abstract:

The media tends to represent female athletes as women first and athletes second (Koivula, 1999). The present study investigated whether this same trend was present for female sportscasters, using a self-presentational framework. Self-presentation is the process by which people try to control how others see them (Leary, 1995). One factor that may influence the type of image they try to project is the roles they hold in society, including gender roles. The gender roles for a man include dominance, assertiveness, and masculinity, while the gender roles for a woman include nurturer, femininity, and attractiveness (Deaux & Major, 1987). By contrast, sports broadcasters are expected to be knowledgeable, assertive, and competent. Research suggests that female sports broadcasters are seen as less competent and less persuasive than male sports broadcasters (Mitrook & Dorr, 2001; Ordman & Zillmann, 1994; Toro, 2005). One reason for this difference may be that the gender roles for a man are much more similar to those of a sportscaster than the gender roles for a woman are. Thus, there may be a conflict between the two roles for women. The present study investigated whether the gender and perceived attractiveness of sportscasters influenced the audience's perceptions of the level of competence that a sportscaster demonstrates. Two hundred and four male (n = 75) and female (n = 129) undergraduate students were recruited from a southern Ontario university to participate in the study. The average age of the male participants was 21.23 years (SD = 1.60), and the average age of the female participants was 20.67 years (SD = 1.31). The age range for all participants was from 19 to 30 years (M = 20.87 years, SD = 1.45). After providing informed consent, participants randomly received one of four possible questionnaire packages. The participants answered the demographic questionnaire, and then proceeded to view the picture and read the script of a sports newscast. Next, based on the picture and script, the participants answered the competence questionnaire, assessing the general, sport-specific, and overall competence of the sportscaster. Once participants had finished, they returned the package to the researcher and were thanked for their time. Data were analyzed using an ANOVA to determine whether general sport competence differs with respect to the gender and attractiveness of the sportscaster. Overall, the ANOVA was non-significant (p > .05), indicating no differences on the dependent variable based on gender (F(3, 194) = .631, p = .426), attractiveness (F(3, 194) = .070, p = .791), or the interaction of the two (F(3, 194) = .043, p = .836). Although none of the study hypotheses were supported, the study provided some insight into the perceived competence of female sportscasters. It is possible that female sportscasters are now seen as competent in the area of sports. Sample characteristics could also have influenced these results; the participants in the current study were primarily physical education and kinesiology students, who had experience participating in physical activity with both men and women. Future research should investigate this issue further by using a video sportscast. It is possible that delivery characteristics such as voice quality or eye contact may also affect perceptions of sportscasters.

Relevance:

80.00%

Publisher:

Abstract:

A visual SLAM system has been implemented and optimised for real-time deployment on an AUV equipped with calibrated stereo cameras. The system incorporates a novel approach to landmark description in which landmarks are local submaps that consist of a cloud of 3D points and their associated SIFT/SURF descriptors. Landmarks are also sparsely distributed, which simplifies and accelerates data association and map updates. In addition to landmark-based localisation, the system utilises visual odometry to estimate the pose of the vehicle in 6 degrees of freedom by identifying temporal matches between consecutive local submaps and computing the motion. Both the extended Kalman filter and the unscented Kalman filter have been considered for filtering the observations. The output of the filter is also smoothed using the Rauch-Tung-Striebel (RTS) method to obtain a better alignment of the sequence of local submaps and to deliver a large-scale 3D acquisition of the surveyed area. Synthetic experiments have been performed using a simulation environment in which ray tracing is used to generate synthetic images for the stereo system.

Relevance:

80.00%

Publisher:

Abstract:

This paper presents an enhanced hypothesis verification strategy for 3D object recognition. A new learning methodology is presented which integrates the traditional dichotomic object-centred and appearance-based representations in computer vision giving improved hypothesis verification under iconic matching. The "appearance" of a 3D object is learnt using an eigenspace representation obtained as it is tracked through a scene. The feature representation implicitly models the background and the objects observed enabling the segmentation of the objects from the background. The method is shown to enhance model-based tracking, particularly in the presence of clutter and occlusion, and to provide a basis for identification. The unified approach is discussed in the context of the traffic surveillance domain. The approach is demonstrated on real-world image sequences and compared to previous (edge-based) iconic evaluation techniques.