982 resultados para depth image
Resumo:
In this paper, we present a machine learning approach for subject-independent human action recognition using a depth camera, emphasizing the importance of depth in the recognition of actions. The proposed approach uses the flow information of all three dimensions to classify an action. We obtain the 2-D optical flow and use it along with the depth image to obtain the depth flow (Z motion vectors). The resulting flow captures the dynamics of the actions in space-time. Feature vectors are obtained by averaging the 3-D motion over a grid laid over the silhouette in a hierarchical fashion. These hierarchical fine-to-coarse windows capture the motion dynamics of the object at various scales. The extracted features are used to train a Meta-cognitive Radial Basis Function Network (McRBFN) that uses a Projection Based Learning (PBL) algorithm, henceforth referred to as PBL-McRBFN. PBL-McRBFN begins with zero hidden neurons and builds the network based on the best human learning strategy, namely, self-regulated learning in a meta-cognitive environment. When a sample is used for learning, PBL-McRBFN uses the sample overlapping conditions and a projection based learning algorithm to estimate the parameters of the network. The performance of PBL-McRBFN is compared to that of Support Vector Machine (SVM) and Extreme Learning Machine (ELM) classifiers, with every person and action represented in the training and testing datasets. The performance study shows that PBL-McRBFN outperforms these classifiers in recognizing actions in 3-D. Further, a subject-independent study is conducted using a leave-one-subject-out strategy to test generalization performance. The subject-independent study shows that McRBFN is capable of generalizing actions accurately. The proposed approach is benchmarked on the Video Analytics Lab (VAL) dataset and the Berkeley Multimodal Human Action Database (MHAD). (C) 2013 Elsevier Ltd. All rights reserved.
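The depth-flow and grid-averaging steps described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions below are assumptions: the Z motion at a pixel is taken as the depth at the flow-displaced position in the next frame minus the depth at the current position, and the grid is laid over the full input array (assumed to be already cropped to the silhouette's bounding box). The function names `depth_flow` and `grid_features` and the `levels` parameter are hypothetical.

```python
def depth_flow(depth_t, depth_t1, flow_u, flow_v):
    """Z motion per pixel (assumed definition): depth at the flow-displaced
    position in the next frame minus depth at the current position."""
    rows, cols = len(depth_t), len(depth_t[0])
    flow_z = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Follow the 2-D flow to the nearest pixel, clamped to the image.
            x1 = min(max(int(round(x + flow_u[y][x])), 0), cols - 1)
            y1 = min(max(int(round(y + flow_v[y][x])), 0), rows - 1)
            flow_z[y][x] = depth_t1[y1][x1] - depth_t[y][x]
    return flow_z


def grid_features(flow_u, flow_v, flow_z, levels=(1, 2)):
    """Average (u, v, z) motion over an n x n grid for each n in levels.

    Small n gives coarse windows, large n gives fine ones. The image must
    have at least n rows and n columns for every n in levels.
    """
    rows, cols = len(flow_u), len(flow_u[0])
    features = []
    for n in levels:
        for gy in range(n):
            for gx in range(n):
                y0, y1 = gy * rows // n, (gy + 1) * rows // n
                x0, x1 = gx * cols // n, (gx + 1) * cols // n
                count = (y1 - y0) * (x1 - x0)
                for channel in (flow_u, flow_v, flow_z):
                    total = sum(channel[y][x]
                                for y in range(y0, y1)
                                for x in range(x0, x1))
                    features.append(total / count)
    return features
```

With `levels=(1, 2)` the feature vector has 3 values for the whole window plus 3 values for each of the 4 quadrants, 15 in total. The classifier (PBL-McRBFN in the abstract) is not sketched here.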
Resumo:
We describe a user-assisted technique for 3D stereo conversion from 2D images. Our approach exploits the geometric structure of perspective images, including vanishing points. We allow a user to indicate lines, planes, and vanishing points in the input image, and directly employ these as constraints in an image warping framework to produce a stereo pair. By sidestepping explicit construction of a depth map, our approach is applicable to more general scenes and avoids potential artifacts of depth-image-based rendering. Our method is most suitable for scenes with large-scale structures such as buildings.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-06
Resumo:
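We propose a GPU-accelerated, real-time image-based rendering algorithm. The algorithm uses a polar coordinate system to generate spherical depth images that sample an object uniformly from all directions. Using two derived pre-warping equations, it then pre-warps a single spherical depth image onto a view-dependent tangent plane of the object's bounding sphere to produce an intermediate image, and finally applies texture mapping to generate the target image. Exploiting the programmability and parallelism of modern graphics hardware, the pre-warp is moved to the Vertex Shader to speed up rendering, and the hardware rasterizer performs the image interpolation, yielding a continuous, hole-free result image. In addition, per-pixel lighting and environment mapping are computed in the Pixel Shader to produce high-quality lighting effects. Finally, the paper resolves the algorithm's restricted-viewpoint problem and designs a dynamic LOD (Level of Details) algorithm, implementing a real-time walkthrough system that preserves the correct occlusion relationships between objects.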
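As a rough illustration of what a spherical depth image encodes, the sketch below maps one pixel back to a 3-D point. The abstract does not specify the parameterization or the two pre-warping equations, so everything here is an assumption: rows are taken to span the polar angle, columns the azimuth, and the stored value is taken to be the distance from the centre of the bounding sphere. The function name `spherical_pixel_to_point` is hypothetical, and the pre-warp itself is not reproduced.

```python
import math


def spherical_pixel_to_point(row, col, depth, rows, cols):
    """Map a pixel of a rows x cols spherical depth image to a 3-D point.

    Assumed layout (not taken from the paper): rows span the polar angle
    theta in [0, pi], columns span the azimuth phi in [0, 2*pi), and
    `depth` is the distance from the sphere centre along that direction.
    """
    theta = math.pi * (row + 0.5) / rows        # pixel-centre polar angle
    phi = 2.0 * math.pi * (col + 0.5) / cols    # pixel-centre azimuth
    x = depth * math.sin(theta) * math.cos(phi)
    y = depth * math.sin(theta) * math.sin(phi)
    z = depth * math.cos(theta)
    return (x, y, z)
```

By construction the returned point lies at distance `depth` from the origin. In a GPU implementation a mapping of this kind would typically be evaluated per vertex.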
Resumo:
Summer Saloon was presented at Lion and Lamb, an exhibition venue dedicated to showing current painting. My work "Screens" constructs, rather than represents, an existing architectural space. The painting investigates an experience of phenomenological depth distinct from the model of perspective depth.
Resumo:
This work addresses the problem of characterizing the canopy of fruit trees for the site-specific application of plant protection products. The proposal uses a depth map (depth image) combined with an RGB image (RGB-D), both provided by the Microsoft Kinect sensor, to apply pesticides locally. The depth map makes it possible to estimate canopy density and, from that information, to determine which nozzles should be opened at each moment. Algorithms were developed and implemented in Matlab that, in addition to acquiring the RGB-D images, make it possible to apply pesticides only to leaves and/or fruit, as desired. These algorithms were implemented in software that communicates with the "Kinect Windows SDK" development environment, which is responsible for extracting the images from the Kinect sensor. In addition, classification and identification algorithms were implemented to identify leaves. The classification algorithms used were "Fuzzy C-Means with Gustafson Kessel" (FCM-GK) and "K-Means". The centroids, or prototypes, of each class generated by FCM-GK were used as seeds for K-Means, to speed up convergence of the algorithm and to maintain temporal coherence in the groups generated by K-Means. The classification algorithms were applied to the images after transformation to the L*a*b* color space; specifically, the a* and b* channels (the chromatic channels) were used in order to reduce the effect of light on the colors. The classification algorithms were configured to look for four groups: leaves, porosity, fruit and trunk. Once the classifier has generated the group prototypes, a Support Vector Machine classifier with a Gaussian radial basis function kernel identifies the class of interest (leaves).
The combination of these algorithms has shown low classification errors, with a 4% error in leaf identification. In addition, these algorithms process up to 8.4 images per second, which allows their application in real time. The results demonstrate the feasibility of using the Kinect sensor to determine where and when to apply pesticides. They also show that there are limitations on its use, imposed by lighting conditions. In other words, Kinect can be used outdoors, but only on cloudy days, early in the morning, at night with artificial lighting, or with a sunshade added in intense light.
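The seeding idea in this abstract (centroids from one clustering pass used to initialize K-Means) can be sketched as follows. This is a plain Lloyd-style K-Means on 2-D (a*, b*) pairs written for illustration, not the authors' Matlab code; the FCM-GK stage that would produce the seeds is not implemented, and the function name `seeded_kmeans` and its parameters are hypothetical.

```python
def seeded_kmeans(points, seeds, iterations=10):
    """Lloyd's K-Means on 2-D points such as (a*, b*) pairs, started from
    caller-supplied centroids instead of a random initialization."""
    centroids = [tuple(s) for s in seeds]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                              + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    (sum(m[0] for m in members) / len(members),
                     sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(old)  # keep the seed of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels
```

Because cluster k always starts from seed k, cluster indices keep the same meaning from frame to frame when each frame's result seeds the next, which is one way to read the temporal-coherence argument in the abstract. Good seeds also tend to reduce the number of iterations needed.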
Resumo:
This paper is concerned with choosing image features for image-based visual servo control and how this choice influences the closed-loop dynamics of the system. In prior work, image features tend to be chosen on the basis of image processing simplicity and noise sensitivity. In this paper we show that the choice of feature directly influences the closed-loop dynamics in task-space. We focus on the depth axis control of a visual servo system and analytically compare various approaches that have been reported recently in the literature. The theoretical predictions are verified by experiment.
Resumo:
This paper considers the question of designing a fully image-based visual servo control for a dynamic system. The work is motivated by the ongoing development of image-based visual servo control of small aerial robotic vehicles. The observed targets considered are coloured blobs on a flat surface whose normal direction is known. The theoretical framework is directly applicable to the case of markings on a horizontal floor or landing field. The image features used are a first-order spherical moment for position and an image flow measurement for velocity. A fully non-linear adaptive control design is provided that ensures global stability of the closed-loop system. © 2005 IEEE.
Resumo:
A wave-number spectrum technique is proposed to retrieve coastal water depths from Synthetic Aperture Radar (SAR) images of waves. Based on the general dispersion relation of ocean waves, the change in wavelength of a surface wave over varying water depths can be derived from SAR. By analyzing SAR images of waves and using the general dispersion relation, this indirect remote-sensing bathymetry technique has been applied to a coastal region of Xiapu in Fujian Province, China. Results show that the technique is suitable for coastal waters, especially near-shore regions with variable water depths.
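To make the depth-retrieval step concrete, here is a minimal sketch that inverts the linear dispersion relation for surface gravity waves without currents, omega^2 = g k tanh(k h), for the depth h. The abstract says only "the general dispersion relation", so this specific form is an assumption, as are the function name `depth_from_wavelength` and the premise that the wave period is known from another source while the local wavelength is measured from the SAR image spectrum.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def depth_from_wavelength(wavelength_m, period_s):
    """Invert omega^2 = g * k * tanh(k * h) for the water depth h.

    Returns None when omega^2 / (g * k) >= 1, i.e. the wave is at or
    beyond the deep-water limit and carries no depth information.
    """
    k = 2.0 * math.pi / wavelength_m   # wave number in rad/m
    omega = 2.0 * math.pi / period_s   # angular frequency in rad/s
    ratio = omega ** 2 / (G * k)       # equals tanh(k * h)
    if ratio >= 1.0:
        return None
    return math.atanh(ratio) / k
```

The inversion is sensitive only where waves "feel" the bottom: as `ratio` approaches 1 the recovered depth grows without bound, which is consistent with the abstract's remark that the technique suits near-shore regions.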
Resumo:
Omnidirectional cameras offer a much wider field of view than perspective ones and alleviate the problems due to occlusions. However, both types of cameras suffer from a lack of depth perception. A practical method for obtaining depth in computer vision is to project a known structured light pattern onto the scene, avoiding the problems and costs involved in stereo vision. This paper focuses on the idea of combining omnidirectional vision and structured light with the aim of providing 3D information about the scene. The resulting sensor is formed by a single catadioptric camera and an omnidirectional light projector. The paper also discusses how this sensor can be used in robot navigation applications.