920 results for Depth camera


Relevance: 100.00%

Abstract:

In this paper we present a depth-guided photometric 3D reconstruction method that works solely with a depth camera like the Kinect. Existing methods that fuse depth with normal estimates use an external RGB camera to obtain photometric information and treat the depth camera as a black box that provides a low-quality depth estimate. Our contribution to such methods is twofold. Firstly, instead of using an extra RGB camera, we use the infrared (IR) camera of the depth camera system itself to directly obtain high-resolution photometric information. We believe that ours is the first method to use the IR camera of a depth camera system in this manner. Secondly, photometric methods applied to complex objects result in numerous holes in the reconstructed surface due to shadows and self-occlusions. To mitigate this problem, we develop a simple and effective multiview reconstruction approach that fuses depth and normal information from multiple viewpoints to build a complete, consistent and accurate 3D surface representation. We demonstrate the efficacy of our method by generating high-quality 3D surface reconstructions of several complex 3D figurines.
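Estimating normals from the depth map itself is a typical first step before fusing it with photometric normal estimates. Below is a minimal, illustrative sketch (not the paper's method) that computes per-pixel normals from a depth map under an approximate pinhole model; the focal lengths `fx` and `fy` are assumed values.

```python
import numpy as np

def normals_from_depth(depth, fx=525.0, fy=525.0):
    """Per-pixel surface normals from a depth map via central
    differences (approximation: one pixel spans ~z/fx metres)."""
    dzdx = np.gradient(depth, axis=1) * fx / depth
    dzdy = np.gradient(depth, axis=0) * fy / depth
    # Normal is perpendicular to both tangent directions.
    n = np.dstack([-dzdx, -dzdy, np.ones_like(depth)])
    return n / np.linalg.norm(n, axis=2, keepdims=True)
```

For a fronto-parallel plane this yields normals pointing straight at the camera, as expected.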

Relevance: 100.00%

Abstract:

We present a system for augmenting depth camera output using multispectral photometric stereo. The technique is demonstrated using a Kinect sensor and is able to produce geometry independently for each frame. Improved reconstruction is demonstrated using the Kinect's inbuilt RGB camera, and further improvements are achieved by introducing an additional high-resolution camera. In addition to qualitative improvements in reconstruction, a quantitative reduction in temporal noise is shown. As part of the system, an approach is presented for relaxing the assumption of multispectral photometric stereo that scenes are of constant chromaticity to the assumption that scenes contain multiple piecewise-constant chromaticities.
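The per-pixel core of multispectral photometric stereo can be sketched as follows: with one colored light per channel and (piecewise) constant chromaticity, each pixel's RGB vector is linear in the surface normal, so the normal is recovered by inverting a 3x3 light matrix. This is an illustrative sketch, not the authors' implementation; `light_dirs` is an assumed calibrated light matrix.

```python
import numpy as np

def multispectral_normals(rgb, light_dirs):
    """Recover per-pixel normals: each pixel satisfies I = L @ n
    (L rows = per-channel light directions scaled by intensity),
    so n is proportional to L^{-1} @ I.
    rgb: (H, W, 3) image; light_dirs: (3, 3) light matrix."""
    L_inv = np.linalg.inv(light_dirs)
    n = rgb @ L_inv.T            # applies L^{-1} to every pixel vector
    norm = np.linalg.norm(n, axis=2, keepdims=True)
    return n / np.clip(norm, 1e-9, None)
```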

Relevance: 100.00%

Abstract:

In recent years, depth cameras have been widely utilized in camera tracking for augmented and mixed reality. Many studies focus on methods that generate the reference model simultaneously with the tracking and allow operation in unprepared environments. However, methods that rely on predefined CAD models have their advantages: the measurement errors are not accumulated into the model, they are tolerant to inaccurate initialization, and the tracking is always performed directly in the reference model's coordinate system. In this paper, we present a method for tracking a depth camera with existing CAD models and the Iterative Closest Point (ICP) algorithm. In our approach, we render the CAD model using the latest pose estimate and construct a point cloud from the corresponding depth map. We construct another point cloud from the currently captured depth frame, and find the incremental change in the camera pose by aligning the point clouds. We utilize a GPGPU-based implementation of the ICP which efficiently uses all the depth data in the process. The method runs in real time, is robust to outliers, and does not require any preprocessing of the CAD models. We evaluated the approach using the Kinect depth sensor, and compared the results to a 2D edge-based method, to a depth-based SLAM method, and to the ground truth. The results show that the approach is more stable than the edge-based method and suffers less from drift than the depth-based SLAM.
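The pose update at the heart of each ICP iteration, the least-squares rigid transform between matched point sets, can be sketched with the standard SVD (Kabsch) solution. This is a generic illustration, not the paper's GPGPU implementation.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) with R @ src[i] + t ~= dst[i]
    for matched (N, 3) point sets -- the alignment step solved in each
    ICP iteration once correspondences are fixed."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)            # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t
```

In full ICP this solve alternates with re-finding closest-point correspondences until the pose converges.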

Relevance: 80.00%

Abstract:

In this paper we present an adaptive spatio-temporal filter that aims to improve low-cost depth camera accuracy and stability over time. The proposed system is composed of three blocks that are used to build a reliable depth map of static scenes. An adaptive joint-bilateral filter is used to obtain consistent depth maps by jointly considering depth and video information and by adapting its parameters to different levels of estimated noise. Kalman filters are used to reduce the temporal random fluctuations of the measurements. Finally, an interpolation algorithm is used to obtain consistent depth maps in the regions where depth information is not available. Results show that this approach considerably improves depth map quality by considering spatio-temporal information and by adapting its parameters to different levels of noise.
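A joint (cross) bilateral filter of the kind described, where depth is smoothed with weights taken from both spatial distance and guide-image similarity so that depth edges aligned with image edges are preserved, can be sketched as below. This is a plain illustrative version without the paper's noise-adaptive parameter selection.

```python
import numpy as np

def joint_bilateral(depth, guide, radius=2, sigma_s=2.0, sigma_r=10.0):
    """Joint bilateral filter: each output depth is a weighted average
    of nearby depths, weighted by spatial distance AND by similarity
    of the *guide* (intensity) image, preserving guided edges."""
    H, W = depth.shape
    out = np.zeros((H, W))
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    w_s = np.exp(-(xs**2 + ys**2) / (2 * sigma_s**2))  # spatial kernel
    dp = np.pad(depth.astype(float), radius, mode='edge')
    gp = np.pad(guide.astype(float), radius, mode='edge')
    for i in range(H):
        for j in range(W):
            dwin = dp[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            gwin = gp[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # Range kernel from guide-image similarity.
            w = w_s * np.exp(-(gwin - guide[i, j])**2 / (2 * sigma_r**2))
            out[i, j] = (w * dwin).sum() / w.sum()
    return out
```

A constant depth map passes through unchanged, and a depth step co-located with a strong guide edge survives filtering.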

Relevance: 70.00%

Abstract:

The building sector is the dominant consumer of energy and therefore a major contributor to anthropogenic climate change. The rapid generation of photorealistic 3D environment models with incorporated surface temperature data has the potential to improve thermographic monitoring of building energy efficiency. In pursuit of this goal, we propose a system which combines a range sensor with a thermal-infrared camera. Our proposed system can generate dense 3D models of environments with both appearance and temperature information, and is the first such system to be developed using a low-cost RGB-D camera. The proposed pipeline processes depth maps successively, forming an ongoing pose estimate of the depth camera and optimizing a voxel occupancy map. Voxels are assigned four channels representing estimates of their true RGB and thermal-infrared intensity values. Poses corresponding to each RGB and thermal-infrared image are estimated through a combination of timestamp-based interpolation and predetermined knowledge of the extrinsic calibration of the system. Raycasting is then used to color the voxels to represent both visual appearance using RGB, and an estimate of the surface temperature. The output of the system is a dense 3D model which can simultaneously represent both RGB and thermal-infrared data using one of two alternative representation schemes. Experimental results demonstrate that the system is capable of accurately mapping difficult environments, even in complete darkness.
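The timestamp-based pose interpolation mentioned above can be sketched as linear interpolation of the translation plus spherical linear interpolation (slerp) of the rotation quaternion. This is a generic illustration, not the authors' code; quaternions are in w-x-y-z order.

```python
import numpy as np

def interp_pose(t, t0, t1, p0, q0, p1, q1):
    """Assign a pose to timestamp t between keyframes (t0, p0, q0)
    and (t1, p1, q1): lerp translation, slerp unit quaternion."""
    a = (t - t0) / (t1 - t0)
    p = (1 - a) * np.asarray(p0, float) + a * np.asarray(p1, float)
    q0, q1 = np.asarray(q0, float), np.asarray(q1, float)
    d = np.dot(q0, q1)
    if d < 0:                       # take the shorter arc
        q1, d = -q1, -d
    if d > 0.9995:                  # nearly parallel: lerp, renormalize
        q = (1 - a) * q0 + a * q1
    else:
        th = np.arccos(d)
        q = (np.sin((1 - a) * th) * q0 + np.sin(a * th) * q1) / np.sin(th)
    return p, q / np.linalg.norm(q)
```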

Relevance: 70.00%

Abstract:

In this paper, we present a machine learning approach for subject-independent human action recognition using a depth camera, emphasizing the importance of depth in recognition of actions. The proposed approach uses the flow information of all 3 dimensions to classify an action. In our approach, we have obtained the 2-D optical flow and used it along with the depth image to obtain the depth flow (Z motion vectors). The obtained flow captures the dynamics of the actions in space-time. Feature vectors are obtained by averaging the 3-D motion over a grid laid over the silhouette in a hierarchical fashion. These hierarchical fine-to-coarse windows capture the motion dynamics of the object at various scales. The extracted features are used to train a Meta-cognitive Radial Basis Function Network (McRBFN) that uses a Projection Based Learning (PBL) algorithm, referred to as PBL-McRBFN henceforth. PBL-McRBFN begins with zero hidden neurons and builds the network based on the best human learning strategy, namely, self-regulated learning in a meta-cognitive environment. When a sample is used for learning, PBL-McRBFN uses the sample overlapping conditions and a projection based learning algorithm to estimate the parameters of the network. The performance of PBL-McRBFN is compared to that of Support Vector Machine (SVM) and Extreme Learning Machine (ELM) classifiers with representation of every person and action in the training and testing datasets. The performance study shows that PBL-McRBFN outperforms these classifiers in recognizing actions in 3-D. Further, a subject-independent study is conducted using a leave-one-subject-out strategy and its generalization performance is tested. It is observed from the subject-independent study that McRBFN is capable of generalizing actions accurately. The performance of the proposed approach is benchmarked with the Video Analytics Lab (VAL) dataset and the Berkeley Multimodal Human Action Database (MHAD).
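The depth-flow (Z motion) step, following each pixel's 2-D optical flow into the next depth frame and differencing the depths, can be sketched as below; nearest-neighbour sampling and border clipping are simplifications.

```python
import numpy as np

def depth_flow(depth_t, depth_t1, flow):
    """Z motion vectors: for each pixel, follow the 2-D optical flow
    (u, v) into the next frame and take the depth difference,
    yielding the third component of the 3-D motion field.
    depth_t, depth_t1: (H, W); flow: (H, W, 2) with (u, v) per pixel."""
    H, W = depth_t.shape
    ys, xs = np.mgrid[0:H, 0:W]
    x1 = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    y1 = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
    return depth_t1[y1, x1] - depth_t
```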

Relevance: 70.00%

Abstract:

In this paper, we present a depth-color scene modeling strategy for indoor 3D content generation. It combines depth and visual information provided by a low-cost active depth camera to improve the accuracy of the acquired depth maps, considering the different dynamic nature of the scene elements. Accurate depth and color models of the scene background are iteratively built and used to detect moving elements in the scene. The acquired depth data is continuously processed with an innovative joint-bilateral filter that efficiently combines depth and visual information thanks to the analysis of an edge-uncertainty map and the detected foreground regions. The main advantages of the proposed approach are: removing spatial noise and temporal random fluctuations from the depth maps; refining depth data at object boundaries; and iteratively generating a robust depth and color background model together with an accurate moving-object silhouette.
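An iteratively built per-pixel depth background model of the general kind described can be sketched with an exponential running mean and variance; the update rule and thresholds here are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

class DepthBackground:
    """Per-pixel depth background model: exponential running mean and
    variance, flagging as foreground any pixel deviating by more than
    k standard deviations; background statistics are updated only from
    pixels classified as background."""
    def __init__(self, first_depth, alpha=0.05, k=3.0, min_std=0.01):
        self.mean = first_depth.astype(float).copy()
        self.var = np.full_like(self.mean, min_std**2)
        self.alpha, self.k, self.min_std = alpha, k, min_std

    def update(self, depth):
        d = depth.astype(float)
        dev = np.abs(d - self.mean)
        std = np.maximum(np.sqrt(self.var), self.min_std)
        fg = dev > self.k * std          # foreground mask for this frame
        bg = ~fg
        self.mean[bg] += self.alpha * (d[bg] - self.mean[bg])
        self.var[bg] += self.alpha * ((d[bg] - self.mean[bg])**2 - self.var[bg])
        return fg
```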

Relevance: 60.00%

Abstract:

In this paper, we propose a novel direction for gait recognition research by proposing a new capture-modality-independent, appearance-based feature which we call the Back-filled Gait Energy Image (BGEI). It can be constructed from both frontal depth images and the more commonly used side-view silhouettes, allowing the feature to be applied across these two differing capturing systems using the same enrolled database. To evaluate this new feature, a frontally captured depth-based gait dataset was created containing 37 unique subjects, a subset of which also contained sequences captured from the side. The results demonstrate that the BGEI can effectively be used to identify subjects through their gait across these two differing input devices, achieving a rank-1 match rate of 100% in our experiments. We also compare the BGEI against the GEI and GEV in their respective domains, using the CASIA dataset and our depth dataset, showing that it compares favourably against them. The experiments were performed using a sparse-representation-based classifier with a locally discriminating input feature space, which shows significant improvement in performance over other classifiers used in the gait recognition literature, achieving state-of-the-art results with the GEI on the CASIA dataset.
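For reference, the underlying Gait Energy Image computation, averaging aligned binary silhouettes over a gait cycle, can be sketched as below; centroid-based horizontal alignment is an assumed simplification, and the BGEI's back-filling step for frontal depth silhouettes is not shown.

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Gait Energy Image: average of binary silhouette frames after
    horizontally centering each frame on its silhouette centroid."""
    s = np.asarray(silhouettes, dtype=float)   # (T, H, W), values 0/1
    T, H, W = s.shape
    out = np.zeros((H, W))
    for f in s:
        # Shift each frame so the silhouette centroid sits mid-image.
        cx = int(round(np.argwhere(f)[:, 1].mean())) if f.any() else W // 2
        out += np.roll(f, W // 2 - cx, axis=1)
    return out / T
```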

Relevance: 60.00%

Abstract:

Timely and comprehensive scene segmentation is often a critical step for many high-level mobile robotic tasks. This paper examines a projected-area-based neighbourhood lookup approach with the motivation of faster unsupervised segmentation of dense 3D point clouds. The proposed algorithm exploits the projection geometry of a depth camera to find nearest neighbours in time independent of the input data size. Points near depth discontinuities are also detected to reinforce object boundaries in the clustering process. The search method presented is evaluated using both indoor and outdoor dense depth images and demonstrates significant improvements in speed and precision compared to the commonly used Fast Library for Approximate Nearest Neighbors (FLANN) [Muja and Lowe, 2009].
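The key idea, that an organized depth image makes neighbour lookup an O(1) window read rather than a tree search over the whole cloud, can be sketched as follows; the window radius and depth-gap threshold are assumed values, and the gap test stands in for the paper's discontinuity handling.

```python
import numpy as np

def grid_neighbours(cloud, v, u, r=1, max_gap=0.1):
    """Projection-based neighbour lookup on an organized point cloud
    (H, W, 3): candidates are the pixels in an r-window around (v, u),
    a constant-time read independent of cloud size; a depth-gap test
    rejects points lying across a depth discontinuity."""
    H, W, _ = cloud.shape
    win = cloud[max(v - r, 0):v + r + 1,
                max(u - r, 0):u + r + 1].reshape(-1, 3)
    z0 = cloud[v, u, 2]
    return win[np.abs(win[:, 2] - z0) < max_gap]
```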

Relevance: 60.00%

Abstract:

Capturing high-quality 3D models of insects is challenging: they are usually too small for laser- or depth-camera-based systems, and techniques such as CT scanning do not record color. We have developed a prototype system that generates unprecedentedly high-quality natural-color 3D models of various insects from 3 mm to 30 mm in length. Through the use of 3D web standards we are able to use these models to develop novel applications for entomologists and ensure wide accessibility.

Relevance: 60.00%

Abstract:

We propose to develop a 3-D optical flow feature based human action recognition system. Optical flow based features are employed here since they capture the apparent motion of objects by design. Moreover, they can represent information hierarchically from the local pixel level to the global object level. In this work, 3-D optical flow based features are extracted by combining the 2-D optical flow based features with the depth flow features obtained from a depth camera. In order to develop an action recognition system, we employ a Meta-Cognitive Neuro-Fuzzy Inference System (McFIS). The aim of McFIS is to find the decision boundary separating different classes based on their respective optical flow based features. McFIS consists of a neuro-fuzzy inference system (cognitive component) and a self-regulatory learning mechanism (meta-cognitive component). During supervised learning, the self-regulatory learning mechanism monitors the knowledge of the current sample with respect to the existing knowledge in the network and controls the learning by deciding on sample deletion, sample learning or sample reserve strategies. The performance of the proposed action recognition system was evaluated on a proprietary data set consisting of eight subjects. The performance evaluation against standard support vector machine and extreme learning machine classifiers indicates improved performance of McFIS in recognizing actions based on 3-D optical flow based features.

Relevance: 60.00%

Abstract:

Completed under joint supervision (cotutelle) with the M2S laboratory of Rennes 2.

Relevance: 60.00%

Abstract:

Falls among the elderly are a major public health problem. Studies show that roughly 30% of people aged 65 and over fall each year in Canada, with harmful consequences at the individual, family, and social levels. Faced with this situation, video monitoring is an effective solution for ensuring the safety of these people. Many personal assistance systems exist today; these devices allow an elderly person to live at home while ensuring their safety through a worn sensor. However, wearing a sensor at all times is uncomfortable and restrictive, which is why research has recently turned to cameras instead of wearable sensors. The goal of this project is to demonstrate that a video monitoring device can help reduce this problem. In this document we present an automatic fall detection approach based on 3D tracking of the subject using a depth camera (Microsoft Kinect) positioned vertically above the floor. Tracking is performed on the silhouette extracted in real time with a robust 3D background subtraction approach based on the depth variation of pixels in the scene. This method is initialized by capturing the scene without any subject present. Once the silhouette is extracted, the 10% of the silhouette corresponding to its highest region (the one closest to the Kinect lens) is analyzed in real time according to the speed and position of its center of gravity. After analysis, these criteria make it possible to detect a fall and then send a signal (email or text message) to the individual or to the authority in charge of the elderly person. The method was validated using several videos of falls simulated by a stuntman.
The camera's position and its depth information considerably reduce the risk of false fall alarms. Positioned vertically above the floor, the camera makes it possible to analyze the scene and, above all, to track the silhouette without the major occlusions that in some cases lead to false alerts. Moreover, the various fall detection criteria are reliable features for distinguishing a fall from crouching or sitting down. Nevertheless, the camera's field of view remains a problem, as it is not wide enough to cover a substantial area. One solution to this dilemma would be to mount a lens on the Kinect's objective to widen the monitored zone.
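The decision rule described, monitoring the height and vertical speed of the silhouette's highest region over time, can be sketched as below. The thresholds and the exact criterion are hypothetical, not those of the thesis.

```python
import numpy as np

def detect_fall(head_heights, timestamps, h_floor=0.4, v_fall=1.0):
    """Flag a fall when the tracked height of the silhouette's highest
    region (metres above the floor) drops below h_floor while moving
    downward faster than v_fall m/s -- distinguishing a fall from the
    slower descent of sitting or crouching."""
    h = np.asarray(head_heights, float)
    t = np.asarray(timestamps, float)
    v = np.diff(h) / np.diff(t)           # signed vertical velocity
    return bool(np.any((h[1:] < h_floor) & (v < -v_fall)))
```

A rapid drop to near floor level triggers the flag; a slow, partial descent (sitting down) does not.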

Relevance: 60.00%

Abstract:

Graduate program in Digital Television: Information and Knowledge - FAAC.

Relevance: 60.00%

Abstract:

Keyboards, mice, and touch screens are a potential source of infection or contamination in operating rooms, intensive care units, and autopsy suites. The authors present a low-cost prototype of a system which allows touch-free control of a medical image viewer. This touch-free navigation system consists of a computer system (iMac, OS X 10.6; Apple, USA) with a medical image viewer (OsiriX; OsiriX Foundation, Switzerland) and a depth camera (Kinect; Microsoft, USA). They implemented software that translates the data delivered by the camera, together with voice recognition software, into keyboard and mouse commands, which are then passed to OsiriX. In this feasibility study, the authors introduced 10 medical professionals to the system and asked them to re-create 12 images from a CT data set. They evaluated response times and usability of the system compared with standard mouse/keyboard control. Users felt comfortable with the system after approximately 10 minutes. Response time was 120 ms. Users required 1.4 times more time to re-create an image with gesture control. Users with OsiriX experience were significantly faster using the mouse/keyboard and faster than users without prior experience. They rated the system 3.4 out of 5 for ease of use in comparison to the mouse/keyboard. The touch-free, gesture-controlled system performs favorably and removes a potential vector for infection, protecting both patients and staff. Because the camera can be quickly and easily integrated into existing systems, requires no calibration, and is low cost, the barriers to using this technology are low.
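The translation layer described, mapping tracked gestures into viewer commands that are then synthesized as keyboard/mouse events, might look like the following sketch; the gesture vocabulary, axis conventions, and thresholds are entirely hypothetical and not taken from the study.

```python
def gesture_to_command(dx, dy, dz, thresh=0.15):
    """Map a tracked hand displacement (metres; x right, y down,
    z toward the screen) to a symbolic viewer command that a
    separate layer would emit as a keyboard/mouse event."""
    if dz < -thresh:
        return "zoom_in"          # hand pushed toward the screen
    if dz > thresh:
        return "zoom_out"
    if abs(dx) >= abs(dy):
        if abs(dx) > thresh:
            return "next_image" if dx > 0 else "previous_image"
    elif abs(dy) > thresh:
        return "scroll_up" if dy < 0 else "scroll_down"
    return "none"                 # below threshold: ignore jitter
```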