885 resultados para SIFT,Computer Vision,Python,Object Recognition,Feature Detection,Descriptor Computation


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper provides a comprehensive review of the vision-based See and Avoid problem for unmanned aircraft. The unique problem environment and associated constraints are detailed, followed by an in-depth analysis of visual sensing limitations. In light of such detection and estimation constraints, relevant human, aircraft and robot collision avoidance concepts are then compared from a decision and control perspective. Remarks on system evaluation and certification are also included to provide a holistic review approach. The intention of this work is to clarify common misconceptions, realistically bound feasible design expectations and offer new research directions. It is hoped that this paper will help us to unify design efforts across the aerospace and robotics communities.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The neural basis of visual perception can be understood only when the sequence of cortical activity underlying successful recognition is known. The early steps in this processing chain, from retina to the primary visual cortex, are highly local, and the perception of more complex shapes requires integration of the local information. In Study I of this thesis, the progression from local to global visual analysis was assessed by recording cortical magnetoencephalographic (MEG) responses to arrays of elements that either did or did not form global contours. The results demonstrated two spatially and temporally distinct stages of processing: The first, emerging 70 ms after stimulus onset around the calcarine sulcus, was sensitive to local features only, whereas the second, starting at 130 ms across the occipital and posterior parietal cortices, reflected the global configuration. To explore the links between cortical activity and visual recognition, Studies II III presented subjects with recognition tasks of varying levels of difficulty. The occipito-temporal responses from 150 ms onwards were closely linked to recognition performance, in contrast to the 100-ms mid-occipital responses. The averaged responses increased gradually as a function of recognition performance, and further analysis (Study III) showed the single response strengths to be graded as well. Study IV addressed the attention dependence of the different processing stages: Occipito-temporal responses peaking around 150 ms depended on the content of the visual field (faces vs. houses), whereas the later and more sustained activity was strongly modulated by the observers attention. Hemodynamic responses paralleled the pattern of the more sustained electrophysiological responses. Study V assessed the temporal processing capacity of the human object recognition system. Above sufficient luminance, contrast and size of the object, the processing speed was not limited by such low-level factors. Taken together, these studies demonstrate several distinct stages in the cortical activation sequence underlying the object recognition chain, reflecting the level of feature integration, difficulty of recognition, and direction of attention.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Surveying threatened and invasive species to obtain accurate population estimates is an important but challenging task that requires a considerable investment in time and resources. Estimates using existing ground-based monitoring techniques, such as camera traps and surveys performed on foot, are known to be resource intensive, potentially inaccurate and imprecise, and difficult to validate. Recent developments in unmanned aerial vehicles (UAV), artificial intelligence and miniaturized thermal imaging systems represent a new opportunity for wildlife experts to inexpensively survey relatively large areas. The system presented in this paper includes thermal image acquisition as well as a video processing pipeline to perform object detection, classification and tracking of wildlife in forest or open areas. The system is tested on thermal video data from ground based and test flight footage, and is found to be able to detect all the target wildlife located in the surveyed area. The system is flexible in that the user can readily define the types of objects to classify and the object characteristics that should be considered during classification.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Detect and Avoid (DAA) technology is widely acknowledged as a critical enabler for unsegregated Remote Piloted Aircraft (RPA) operations, particularly Beyond Visual Line of Sight (BVLOS). Image-based DAA, in the visible spectrum, is a promising technological option for addressing the challenges DAA presents. Two impediments to progress for this approach are the scarcity of available video footage to train and test algorithms, in conjunction with testing regimes and specifications which facilitate repeatable, statistically valid, performance assessment. This paper includes three key contributions undertaken to address these impediments. In the first instance, we detail our progress towards the creation of a large hybrid collision and near-collision encounter database. Second, we explore the suitability of techniques employed by the biometric research community (Speaker Verification and Language Identification), for DAA performance optimisation and assessment. These techniques include Detection Error Trade-off (DET) curves, Equal Error Rates (EER), and the Detection Cost Function (DCF). Finally, the hybrid database and the speech-based techniques are combined and employed in the assessment of a contemporary, image based DAA system. This system includes stabilisation, morphological filtering and a Hidden Markov Model (HMM) temporal filter.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Visual tracking has been a challenging problem in computer vision over the decades. The applications of Visual Tracking are far-reaching, ranging from surveillance and monitoring to smart rooms. Mean-shift (MS) tracker, which gained more attention recently, is known for tracking objects in a cluttered environment and its low computational complexity. The major problem encountered in histogram-based MS is its inability to track rapidly moving objects. In order to track fast moving objects, we propose a new robust mean-shift tracker that uses both spatial similarity measure and color histogram-based similarity measure. The inability of MS tracker to handle large displacements is circumvented by the spatial similarity-based tracking module, which lacks robustness to object's appearance change. The performance of the proposed tracker is better than the individual trackers for tracking fast-moving objects with better accuracy.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

3D Face Recognition is an active area of research for past several years. For a 3D face recognition system one would like to have an accurate as well as low cost setup for constructing 3D face model. In this paper, we use Profilometry approach to obtain a 3D face model.This method gives a low cost solution to the problem of acquiring 3D data and the 3D face models generated by this method are sufficiently accurate. We also develop an algorithm that can use the 3D face model generated by the above method for the recognition purpose.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Real-time object tracking is a critical task in many computer vision applications. Achieving rapid and robust tracking while handling changes in object pose and size, varying illumination and partial occlusion, is a challenging task given the limited amount of computational resources. In this paper we propose a real-time object tracker in l(1) framework addressing these issues. In the proposed approach, dictionaries containing templates of overlapping object fragments are created. The candidate fragments are sparsely represented in the dictionary fragment space by solving the l(1) regularized least squares problem. The non zero coefficients indicate the relative motion between the target and candidate fragments along with a fidelity measure. The final object motion is obtained by fusing the reliable motion information. The dictionary is updated based on the object likelihood map. The proposed tracking algorithm is tested on various challenging videos and found to outperform earlier approach.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Approximate Nearest Neighbour Field maps are commonly used by computer vision and graphics community to deal with problems like image completion, retargetting, denoising, etc. In this paper, we extend the scope of usage of ANNF maps to medical image analysis, more specifically to optic disk detection in retinal images. In the analysis of retinal images, optic disk detection plays an important role since it simplifies the segmentation of optic disk and other retinal structures. The proposed approach uses FeatureMatch, an ANNF algorithm, to find the correspondence between a chosen optic disk reference image and any given query image. This correspondence provides a distribution of patches in the query image that are closest to patches in the reference image. The likelihood map obtained from the distribution of patches in query image is used for optic disk detection. The proposed approach is evaluated on five publicly available DIARETDB0, DIARETDB1, DRIVE, STARE and MESSIDOR databases, with total of 1540 images. We show, experimentally, that our proposed approach achieves an average detection accuracy of 99% and an average computation time of 0.2 s per image. (C) 2013 Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Visual tracking is an important task in various computer vision applications including visual surveillance, human computer interaction, event detection, video indexing and retrieval. Recent state of the art sparse representation (SR) based trackers show better robustness than many of the other existing trackers. One of the issues with these SR trackers is low execution speed. The particle filter framework is one of the major aspects responsible for slow execution, and is common to most of the existing SR trackers. In this paper,(1) we propose a robust interest point based tracker in l(1) minimization framework that runs at real-time with performance comparable to the state of the art trackers. In the proposed tracker, the target dictionary is obtained from the patches around target interest points. Next, the interest points from the candidate window of the current frame are obtained. The correspondence between target and candidate points is obtained via solving the proposed l(1) minimization problem. In order to prune the noisy matches, a robust matching criterion is proposed, where only the reliable candidate points that mutually match with target and candidate dictionary elements are considered for tracking. The object is localized by measuring the displacement of these interest points. The reliable candidate patches are used for updating the target dictionary. The performance and accuracy of the proposed tracker is benchmarked with several complex video sequences. The tracker is found to be considerably fast as compared to the reported state of the art trackers. The proposed tracker is further evaluated for various local patch sizes, number of interest points and regularization parameters. The performance of the tracker for various challenges including illumination change, occlusion, and background clutter has been quantified with a benchmark dataset containing 50 videos. (C) 2014 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the increasing availability of wearable cameras, research on first-person view videos (egocentric videos) has received much attention recently. While some effort has been devoted to collecting various egocentric video datasets, there has not been a focused effort in assembling one that could capture the diversity and complexity of activities related to life-logging, which is expected to be an important application for egocentric videos. In this work, we first conduct a comprehensive survey of existing egocentric video datasets. We observe that existing datasets do not emphasize activities relevant to the life-logging scenario. We build an egocentric video dataset dubbed LENA (Life-logging EgoceNtric Activities) (http://people.sutd.edu.sg/similar to 1000892/dataset) which includes egocentric videos of 13 fine-grained activity categories, recorded under diverse situations and environments using the Google Glass. Activities in LENA can also be grouped into 5 top-level categories to meet various needs and multiple demands for activities analysis research. We evaluate state-of-the-art activity recognition using LENA in detail and also analyze the performance of popular descriptors in egocentric activity recognition.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes an efficient vision-based global topological localization approach that uses a coarse-to-fine strategy. Orientation Adjacency Coherence Histogram (OACH), a novel image feature, is proposed to improve the coarse localization. The coarse localization results are taken as inputs for the fine localization which is carried out by matching Harris-Laplace interest points characterized by the SIFT descriptor. Computation of OACHs and interest points is efficient due to the fact that these features are computed in an integrated process. We have implemented and tested the localization system in real environments. The experimental results demonstrate that our approach is efficient and reliable in both indoor and outdoor environments. © 2006 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a novel approach using combined features to retrieve images containing specific objects, scenes or buildings. The content of an image is characterized by two kinds of features: Harris-Laplace interest points described by the SIFT descriptor and edges described by the edge color histogram. Edges and corners contain the maximal amount of information necessary for image retrieval. The feature detection in this work is an integrated process: edges are detected directly based on the Harris function; Harris interest points are detected at several scales and Harris-Laplace interest points are found using the Laplace function. The combination of edges and interest points brings efficient feature detection and high recognition ratio to the image retrieval system. Experimental results show this system has good performance. © 2005 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we introduce a weighted complex networks model to investigate and recognize structures of patterns. The regular treating in pattern recognition models is to describe each pattern as a high-dimensional vector which however is insufficient to express the structural information. Thus, a number of methods are developed to extract the structural information, such as different feature extraction algorithms used in pre-processing steps, or the local receptive fields in convolutional networks. In our model, each pattern is attributed to a weighted complex network, whose topology represents the structure of that pattern. Based upon the training samples, we get several prototypal complex networks which could stand for the general structural characteristics of patterns in different categories. We use these prototypal networks to recognize the unknown patterns. It is an attempt to use complex networks in pattern recognition, and our result shows the potential for real-world pattern recognition. A spatial parameter is introduced to get the optimal recognition accuracy, and it remains constant insensitive to the amount of training samples. We have discussed the interesting properties of the prototypal networks. An approximate linear relation is found between the strength and color of vertexes, in which we could compare the structural difference between each category. We have visualized these prototypal networks to show that their topology indeed represents the common characteristics of patterns. We have also shown that the asymmetric strength distribution in these prototypal networks brings high robustness for recognition. Our study may cast a light on understanding the mechanism of the biologic neuronal systems in object recognition as well.

Relevância:

100.00% 100.00%

Publicador: