874 results for Abandoned and removed object detection
Abstract:
We consider the problem of detecting a large number of different classes of objects in cluttered scenes. Traditional approaches require applying a battery of different classifiers to the image, at multiple locations and scales. This can be slow and can require a lot of training data, since each classifier requires the computation of many different image features. In particular, for independently trained detectors, the (run-time) computational complexity and the (training-time) sample complexity scale linearly with the number of classes to be detected. It seems unlikely that such an approach will scale up to allow recognition of hundreds or thousands of objects. We present a multi-class boosting procedure (joint boosting) that reduces the computational and sample complexity by finding common features that can be shared across the classes (and/or views). The detectors for each class are trained jointly, rather than independently. For a given performance level, the total number of features required, and therefore the computational cost, is observed to scale approximately logarithmically with the number of classes. The jointly selected features tend to be edges and other generic features typical of many natural structures, rather than specific object parts. Those generic features generalize better and considerably reduce the computational cost of an algorithm for multi-class object detection.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
This paper presents a video surveillance framework that robustly and efficiently detects abandoned objects in surveillance scenes. The framework is based on a novel threat assessment algorithm which combines the concept of ownership with automatic understanding of social relations in order to infer abandonment of objects. Implementation is achieved through development of a logic-based inference engine built on Prolog. Threat detection performance is evaluated on a range of datasets describing realistic situations, and the results demonstrate a reduction in the number of false alarms generated. The proposed system represents the approach employed in the EU SUBITO project (Surveillance of Unattended Baggage and the Identification and Tracking of the Owner).
Abstract:
Abandoned object detection (AOD) systems are required to run in high traffic situations, with high levels of occlusion. Systems rely on background segmentation techniques to locate abandoned objects, by detecting areas of motion that have stopped. This is often achieved by using a medium term motion detection routine to detect long term changes in the background. When AOD systems are integrated into a person tracking system, this often results in two separate motion detectors being used to handle the different requirements. We propose a motion detection system that is capable of detecting medium term motion as well as regular motion. Multiple layers of medium term (static) motion can be detected and segmented. We demonstrate the performance of this motion detection system both on its own and as part of an abandoned object detection system.
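The idea of locating "motion that has stopped" can be illustrated with a dual-rate background model, a common stand-in technique and not necessarily the routine proposed in the abstract above. The sketch below assumes a hypothetical 8-pixel scanline, two exponential running averages with assumed learning rates (0.5 and 0.005), and an assumed threshold of 30 intensity levels. A pixel is flagged as static when the fast model has already absorbed a change that the slow model has not.

```python
def update_background(background, frame, rate):
    """Exponential running average: blend the new frame into the background."""
    return [(1.0 - rate) * b + rate * f for b, f in zip(background, frame)]


def static_mask(long_bg, short_bg, threshold=30.0):
    """Flag pixels where the fast model absorbed a change the slow model has not."""
    return [abs(s - l) > threshold for s, l in zip(short_bg, long_bg)]


def run_demo():
    # Hypothetical 8-pixel scanline of an empty scene (assumed data).
    empty = [10.0] * 8
    # An object is left at pixels 3 and 4 and stays there.
    with_bag = empty[:3] + [200.0] * 2 + empty[5:]

    long_bg, short_bg = list(empty), list(empty)
    for _ in range(20):  # the object remains in place for 20 frames
        short_bg = update_background(short_bg, with_bag, rate=0.5)    # adapts quickly
        long_bg = update_background(long_bg, with_bag, rate=0.005)    # adapts slowly
    return static_mask(long_bg, short_bg)


if __name__ == "__main__":
    print(run_demo())
```

A moving person would change both models only briefly and would not trigger the mask for long, whereas a stationary object keeps the two models apart until the slow one catches up. Handling several layers of static motion, as the abstract describes, would need more than two models.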
Abstract:
Stationary processes are random variables whose value is a signal and whose distribution is invariant to translation in the domain of the signal. They are intimately connected to convolution, and therefore to the Fourier transform, since the covariance matrix of a stationary process is a Toeplitz matrix, and Toeplitz matrices are the expression of convolution as a linear operator. This thesis utilises this connection in the study of i) efficient training algorithms for object detection and ii) trajectory-based non-rigid structure-from-motion.
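The link between translation invariance, convolution, and the Fourier transform can be checked numerically in the periodic case, where the Toeplitz matrix becomes circulant and the discrete Fourier transform diagonalises it exactly. The sketch below is only an illustration under that assumption; the function names and the example vectors are made up for this note and do not come from the thesis.

```python
import cmath


def circulant(c):
    """Circulant (periodic Toeplitz) matrix with C[i][j] = c[(i - j) mod n]."""
    n = len(c)
    return [[c[(i - j) % n] for j in range(n)] for i in range(n)]


def matvec(m, x):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse divides by n)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def circular_convolve_fft(c, x):
    """Circular convolution via the convolution theorem: multiply spectra, invert."""
    spectrum = [a * b for a, b in zip(dft(c), dft(x))]
    return [v.real for v in dft(spectrum, inverse=True)]


if __name__ == "__main__":
    c = [2.0, 1.0, 0.0, 1.0]   # symmetric, as a covariance sequence would be
    x = [1.0, 2.0, 3.0, 4.0]
    print(matvec(circulant(c), x))
    print(circular_convolve_fft(c, x))
```

Both calls compute the same vector up to floating-point rounding: applying the matrix is a circular convolution, and the convolution becomes an element-wise product in the Fourier domain. A production implementation would use a fast Fourier transform instead of the naive transform shown here.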
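Abstract:
This paper combines shape-constraint equations with a general active contour model, merging the target shape and the active contour model into a unified energy functional, and proposes a shape-preserving active contour model in which the curve retains a specific class of shape throughout its evolution. The model controls the shape of the evolving curve through the zero level set of a parameterized level set function; it not only makes segmentation directly yield the target, but also provides a quantitative description of that target. Based on the shape-preserving active contour model, a unified energy functional for detecting elliptical objects is established, the corresponding Euler-Lagrange ordinary differential equations are derived, and elliptical object detection is implemented with the level set method. The model can be applied to optic disc segmentation in fundus images, iris detection, and camera calibration. Experimental results show that the model not only accurately detects elliptical objects in a given image, but is also highly robust to noise, deformation, and occlusion.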
Abstract:
Object detection is challenging when the object class exhibits large within-class variations. In this work, we show that foreground-background classification (detection) and within-class classification of the foreground class (pose estimation) can be jointly learned in a multiplicative form of two kernel functions. One kernel measures similarity for foreground-background classification. The other kernel accounts for latent factors that control within-class variation and implicitly enables feature sharing among foreground training samples. Detector training can be accomplished via standard SVM learning. The resulting detectors are tuned to specific variations in the foreground class. They also serve to evaluate hypotheses of the foreground state. When the foreground parameters are provided in training, the detectors can also produce parameter estimates. When the foreground object masks are provided in training, the detectors can also produce object segmentation. The advantages of our method over past methods are demonstrated on data sets of human hands and vehicles.
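A multiplicative kernel of the kind described above can be sketched as the product of an appearance kernel and a within-class kernel. The choice of Gaussian (RBF) kernels, the gamma values, and the (features, pose) sample layout below are assumptions made for illustration; the abstract does not specify them. The product of two valid kernels is itself a valid kernel, which is why such a combination could be passed to a standard SVM solver as a precomputed kernel.

```python
import math


def rbf(u, v, gamma):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def multiplicative_kernel(sample_a, sample_b, gamma_x=0.5, gamma_theta=2.0):
    """Product of an appearance kernel and a within-class (e.g. pose) kernel.

    Each sample is an assumed (features, pose_parameters) pair.
    """
    (x_a, theta_a), (x_b, theta_b) = sample_a, sample_b
    return rbf(x_a, x_b, gamma_x) * rbf(theta_a, theta_b, gamma_theta)


if __name__ == "__main__":
    hand_open = ([0.2, 0.9], [0.0])       # hypothetical features and pose angle
    hand_rotated = ([0.2, 0.9], [1.0])    # same appearance, different pose
    print(multiplicative_kernel(hand_open, hand_open))
    print(multiplicative_kernel(hand_open, hand_rotated))
```

With identical appearance features, the similarity still drops as the pose parameters move apart, so training samples mainly influence detectors tuned to nearby poses.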
Abstract:
Keypoints (junctions) provide important information for focus-of-attention (FoA) and object categorization/recognition. In this paper we analyze the multi-scale keypoint representation, obtained by applying a linear and quasi-continuous scaling to an optimized model of cortical end-stopped cells, in order to study its importance and possibilities for developing a visual, cortical architecture. We show that keypoints, especially those which are stable over larger scale intervals, can provide a hierarchically structured saliency map for FoA and object recognition. In addition, the application of non-classical receptive field inhibition to keypoint detection makes it possible to distinguish contour keypoints from texture (surface) keypoints.
Abstract:
Hypercolumns in area V1 contain frequency- and orientation-selective simple and complex cells for line (bar) and edge coding, plus end-stopped cells for keypoint (vertex) detection. A single-scale (single-frequency) mathematical model of single and double end-stopped cells on the basis of Gabor filter responses was developed by Heitger et al. (1992 Vision Research 32 963-981). We developed an improved model by stabilising keypoint detection over neighbouring micro-scales.
Abstract:
Classical computer vision methods can only weakly emulate some of the multi-level parallelism in signal processing and information sharing that takes place in different parts of the primate visual system and enables it to accomplish many diverse functions of visual perception. One of the main functions of primate vision is to detect and recognise objects in natural scenes despite all the linear and non-linear variations of the objects and their environment. The superior performance of the primate visual system compared to what machine vision systems have been able to achieve to date motivates scientists and researchers to further explore this area in pursuit of more efficient vision systems inspired by natural models. In this paper, building blocks for an efficient hierarchical object recognition model are proposed. Incorporating attention-based processing would lead to a system that processes the visual data in a non-linear way, focusing only on the regions of interest and hence reducing processing time enough to achieve real-time performance. Further, it is suggested to modify the visual cortex model for recognizing objects by adding non-linearities in the ventral path, consistent with earlier discoveries reported by researchers in the neurophysiology of vision.
Abstract:
A technique is presented for locating and tracking objects in cluttered environments. Agents are randomly distributed across the image, and subsequently grouped around targets. Each agent uses a weightless neural network and a histogram intersection technique to score its location. The system has been used to locate and track a head in 320x240 resolution video at up to 15fps.
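Histogram intersection, one of the two scoring ingredients named above, is simple enough to sketch directly. The version below assumes non-empty histograms with equal bin counts and normalizes them first, so the score falls between 0 and 1; the function names and example histograms are illustrative, and the weightless neural network part of the technique is not shown.

```python
def normalize(hist):
    """Scale a histogram so its bins sum to 1 (assumes a non-zero total)."""
    total = float(sum(hist))
    return [h / total for h in hist]


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima of the two normalized histograms."""
    return sum(min(a, b) for a, b in zip(normalize(h1), normalize(h2)))


if __name__ == "__main__":
    target = [8, 4, 2, 2]       # hypothetical colour histogram of the tracked head
    candidate = [4, 2, 1, 1]    # same distribution at half the pixel count
    background = [0, 0, 10, 6]  # mostly different colours
    print(histogram_intersection(target, candidate))
    print(histogram_intersection(target, background))
```

An agent could use such a score to rate how closely the image patch at its location matches the target, with higher values pulling more agents toward that region.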
Abstract:
[EN] The human face provides useful information during interaction; therefore, any system integrating Vision-Based Human Computer Interaction requires fast and reliable face and facial feature detection. Different approaches have focused on this ability, but only open source implementations have been extensively used by researchers. A good example is the Viola–Jones object detection framework, which has been used frequently, particularly in the context of facial processing.
Abstract:
Automatic visual object counting and video surveillance have important applications for home and business environments, such as security and management of access points. However, in order to obtain satisfactory performance, these technologies need professional and expensive hardware, complex installations and setups, and the supervision of qualified workers. In this paper, an efficient visual detection and tracking framework is proposed for the tasks of object counting and surveillance, which meets the requirements of consumer electronics: off-the-shelf equipment, easy installation and configuration, and unsupervised working conditions. This is accomplished by a novel Bayesian tracking model that can manage multimodal distributions without explicitly computing the association between tracked objects and detections. In addition, it is robust to erroneous, distorted and missing detections. The proposed algorithm is compared with a recent method that also targets consumer electronics, and it shows superior performance.
Abstract:
Federal Highway Administration, Office of Safety and Traffic Operations Research and Development, McLean, Va.