885 results for SIFT, Computer Vision, Python, Object Recognition, Feature Detection, Descriptor Computation


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Colour-based particle filters have been used exhaustively in the literature, giving rise to multiple applications. However, tracking coloured objects through time has an important drawback, since the way in which the camera perceives the colour of the object can change. Simple updates are often used to address this problem, which imply a risk of distorting the model and losing the target. In this paper, a joint image characteristic-space tracking is proposed, which updates the model simultaneously with the object location. In order to avoid the curse of dimensionality, a Rao-Blackwellised particle filter has been used. Using this technique, the hypotheses are evaluated depending on the difference between the model and the current target appearance during the updating stage. Convincing results have been obtained in sequences under both sudden and gradual illumination condition changes. Crown Copyright (C) 2010. Published by Elsevier B.V. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

For the first time, in this paper we present results showing the effect of speaker head pose angle on automatic lip-reading performance over a wide range of closely spaced angles. We analyse the effect head pose has upon the features themselves and show that by selecting coefficients with minimum variance w.r.t. pose angle, recognition performance can be improved when train-test pose angles differ. Experiments are conducted using the initial phase of a unique multi-view audio-visual database designed specifically for research and development of pose-invariant lip-reading systems. We first show that it is the higher-order horizontal spatial frequency components that become most detrimental as the pose deviates. Second, we assess the performance of different feature selection masks across a range of pose angles, including a new mask based on Minimum Cross-Pose Variance coefficients. We report a relative improvement of 50% in Word Error Rate when using our selection mask over a common energy-based selection during profile view lip-reading.
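The idea of selecting coefficients with minimum variance across pose angles can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not say which features are used or how the variance is normalised, so the input layout (one row of averaged coefficients per pose angle) and the function name `min_cross_pose_variance_mask` are hypothetical.

```python
import numpy as np

def min_cross_pose_variance_mask(features_by_pose, n_keep):
    """Return a boolean mask over coefficients.

    features_by_pose: array of shape (n_poses, n_coeffs); each row is assumed
    to hold the average feature vector observed at one pose angle.
    The n_keep coefficients whose values vary least across poses are kept.
    """
    variance = np.var(features_by_pose, axis=0)           # variance over poses
    keep = np.argsort(variance, kind="stable")[:n_keep]   # smallest variance first
    mask = np.zeros(features_by_pose.shape[1], dtype=bool)
    mask[keep] = True
    return mask

# Toy data: 3 pose angles, 4 coefficients (values are made up).
toy = np.array([[1.0, 5.0, 2.0, 0.0],
                [1.1, 9.0, 2.0, 3.0],
                [0.9, 1.0, 2.0, 6.0]])
mask = min_cross_pose_variance_mask(toy, n_keep=2)
print(mask)  # coefficients 0 and 2 are the most pose-stable in this toy data
```

An energy-based mask would instead rank coefficients by magnitude, which is the baseline the abstract compares against.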

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Distinct neural populations carry signals from short-wave (S) cones. We used individual differences to test whether two types of pathways, those that receive excitatory input (S+) and those that receive inhibitory input (S-), contribute independently to psychophysical performance. We also conducted a genome-wide association study (GWAS) to look for genetic correlates of the individual differences. Our psychophysical test was based on the Cambridge Color Test, but detection thresholds were measured separately for S-cone spatial increments and decrements. Our participants were 1060 healthy adults aged 16-40. Test-retest reliabilities for thresholds were good (ρ=0.64 for S-cone increments, 0.67 for decrements and 0.73 for the average of the two). "Regression scores," isolating variability unique to incremental or decremental sensitivity, were also reliable (ρ=0.53 for increments and ρ=0.51 for decrements). The correlation between incremental and decremental thresholds was ρ=0.65. No genetic markers reached genome-wide significance (p-7). We identified 18 "suggestive" loci (p-5). The significant test-retest reliabilities show stable individual differences in S-cone sensitivity in a normal adult population. Though a portion of the variance in sensitivity is shared between incremental and decremental sensitivity, over 26% of the variance is stable across individuals, but unique to increments or decrements, suggesting distinct neural substrates. Some of the variability in sensitivity is likely to be genetic. We note that four of the suggestive associations found in the GWAS are with genes that are involved in glucose metabolism or have been associated with diabetes.
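The "regression scores" mentioned above can be read as residuals: the part of one threshold that a linear regression on the other threshold does not explain. The sketch below illustrates that reading on synthetic data. The study's exact procedure is not given in the abstract, so the ordinary least-squares fit and all variable names are assumptions.

```python
import numpy as np

def regression_scores(target, predictor):
    """Residuals of target after a least-squares linear fit on predictor."""
    slope, intercept = np.polyfit(predictor, target, deg=1)
    return target - (slope * predictor + intercept)

# Synthetic thresholds: a shared component plus a unique component each.
rng = np.random.default_rng(0)
shared = rng.normal(size=500)
increments = shared + 0.5 * rng.normal(size=500)
decrements = shared + 0.5 * rng.normal(size=500)

# Variability in incremental thresholds not explained by decremental ones.
inc_unique = regression_scores(increments, decrements)

# By construction, the residuals are uncorrelated with the predictor.
residual_corr = np.corrcoef(inc_unique, decrements)[0, 1]
```

If such residual scores are themselves reliable across test sessions, as the abstract reports, that points to stable variability specific to increments or decrements.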

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Recent work suggests that the human ear varies significantly between different subjects and can be used for identification. In principle, therefore, using ears in addition to the face within a recognition system could improve accuracy and robustness, particularly for non-frontal views. The paper describes work that investigates this hypothesis using an approach based on the construction of a 3D morphable model of the head and ear. One issue with creating a model that includes the ear is that existing training datasets contain noise and partial occlusion. Rather than exclude these regions manually, a classifier has been developed which automates this process. When combined with a robust registration algorithm the resulting system enables full head morphable models to be constructed efficiently using less constrained datasets. The algorithm has been evaluated using registration consistency, model coverage and minimalism metrics, which together demonstrate the accuracy of the approach. To make it easier to build on this work, the source code has been made available online.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we propose a novel recurrent neural network architecture for video-based person re-identification. Given the video sequence of a person, features are extracted from each frame using a convolutional neural network that incorporates a recurrent final layer, which allows information to flow between time-steps. The features from all time-steps are then combined using temporal pooling to give an overall appearance feature for the complete sequence. The convolutional network, recurrent layer, and temporal pooling layer are jointly trained to act as a feature extractor for video-based re-identification using a Siamese network architecture. Our approach makes use of colour and optical flow information in order to capture appearance and motion information, which is useful for video re-identification. Experiments are conducted on the iLIDS-VID and PRID-2011 datasets to show that this approach outperforms existing methods of video-based re-identification.

https://github.com/niallmcl/Recurrent-Convolutional-Video-ReID
Project Source Code
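The temporal pooling and sequence comparison steps described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation (see the repository linked above for that). Mean pooling, Euclidean distance, and the random stand-in features are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def temporal_mean_pool(frame_features):
    """Collapse per-frame features of shape (T, D) into one (D,) vector."""
    return frame_features.mean(axis=0)

def sequence_distance(seq_a, seq_b):
    """Euclidean distance between pooled sequence features.

    Sequences may have different lengths, because pooling removes the time axis.
    """
    return float(np.linalg.norm(temporal_mean_pool(seq_a) - temporal_mean_pool(seq_b)))

# Random stand-ins for per-frame outputs of the CNN + recurrent layer.
rng = np.random.default_rng(0)
person_1_cam_a = rng.normal(loc=0.0, size=(16, 8))
person_1_cam_b = rng.normal(loc=0.0, size=(20, 8))
person_2_cam_b = rng.normal(loc=3.0, size=(12, 8))

same = sequence_distance(person_1_cam_a, person_1_cam_b)
diff = sequence_distance(person_1_cam_a, person_2_cam_b)
```

A Siamese training objective would push `same` down and `diff` up. Here the gap comes only from how the toy data were generated.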

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Soccer is a sport in which everyone involved strives for excellence. Not only do the players need to show their skills on the pitch, but the coach and the remaining staff also need their own tools so that they can perform at higher levels. Footdata is a project to build a new web application product for soccer (football), which integrates two fundamental components of this sport's world: the social and the professional. While the former is an enhanced social platform for soccer professionals and fans, the latter can be considered a Soccer Resource Planning system, featuring the acquisition and processing of information to meet all soccer management needs. In this paper we focus only on a specific module of the professional component. We describe the section of the web application that allows the movements and tactics of the players to be analysed using images taken directly from the pitch or from videos, and we show that it is possible to draw player and ball movements in a web application and detect whether those movements occur during a game. © 2014 Springer International Publishing.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The application of computer vision based quality control has been slowly but steadily gaining importance, mainly due to its speed in achieving results and also due to the non-destructive nature of the testing. Besides, in food applications it does not contribute to contamination. However, computer vision applications in quality control need appropriate software for image analysis. Even though computer vision based quality control has several advantages, its application has limitations as to the type of work to be done, particularly in the food industries. Selective applications, however, can be highly advantageous and very accurate. Computer vision based image analysis could be used in morphometric measurements of fish with the same accuracy as the existing conventional method. The method is non-destructive and non-contaminating, thus providing an advantage in seafood processing. The images could be stored in archives and retrieved at any time to carry out morphometric studies for biologists. Computer vision and subsequent image analysis could be used in measurements of various food products to assess uniformity of size. One product, namely cutlet, and product ingredients, namely coating materials such as bread crumbs and rava, were selected for the study. Computer vision based image analysis was used in the measurements of length, width and area of cutlets. The width of coating materials such as bread crumbs was also measured. Computer imaging and subsequent image analysis can be used very effectively in quality evaluations of product ingredients in food processing. Measurement of the width of coating materials could establish uniformity of particles or the lack of it. The application of image analysis in bacteriological work was also done.
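Measuring length, width and area from an image can be sketched as follows. The sketch assumes the object has already been segmented into a binary mask and that a calibration scale (`mm_per_pixel`, a hypothetical parameter) is known. It uses an axis-aligned bounding box, which is a simplification and not necessarily the method or software used in the study.

```python
import numpy as np

def measure_object(mask, mm_per_pixel=1.0):
    """Length, width and area of the foreground region of a binary mask.

    Length and width are the longer and shorter sides of the axis-aligned
    bounding box; area is the foreground pixel count scaled to mm^2.
    """
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask contains no object pixels")
    height_px = int(rows.max() - rows.min() + 1)
    width_px = int(cols.max() - cols.min() + 1)
    return {
        "length": max(height_px, width_px) * mm_per_pixel,
        "width": min(height_px, width_px) * mm_per_pixel,
        "area": rows.size * mm_per_pixel ** 2,
    }

# Toy mask: a 4 x 8 pixel rectangle standing in for a segmented cutlet.
mask = np.zeros((10, 12), dtype=bool)
mask[2:6, 3:11] = True
measurements = measure_object(mask, mm_per_pixel=0.5)
print(measurements)  # {'length': 4.0, 'width': 2.0, 'area': 8.0}
```

For objects that are rotated in the image, a minimum-area bounding rectangle or a principal-axis fit would give more faithful length and width values.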

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The report describes a recognition system called GROPER, which performs grouping by using distance and relative orientation constraints that estimate the likelihood of different edges in an image coming from the same object. The thesis presents both a theoretical analysis of the grouping problem and a practical implementation of a grouping system. GROPER also uses an indexing module to allow it to make use of knowledge of different objects, any of which might appear in an image. We test GROPER by comparing it to a similar recognition system that does not use grouping.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Two formulations of model-based object recognition are described. MAP Model Matching evaluates joint hypotheses of match and pose, while Posterior Marginal Pose Estimation evaluates the pose only. Local search in pose space is carried out with the Expectation-Maximization (EM) algorithm. Recognition experiments are described where the EM algorithm is used to refine and evaluate pose hypotheses in 2D and 3D. Initial hypotheses for the 2D experiments were generated by a simple indexing method: Angle Pair Indexing. The Linear Combination of Views method of Ullman and Basri is employed as the projection model in the 3D experiments.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A key problem in object recognition is selection, namely, the problem of identifying regions in an image within which to start the recognition process, ideally by isolating regions that are likely to come from a single object. Such a selection mechanism has been found to be crucial in reducing the combinatorial search involved in the matching stage of object recognition. Even though selection is of help in recognition, it has largely remained unsolved because of the difficulty in isolating regions belonging to objects under complex imaging conditions involving occlusions, changing illumination, and object appearances. This thesis presents a novel approach to the selection problem by proposing a computational model of visual attentional selection as a paradigm for selection in recognition. In particular, it proposes two modes of attentional selection, namely, attracted and pay attention modes as being appropriate for data and model-driven selection in recognition. An implementation of this model has led to new ways of extracting color, texture and line group information in images, and their subsequent use in isolating areas of the scene likely to contain the model object. Among the specific results in this thesis are: a method of specifying color by perceptual color categories for fast color region segmentation and color-based localization of objects, and a result showing that the recognition of texture patterns on model objects is possible under changes in orientation and occlusions without detailed segmentation. The thesis also presents an evaluation of the proposed model by integrating with a 3D from 2D object recognition system and recording the improvement in performance. These results indicate that attentional selection can significantly overcome the computational bottleneck in object recognition, both due to a reduction in the number of features, and due to a reduction in the number of matches during recognition using the information derived during selection. 
Finally, these studies have revealed a surprising use of selection, namely, in the partial solution of the pose of a 3D object.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This thesis presents a statistical framework for object recognition. The framework is motivated by the pictorial structure models introduced by Fischler and Elschlager nearly 30 years ago. The basic idea is to model an object by a collection of parts arranged in a deformable configuration. The appearance of each part is modeled separately, and the deformable configuration is represented by spring-like connections between pairs of parts. These models allow for qualitative descriptions of visual appearance, and are suitable for generic recognition problems. The problem of detecting an object in an image and the problem of learning an object model using training examples are naturally formulated under a statistical approach. We present efficient algorithms to solve these problems in our framework. We demonstrate our techniques by training models to represent faces and human bodies. The models are then used to locate the corresponding objects in novel images.
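The parts-and-springs idea can be made concrete with a toy matcher that scores a configuration as the sum of per-part appearance costs plus a deformation cost for each connected pair of parts. The sketch below minimises that score by brute force over a handful of candidate locations. It is not one of the efficient algorithms the thesis presents, and the quadratic spring cost, the `stiffness` parameter, and all names are assumptions for illustration.

```python
import itertools
import numpy as np

def best_configuration(appearance_costs, edges, rest_offsets, stiffness=1.0):
    """Find the lowest-cost placement of parts by exhaustive search.

    appearance_costs: {part: {location (x, y): cost of placing the part there}}
    edges: list of connected part pairs (i, j)
    rest_offsets: {(i, j): ideal offset of part j relative to part i}
    """
    parts = list(appearance_costs)
    best, best_cost = None, float("inf")
    for locations in itertools.product(*(appearance_costs[p] for p in parts)):
        config = dict(zip(parts, locations))
        cost = sum(appearance_costs[p][config[p]] for p in parts)
        for i, j in edges:
            # Spring-like penalty: squared deviation from the ideal offset.
            stretch = (np.asarray(config[j]) - np.asarray(config[i])
                       - np.asarray(rest_offsets[(i, j)]))
            cost += stiffness * float(np.dot(stretch, stretch))
        if cost < best_cost:
            best, best_cost = config, cost
    return best, best_cost

# Toy model: a head with a torso ideally two pixels below it.
appearance = {
    "head": {(0, 0): 0.2, (5, 5): 0.1},
    "torso": {(0, 2): 0.3, (9, 9): 0.1},
}
config, cost = best_configuration(
    appearance, edges=[("head", "torso")], rest_offsets={("head", "torso"): (0, 2)}
)
print(config, cost)
```

In the toy example the slightly worse appearance matches win because they respect the spring, which is the trade-off the model is designed to capture. Exhaustive search grows exponentially with the number of parts, which is why efficient algorithms matter here.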