851 resultados para computer vision face recognition detection voice recognition sistemi biometrici iOS


Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present a video-based system which interactively captures the geometry of a 3D object in the form of a point cloud, then recognizes and registers known objects in this point cloud in a matter of seconds (fig. 1). In order to achieve interactive speed, we exploit both efficient inference algorithms and parallel computation, often on a GPU. The system can be broken down into two distinct phases: geometry capture, and object inference. We now discuss these in further detail. © 2011 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

[EN]Enabling natural human-robot interaction using computer vision based applications requires fast and accurate hand detection. However, previous works in this field assume different constraints, like a limitation in the number of detected gestures, because hands are highly complex objects difficult to locate. This paper presents an approach which integrates temporal coherence cues and hand detection based on wrists using a cascade classifier. With this approach, we introduce three main contributions: (1) a transparent initialization mechanism without user participation for segmenting hands independently of their gesture, (2) a larger number of detected gestures as well as a faster training phase than previous cascade classifier based methods and (3) near real-time performance for hand pose detection in video streams.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Visual recognition is a fundamental research topic in computer vision. This dissertation explores datasets, features, learning, and models used for visual recognition. In order to train visual models and evaluate different recognition algorithms, this dissertation develops an approach to collect object image datasets on web pages using an analysis of text around the image and of image appearance. This method exploits established online knowledge resources (Wikipedia pages for text; Flickr and Caltech data sets for images). The resources provide rich text and object appearance information. This dissertation describes results on two datasets. The first is Berg’s collection of 10 animal categories; on this dataset, we significantly outperform previous approaches. On an additional set of 5 categories, experimental results show the effectiveness of the method. Images are represented as features for visual recognition. This dissertation introduces a text-based image feature and demonstrates that it consistently improves performance on hard object classification problems. The feature is built using an auxiliary dataset of images annotated with tags, downloaded from the Internet. Image tags are noisy. The method obtains the text features of an unannotated image from the tags of its k-nearest neighbors in this auxiliary collection. A visual classifier presented with an object viewed under novel circumstances (say, a new viewing direction) must rely on its visual examples. This text feature may not change, because the auxiliary dataset likely contains a similar picture. While the tags associated with images are noisy, they are more stable when appearance changes. The performance of this feature is tested using PASCAL VOC 2006 and 2007 datasets. This feature performs well; it consistently improves the performance of visual object classifiers, and is particularly effective when the training dataset is small. With more and more collected training data, computational cost becomes a bottleneck, especially when training sophisticated classifiers such as kernelized SVM. This dissertation proposes a fast training algorithm called Stochastic Intersection Kernel Machine (SIKMA). This proposed training method will be useful for many vision problems, as it can produce a kernel classifier that is more accurate than a linear classifier, and can be trained on tens of thousands of examples in two minutes. It processes training examples one by one in a sequence, so memory cost is no longer the bottleneck to process large scale datasets. This dissertation applies this approach to train classifiers of Flickr groups with many group training examples. The resulting Flickr group prediction scores can be used to measure image similarity between two images. Experimental results on the Corel dataset and a PASCAL VOC dataset show the learned Flickr features perform better on image matching, retrieval, and classification than conventional visual features. Visual models are usually trained to best separate positive and negative training examples. However, when recognizing a large number of object categories, there may not be enough training examples for most objects, due to the intrinsic long-tailed distribution of objects in the real world. This dissertation proposes an approach to use comparative object similarity. The key insight is that, given a set of object categories which are similar and a set of categories which are dissimilar, a good object model should respond more strongly to examples from similar categories than to examples from dissimilar categories. This dissertation develops a regularized kernel machine algorithm to use this category dependent similarity regularization. Experiments on hundreds of categories show that our method can make significant improvement for categories with few or even no positive examples.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Hand detection on images has important applications on person activities recognition. This thesis focuses on PASCAL Visual Object Classes (VOC) system for hand detection. VOC has become a popular system for object detection, based on twenty common objects, and has been released with a successful deformable parts model in VOC2007. A hand detection on an image is made when the system gets a bounding box which overlaps with at least 50% of any ground truth bounding box for a hand on the image. The initial average precision of this detector is around 0.215 compared with a state-of-art of 0.104; however, color and frequency features for detected bounding boxes contain important information for re-scoring, and the average precision can be improved to 0.218 with these features. Results show that these features help on getting higher precision for low recall, even though the average precision is similar.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

L'obiettivo principale di questo lavoro di tesi è quello di migliorare gli algoritmi di morphing generation in termini di qualità visiva e di potenzialità di attacco dei sistemi automatici di riconoscimento facciale.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The usage of Optical Character Recognition’s (OCR, systems is a widely spread technology into the world of Computer Vision and Machine Learning. It is a topic that interest many field, for example the automotive, where becomes a specialized task known as License Plate Recognition, useful for many application from the automation of toll road to intelligent payments. However, OCR systems need to be very accurate and generalizable in order to be able to extract the text of license plates under high variable conditions, from the type of camera used for acquisition to light changes. Such variables compromise the quality of digitalized real scenes causing the presence of noise and degradation of various type, which can be minimized with the application of modern approaches for image iper resolution and noise reduction. Oneclass of them is known as Generative Neural Networks, which are very strong ally for the solution of this popular problem.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In order to separate the effects of experience from other characteristics of word frequency (e.g., orthographic distinctiveness), computer science and psychology students rated their experience with computer science technical items and nontechnical items from a wide range of word frequencies prior to being tested for recognition memory of the rated items. For nontechnical items, there was a curvilinear relationship between recognition accuracy and word frequency for both groups of students. The usual superiority of low-frequency words was demonstrated and high-frequency words were recognized least well. For technical items, a similar curvilinear relationship was evident for the psychology students, but for the computer science students, recognition accuracy was inversely related to word frequency. The ratings data showed that subjective experience rather than background word frequency was the better predictor of recognition accuracy.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The influence of temporal association on the representation and recognition of objects was investigated. Observers were shown sequences of novel faces in which the identity of the face changed as the head rotated. As a result, observers showed a tendency to treat the views as if they were of the same person. Additional experiments revealed that this was only true if the training sequences depicted head rotations rather than jumbled views: in other words, the sequence had to be spatially as well as temporally smooth. Results suggest that we are continuously associating views of objects to support later recognition, and that we do so not only on the basis of the physical similarity, but also the correlated appearance in time of the objects.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Electrocardiographic (ECG) signals are emerging as a recent trend in the field of biometrics. In this paper, we propose a novel ECG biometric system that combines clustering and classification methodologies. Our approach is based on dominant-set clustering, and provides a framework for outlier removal and template selection. It enhances the typical workflows, by making them better suited to new ECG acquisition paradigms that use fingers or hand palms, which lead to signals with lower signal to noise ratio, and more prone to noise artifacts. Preliminary results show the potential of the approach, helping to further validate the highly usable setups and ECG signals as a complementary biometric modality.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Hand gestures are a powerful way for human communication, with lots of potential applications in the area of human computer interaction. Vision-based hand gesture recognition techniques have many proven advantages compared with traditional devices, giving users a simpler and more natural way to communicate with electronic devices. This work proposes a generic system architecture based in computer vision and machine learning, able to be used with any interface for humancomputer interaction. The proposed solution is mainly composed of three modules: a pre-processing and hand segmentation module, a static gesture interface module and a dynamic gesture interface module. The experiments showed that the core of vision-based interaction systems can be the same for all applications and thus facilitate the implementation. In order to test the proposed solutions, three prototypes were implemented. For hand posture recognition, a SVM model was trained and used, able to achieve a final accuracy of 99.4%. For dynamic gestures, an HMM model was trained for each gesture that the system could recognize with a final average accuracy of 93.7%. The proposed solution as the advantage of being generic enough with the trained models able to work in real-time, allowing its application in a wide range of human-machine applications.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

En aquest treball s'explora el camp de la identificació facial de subjectes utilitzant tècniques d'anàlisi multimodal. Això és utilitzant imatges RGB i imatges de profunditat (3D) amb l'objecte de validar les diverses tècniques emprades en el reconeixement facial i aprofundir en sistemes que incorporen informació tridimensional als algorismes de detecció i identificació facial.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Evaluating other individuals with respect to personality characteristics plays a crucial role in human relations and it is the focus of attention for research in diverse fields such as psychology and interactive computer systems. In psychology, face perception has been recognized as a key component of this evaluation system. Multiple studies suggest that observers use face information to infer personality characteristics. Interactive computer systems are trying to take advantage of these findings and apply them to increase the natural aspect of interaction and to improve the performance of interactive computer systems. Here, we experimentally test whether the automatic prediction of facial trait judgments (e.g. dominance) can be made by using the full appearance information of the face and whether a reduced representation of its structure is sufficient. We evaluate two separate approaches: a holistic representation model using the facial appearance information and a structural model constructed from the relations among facial salient points. State of the art machine learning methods are applied to a) derive a facial trait judgment model from training data and b) predict a facial trait value for any face. Furthermore, we address the issue of whether there are specific structural relations among facial points that predict perception of facial traits. Experimental results over a set of labeled data (9 different trait evaluations) and classification rules (4 rules) suggest that a) prediction of perception of facial traits is learnable by both holistic and structural approaches; b) the most reliable prediction of facial trait judgments is obtained by certain type of holistic descriptions of the face appearance; and c) for some traits such as attractiveness and extroversion, there are relationships between specific structural features and social perceptions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Disease-causing variants of a large number of genes trigger inherited retinal degeneration leading to photoreceptor loss. Because cones are essential for daylight and central vision such as reading, mobility, and face recognition, this review focuses on a variety of animal models for cone diseases. The pertinence of using these models to reveal genotype/phenotype correlations and to evaluate new therapeutic strategies is discussed. Interestingly, several large animal models recapitulate human diseases and can serve as a strong base from which to study the biology of disease and to assess the scale-up of new therapies. Examples of innovative approaches will be presented such as lentiviral-based transgenesis in pigs and adeno-associated virus (AAV)-gene transfer into the monkey eye to investigate the neural circuitry plasticity of the visual system. The models reported herein permit the exploration of common mechanisms that exist between different species and the identification and highlighting of pathways that may be specific to primates, including humans.