Biblioteca Digital

800 resultados para Face recognition from video

Head Tracking via Robust Registration in Texture Map Images

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A novel method for 3D head tracking in the presence of large head rotations and facial expression changes is described. Tracking is formulated in terms of color image registration in the texture map of a 3D surface model. Model appearance is recursively updated via image mosaicking in the texture map as the head orientation varies. The resulting dynamic texture map provides a stabilized view of the face that can be used as input to many existing 2D techniques for face recognition, facial expressions analysis, lip reading, and eye tracking. Parameters are estimated via a robust minimization procedure; this provides robustness to occlusions, wrinkles, shadows, and specular highlights. The system was tested on a variety of sequences taken with low quality, uncalibrated video cameras. Experimental results are reported.

Indexing Methods for Efficient Multiclass Recognition

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many real world image analysis problems, such as face recognition and hand pose estimation, involve recognizing a large number of classes of objects or shapes. Large margin methods, such as AdaBoost and Support Vector Machines (SVMs), often provide competitive accuracy rates, but at the cost of evaluating a large number of binary classifiers, thus making it difficult to apply such methods when thousands or millions of classes need to be recognized. This thesis proposes a filter-and-refine framework, whereby, given a test pattern, a small number of candidate classes can be identified efficiently at the filter step, and computationally expensive large margin classifiers are used to evaluate these candidates at the refine step. Two different filtering methods are proposed, ClassMap and OVA-VS (One-vs.-All classification using Vector Search). ClassMap is an embedding-based method, works for both boosted classifiers and SVMs, and tends to map the patterns and their associated classes close to each other in a vector space. OVA-VS maps OVA classifiers and test patterns to vectors based on the weights and outputs of weak classifiers of the boosting scheme. At runtime, finding the strongest-responding OVA classifier becomes a classical vector search problem, where well-known methods can be used to gain efficiency. In our experiments, the proposed methods achieve significant speed-ups, in some cases up to two orders of magnitude, compared to exhaustive evaluation of all OVA classifiers. This was achieved in hand pose recognition and face recognition systems where the number of classes ranges from 535 to 48,600.

Common-sense reasoning for human action recognition

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a novel method that leverages reasoning capabilities in a computer vision system dedicated to human action recognition. The proposed methodology is decomposed into two stages. First, a machine learning based algorithm - known as bag of words - gives a first estimate of action classification from video sequences, by performing an image feature analysis. Those results are afterward passed to a common-sense reasoning system, which analyses, selects and corrects the initial estimation yielded by the machine learning algorithm. This second stage resorts to the knowledge implicit in the rationality that motivates human behaviour. Experiments are performed in realistic conditions, where poor recognition rates by the machine learning techniques are significantly improved by the second stage in which common-sense knowledge and reasoning capabilities have been leveraged. This demonstrates the value of integrating common-sense capabilities into a computer vision pipeline. © 2012 Elsevier B.V. All rights reserved.

Face matching impairment in developmental prosopagnosia

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Developmental prosopagnosia (DP) is commonly referred to as ‘face blindness’, a term that implies a perceptual basis to the condition. However, DP presents as a deficit in face recognition and is diagnosed using memory-based tasks. Here, we test face identification ability in six people with DP, who are severely impaired on face memory tasks, using tasks that do not rely on memory. First, we compared DP to control participants on a standardised test of unfamiliar face matching using facial images taken on the same day and under standardised studio conditions (Glasgow Face Matching Test; GFMT). DP participants did not differ from normative accuracy scores on the GFMT. Second, we tested face matching performance on a test created using images that were sourced from the Internet and so vary substantially due to changes in viewing conditions and in a person’s appearance (Local Heroes Test; LHT). DP participants show significantly poorer matching accuracy on the LHT relative to control participants, for both unfamiliar and familiar face matching. Interestingly, this deficit is specific to ‘match’ trials, suggesting that people with DP may have particular difficulty in matching images of the same person that contain natural day-to-day variations in appearance. We discuss these results in the broader context of individual differences in face matching ability.

Cortical 3D face and object recognition using 2D projections

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Empirical studies concerning face recognition suggest that faces may be stored in memory by a few canonical representations. In cortical area V1 exist double-opponent colour blobs, also simple, complex and end-stopped cells which provide input for a multiscale line/edge representation, keypoints for dynamic feature routine, and saliency maps for Focus-of-Attention.

Face recogniton based on the optimal combination of neural networks, eigenfaces and least squares matching methods

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Report of a research project of the Fachhochschule Hannover, University of Applied Sciences and Arts, Department of Information Technologies. Automatic face recognition increases the security standards at public places and border checkpoints. The picture inside the identification documents could widely differ from the face, that is scanned under random lighting conditions and for unknown poses. The paper describes an optimal combination of three key algorithms of object recognition, that are able to perform in real time. The camera scan is processed by a recurrent neural network, by a Eigenfaces (PCA) method and by a least squares matching algorithm. Several examples demonstrate the achieved robustness and high recognition rate.

Example-Based Face Shape Recovery Using the Zenith Angle of the Surface Normal

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present a method for recovering facial shape using an image of a face and a reference model. The zenith angle of the surface normal is recovered directly from the intensities of the image. The azimuth angle of the reference model is then combined with the calculated zenith angle in order to get a new field of surface normals. After integration of the needle map, the recovered surface has the effect of mapped facial features over the reference model. Experiments demonstrate that for the lambertian case, surface recovery is achieved with high accuracy. For non-Lambertian cases, experiments suggest potential for face recognition applications.

A technology market transfer for the Zonadvanced video technology

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A Work Project, presented as part of the requirements for the Award of a Masters Degree in Management from the NOVA – School of Business and Economics

Electrophysiological investigations of the timing of face processing

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As important social stimuli, faces playa critical role in our lives. Much of our interaction with other people depends on our ability to recognize faces accurately. It has been proposed that face processing consists of different stages and interacts with other systems (Bruce & Young, 1986). At a perceptual level, the initial two stages, namely structural encoding and face recognition, are particularly relevant and are the focus of this dissertation. Event-related potentials (ERPs) are averaged EEG signals time-locked to a particular event (such as the presentation of a face). With their excellent temporal resolution, ERPs can provide important timing information about neural processes. Previous research has identified several ERP components that are especially related to face processing, including the N 170, the P2 and the N250. Their nature with respect to the stages of face processing is still unclear, and is examined in Studies 1 and 2. In Study 1, participants made gender decisions on a large set of female faces interspersed with a few male faces. The ERP responses to facial characteristics of the female faces indicated that the N 170 amplitude from each side of the head was affected by information from eye region and by facial layout: the right N 170 was affected by eye color and by face width, while the left N 170 was affected by eye size and by the relation between the sizes of the top and bottom parts of a face. In contrast, the P100 and the N250 components were largely unaffected by facial characteristics. These results thus provided direct evidence for the link between the N 170 and structural encoding of faces. In Study 2, focusing on the face recognition stage, we manipulated face identity strength by morphing individual faces to an "average" face. Participants performed a face identification task. The effect of face identity strength was found on the late P2 and the N250 components: as identity strength decreased from an individual face to the "average" face, the late P2 increased and the N250 decreased. In contrast, the P100, the N170 and the early P2 components were not affected by face identity strength. These results suggest that face recognition occurs after 200 ms, but not earlier. Finally, because faces are often associated with social information, we investigated in Study 3 how group membership might affect ERP responses to faces. After participants learned in- and out-group memberships of the face stimuli based on arbitrarily assigned nationality and university affiliation, we found that the N170 latency differentiated in-group and out-group faces, taking longer to process the latter. In comparison, without group memberships, there was no difference in N170 latency among the faces. This dissertation provides evidence that at a neural level, structural encoding of faces, indexed by the N170, occurs within 200 ms. Face recognition, indexed by the late P2 and the N250, occurs shortly afterwards between 200 and 300 ms. Social cognitive factors can also influence face processing. The effect is already evident as early as 130-200 ms at the structural encoding stage.

A Generic Deformable Model for Vehicle Recognition

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper reports the development of a highly parameterised 3-D model able to adopt the shapes of a wide variety of different classes of vehicles (cars, vans, buses, etc), and its subsequent specialisation to a generic car class which accounts for most commonly encountered types of car (includng saloon, hatchback and estate cars). An interactive tool has been developed to obtain sample data for vehicles from video images. A PCA description of the manually sampled data provides a deformable model in which a single instance is described as a 6 parameter vector. Both the pose and the structure of a car can be recovered by fitting the PCA model to an image. The recovered description is sufficiently accurate to discriminate between vehicle sub-classes.

Digital monitoring of broiler breeder behavior for assessment of thermal welfare

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The assessment of welfare issues has been a challenge for poultry producers, and lately welfare standards needs to be reached in order to agree with international market demand. This research proposes the use of continuous behavior monitoring in order to contribute for assessing welfare. A software was developed using the language Clarium. The software managed the recording of data as well as the data searching in the database Firebird. Both software and the observational methodology were tested in a trial conducted inside an environmental chamber, using three genetics of broiler breeders. Behavioral pattern was recorded and correlated to ambient thermal and aerial variation. Monitoring video cameras were placed on the roof facing the used for registering the bird's behavior. From video camera images were recorded during the total period when the ambient was bright, and for analyzing the video images a sample of 15min observation in the morning and 15 min in the afternoon was used, adding up to 30 min daily observation. A specific model so-called behavior was developed inside the software for counting specific behavior and its frequency of occurrence, as well as its duration. Electronic identification was recorded for 24h period. Behavioral video recording images was related to the data recorded using electronic identification.. Statistical analysis of data allowed to identify behavioral differences related to the change in thermal environment, and ultimately indicating thermal stress and departure from welfare conditions.

A video-based biometric authentication for e-learning web applications

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the last years there was an exponential growth in the offering of Web-enabled distance courses and in the number of enrolments in corporate and higher education using this modality. However, the lack of efficient mechanisms that assures user authentication in this sort of environment, in the system login as well as throughout his session, has been pointed out as a serious deficiency. Some studies have been led about possible biometric applications for web authentication. However, password based authentication still prevails. With the popularization of biometric enabled devices and resultant fall of prices for the collection of biometric traits, biometrics is reconsidered as a secure remote authentication form for web applications. In this work, the face recognition accuracy, captured on-line by a webcam in Internet environment, is investigated, simulating the natural interaction of a person in the context of a distance course environment. Partial results show that this technique can be successfully applied to confirm the presence of users throughout the course attendance in an educational distance course. An efficient client/server architecture is also proposed. © 2009 Springer Berlin Heidelberg.

Frontal sinus recognition for human identification

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many methods based on biometrics such as fingerprint, face, iris, and retina have been proposed for person identification. However, for deceased individuals, such biometric measurements are not available. In such cases, parts of the human skeleton can be used for identification, such as dental records, thorax, vertebrae, shoulder, and frontal sinus. It has been established in prior investigations that the radiographic pattern of frontal sinus is highly variable and unique for every individual. This has stimulated the proposition of measurements of the frontal sinus pattern, obtained from x-ray films, for skeletal identification. This paper presents a frontal sinus recognition method for human identification based on Image Foresting Transform and shape context. Experimental results (ERR = 5,82%) have shown the effectiveness of the proposed method.

Deep learning for computer vision: a comparison between convolutional neural networks and hierarchical temporal memories on object recognition tasks

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In recent years, Deep Learning techniques have shown to perform well on a large variety of problems both in Computer Vision and Natural Language Processing, reaching and often surpassing the state of the art on many tasks. The rise of deep learning is also revolutionizing the entire field of Machine Learning and Pattern Recognition pushing forward the concepts of automatic feature extraction and unsupervised learning in general. However, despite the strong success both in science and business, deep learning has its own limitations. It is often questioned if such techniques are only some kind of brute-force statistical approaches and if they can only work in the context of High Performance Computing with tons of data. Another important question is whether they are really biologically inspired, as claimed in certain cases, and if they can scale well in terms of "intelligence". The dissertation is focused on trying to answer these key questions in the context of Computer Vision and, in particular, Object Recognition, a task that has been heavily revolutionized by recent advances in the field. Practically speaking, these answers are based on an exhaustive comparison between two, very different, deep learning techniques on the aforementioned task: Convolutional Neural Network (CNN) and Hierarchical Temporal memory (HTM). They stand for two different approaches and points of view within the big hat of deep learning and are the best choices to understand and point out strengths and weaknesses of each of them. CNN is considered one of the most classic and powerful supervised methods used today in machine learning and pattern recognition, especially in object recognition. CNNs are well received and accepted by the scientific community and are already deployed in large corporation like Google and Facebook for solving face recognition and image auto-tagging problems. HTM, on the other hand, is known as a new emerging paradigm and a new meanly-unsupervised method, that is more biologically inspired. It tries to gain more insights from the computational neuroscience community in order to incorporate concepts like time, context and attention during the learning process which are typical of the human brain. In the end, the thesis is supposed to prove that in certain cases, with a lower quantity of data, HTM can outperform CNN.

Higher-order regularization and morphological techniques for image segmentation

Relevância:

100.00% 100.00%

Publicador:

Resumo:

La segmentación de imágenes es un campo importante de la visión computacional y una de las áreas de investigación más activas, con aplicaciones en comprensión de imágenes, detección de objetos, reconocimiento facial, vigilancia de vídeo o procesamiento de imagen médica. La segmentación de imágenes es un problema difícil en general, pero especialmente en entornos científicos y biomédicos, donde las técnicas de adquisición imagen proporcionan imágenes ruidosas. Además, en muchos de estos casos se necesita una precisión casi perfecta. En esta tesis, revisamos y comparamos primero algunas de las técnicas ampliamente usadas para la segmentación de imágenes médicas. Estas técnicas usan clasificadores a nivel de pixel e introducen regularización sobre pares de píxeles que es normalmente insuficiente. Estudiamos las dificultades que presentan para capturar la información de alto nivel sobre los objetos a segmentar. Esta deficiencia da lugar a detecciones erróneas, bordes irregulares, configuraciones con topología errónea y formas inválidas. Para solucionar estos problemas, proponemos un nuevo método de regularización de alto nivel que aprende información topológica y de forma a partir de los datos de entrenamiento de una forma no paramétrica usando potenciales de orden superior. Los potenciales de orden superior se están popularizando en visión por computador, pero la representación exacta de un potencial de orden superior definido sobre muchas variables es computacionalmente inviable. Usamos una representación compacta de los potenciales basada en un conjunto finito de patrones aprendidos de los datos de entrenamiento que, a su vez, depende de las observaciones. Gracias a esta representación, los potenciales de orden superior pueden ser convertidos a potenciales de orden 2 con algunas variables auxiliares añadidas. Experimentos con imágenes reales y sintéticas confirman que nuestro modelo soluciona los errores de aproximaciones más débiles. Incluso con una regularización de alto nivel, una precisión exacta es inalcanzable, y se requeire de edición manual de los resultados de la segmentación automática. La edición manual es tediosa y pesada, y cualquier herramienta de ayuda es muy apreciada. Estas herramientas necesitan ser precisas, pero también lo suficientemente rápidas para ser usadas de forma interactiva. Los contornos activos son una buena solución: son buenos para detecciones precisas de fronteras y, en lugar de buscar una solución global, proporcionan un ajuste fino a resultados que ya existían previamente. Sin embargo, requieren una representación implícita que les permita trabajar con cambios topológicos del contorno, y esto da lugar a ecuaciones en derivadas parciales (EDP) que son costosas de resolver computacionalmente y pueden presentar problemas de estabilidad numérica. Presentamos una aproximación morfológica a la evolución de contornos basada en un nuevo operador morfológico de curvatura que es válido para superficies de cualquier dimensión. Aproximamos la solución numérica de la EDP de la evolución de contorno mediante la aplicación sucesiva de un conjunto de operadores morfológicos aplicados sobre una función de conjuntos de nivel. Estos operadores son muy rápidos, no sufren de problemas de estabilidad numérica y no degradan la función de los conjuntos de nivel, de modo que no hay necesidad de reinicializarlo. Además, su implementación es mucho más sencilla que la de las EDP, ya que no requieren usar sofisticados algoritmos numéricos. Desde un punto de vista teórico, profundizamos en las conexiones entre operadores morfológicos y diferenciales, e introducimos nuevos resultados en este área. Validamos nuestra aproximación proporcionando una implementación morfológica de los contornos geodésicos activos, los contornos activos sin bordes, y los turbopíxeles. En los experimentos realizados, las implementaciones morfológicas convergen a soluciones equivalentes a aquéllas logradas mediante soluciones numéricas tradicionales, pero con ganancias significativas en simplicidad, velocidad y estabilidad. ABSTRACT Image segmentation is an important field in computer vision and one of its most active research areas, with applications in image understanding, object detection, face recognition, video surveillance or medical image processing. Image segmentation is a challenging problem in general, but especially in the biological and medical image fields, where the imaging techniques usually produce cluttered and noisy images and near-perfect accuracy is required in many cases. In this thesis we first review and compare some standard techniques widely used for medical image segmentation. These techniques use pixel-wise classifiers and introduce weak pairwise regularization which is insufficient in many cases. We study their difficulties to capture high-level structural information about the objects to segment. This deficiency leads to many erroneous detections, ragged boundaries, incorrect topological configurations and wrong shapes. To deal with these problems, we propose a new regularization method that learns shape and topological information from training data in a nonparametric way using high-order potentials. High-order potentials are becoming increasingly popular in computer vision. However, the exact representation of a general higher order potential defined over many variables is computationally infeasible. We use a compact representation of the potentials based on a finite set of patterns learned fromtraining data that, in turn, depends on the observations. Thanks to this representation, high-order potentials can be converted into pairwise potentials with some added auxiliary variables and minimized with tree-reweighted message passing (TRW) and belief propagation (BP) techniques. Both synthetic and real experiments confirm that our model fixes the errors of weaker approaches. Even with high-level regularization, perfect accuracy is still unattainable, and human editing of the segmentation results is necessary. The manual edition is tedious and cumbersome, and tools that assist the user are greatly appreciated. These tools need to be precise, but also fast enough to be used in real-time. Active contours are a good solution: they are good for precise boundary detection and, instead of finding a global solution, they provide a fine tuning to previously existing results. However, they require an implicit representation to deal with topological changes of the contour, and this leads to PDEs that are computationally costly to solve and may present numerical stability issues. We present a morphological approach to contour evolution based on a new curvature morphological operator valid for surfaces of any dimension. We approximate the numerical solution of the contour evolution PDE by the successive application of a set of morphological operators defined on a binary level-set. These operators are very fast, do not suffer numerical stability issues, and do not degrade the level set function, so there is no need to reinitialize it. Moreover, their implementation is much easier than their PDE counterpart, since they do not require the use of sophisticated numerical algorithms. From a theoretical point of view, we delve into the connections between differential andmorphological operators, and introduce novel results in this area. We validate the approach providing amorphological implementation of the geodesic active contours, the active contours without borders, and turbopixels. In the experiments conducted, the morphological implementations converge to solutions equivalent to those achieved by traditional numerical solutions, but with significant gains in simplicity, speed, and stability.

«
1
2
...
9
10
11
12
13
14
15
...
53
54
»