905 results for face recognition algorithms
Abstract:
Previous electrophysiological studies revealed that human faces elicit an early visual event-related potential (ERP) within the occipito-temporal cortex, the N170 component. Although face perception has been proposed to rely on automatic processing, the impact of selective attention on the N170 remains controversial in both young and elderly individuals. Using early visual ERP and alpha power analysis, we assessed the influence of aging on selective attention to faces during delayed-recognition tasks for face and letter stimuli, examining 36 elderly and 20 young adults with preserved cognition. Face recognition performance worsened with age. Aging induced a latency delay of the N1 component for faces and letters, as well as of the face N170 component. In contrast to letters, ignored faces elicited larger N1 and N170 components than attended faces in both age groups. This counterintuitive attention effect on face processing persisted when scenes replaced letters. Unlike young adults, elderly subjects failed to suppress irrelevant letters when attending to faces. Whereas attended stimuli induced a parietal alpha band desynchronization within 300-1000 ms post-stimulus, with bilateral-to-right distribution for faces and left lateralization for letters, ignored and passively viewed stimuli elicited a central alpha synchronization that was larger over the right hemisphere. Aging delayed the latency of this alpha synchronization for both face and letter stimuli, and reduced its amplitude for ignored letters. These results suggest that, due to their social relevance, human faces may cause paradoxical attention effects on early visual ERP components, but they still undergo classical top-down control as a function of endogenous selective attention. Aging does not affect the face bottom-up alerting mechanism but reduces the top-down suppression of distracting letters, possibly impinging upon face recognition, and more generally delays the top-down suppression of task-irrelevant information.
Abstract:
Biometric information has become a technology complementary to cryptography that makes cryptographic data convenient to manage. It serves two important needs: first, keeping these data always at hand and, second, making their legitimate owner easily identifiable. This article proposes a system that integrates a facial recognition biometric signature with an identity-based signature scheme, so that the user's face becomes their public key and system ID. In this way, other users can verify messages using photos of the sender, providing a reasonable trade-off between system security and usability, as well as a much simpler way of authenticating public keys and handling distribution processes.
Abstract:
Psychophysical studies suggest that humans preferentially use a narrow band of low spatial frequencies for face recognition. Here we asked whether artificial face recognition systems show improved recognition performance at the same spatial frequencies as humans. To this end, we estimated recognition performance over a large database of face images by computing three discriminability measures: Fisher Linear Discriminant Analysis, Non-Parametric Discriminant Analysis, and Mutual Information. To address frequency dependence, discriminabilities were measured as a function of (filtered) image size. All three measures revealed a maximum at the same image sizes, where the spatial frequency content corresponds to the psychophysically determined frequencies. Our results therefore support the notion that the critical band of spatial frequencies for face recognition in humans and machines follows from inherent properties of face images, and that the use of these frequencies is associated with optimal face recognition performance.
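As an illustration of measuring discriminability as a function of image size, the sketch below downsamples tiny grayscale images by block averaging and computes a scalar Fisher-style score (between-class scatter divided by within-class scatter). It is a simplified stand-in and not the study's actual Fisher Linear Discriminant Analysis; the function names (`downsample`, `fisher_score`, `score_at_size`) and the toy data are assumptions made for this example.

```python
def downsample(image, factor):
    """Average non-overlapping factor x factor blocks of a 2-D list.

    Assumes both image sides are divisible by factor.
    """
    area = factor * factor
    return [
        [
            sum(image[y + dy][x + dx]
                for dy in range(factor) for dx in range(factor)) / area
            for x in range(0, len(image[0]), factor)
        ]
        for y in range(0, len(image), factor)
    ]


def flatten(image):
    """Turn a 2-D image into a flat feature vector."""
    return [pixel for row in image for pixel in row]


def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fisher_score(classes):
    """classes maps label -> list of feature vectors.

    Returns trace(S_b) / trace(S_w), a scalar simplification of the
    Fisher criterion (assumes the within-class scatter is non-zero).
    """
    everything = [v for vectors in classes.values() for v in vectors]
    grand_mean = mean_vector(everything)
    between = 0.0
    within = 0.0
    for vectors in classes.values():
        class_mean = mean_vector(vectors)
        between += len(vectors) * squared_distance(class_mean, grand_mean)
        within += sum(squared_distance(v, class_mean) for v in vectors)
    return between / within


def score_at_size(images_by_label, factor):
    """Fisher-style score after shrinking every image by `factor`."""
    return fisher_score({
        label: [flatten(downsample(image, factor)) for image in images]
        for label, images in images_by_label.items()
    })


# Hypothetical toy data: the classes differ in mean brightness (low spatial
# frequency), while samples within a class differ by a checkerboard pattern
# (high spatial frequency) plus a small brightness offset.
toy_images = {
    "A": [[[16, 6], [6, 16]], [[4, 14], [14, 4]]],
    "B": [[[26, 16], [16, 26]], [[14, 24], [24, 14]]],
}
full_resolution_score = score_at_size(toy_images, 1)
coarse_score = score_at_size(toy_images, 2)
```

In this toy setup the coarse score comes out higher than the full-resolution one, because averaging removes the high-frequency nuisance pattern while keeping the class difference. Real face data would need a proper discriminant analysis and a sweep over many sizes.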
Abstract:
As important social stimuli, faces play a critical role in our lives. Much of our interaction with other people depends on our ability to recognize faces accurately. It has been proposed that face processing consists of different stages and interacts with other systems (Bruce & Young, 1986). At a perceptual level, the initial two stages, namely structural encoding and face recognition, are particularly relevant and are the focus of this dissertation. Event-related potentials (ERPs) are averaged EEG signals time-locked to a particular event (such as the presentation of a face). With their excellent temporal resolution, ERPs can provide important timing information about neural processes. Previous research has identified several ERP components that are especially related to face processing, including the N170, the P2 and the N250. Their nature with respect to the stages of face processing is still unclear, and is examined in Studies 1 and 2. In Study 1, participants made gender decisions on a large set of female faces interspersed with a few male faces. The ERP responses to facial characteristics of the female faces indicated that the N170 amplitude from each side of the head was affected by information from the eye region and by facial layout: the right N170 was affected by eye color and by face width, while the left N170 was affected by eye size and by the relation between the sizes of the top and bottom parts of a face. In contrast, the P100 and the N250 components were largely unaffected by facial characteristics. These results thus provided direct evidence for the link between the N170 and structural encoding of faces. In Study 2, focusing on the face recognition stage, we manipulated face identity strength by morphing individual faces to an "average" face. Participants performed a face identification task. 
The effect of face identity strength was found on the late P2 and the N250 components: as identity strength decreased from an individual face to the "average" face, the late P2 increased and the N250 decreased. In contrast, the P100, the N170 and the early P2 components were not affected by face identity strength. These results suggest that face recognition occurs after 200 ms, but not earlier. Finally, because faces are often associated with social information, we investigated in Study 3 how group membership might affect ERP responses to faces. After participants learned in- and out-group memberships of the face stimuli based on arbitrarily assigned nationality and university affiliation, we found that the N170 latency differentiated in-group and out-group faces, taking longer to process the latter. In comparison, without group memberships, there was no difference in N170 latency among the faces. This dissertation provides evidence that at a neural level, structural encoding of faces, indexed by the N170, occurs within 200 ms. Face recognition, indexed by the late P2 and the N250, occurs shortly afterwards between 200 and 300 ms. Social cognitive factors can also influence face processing. The effect is already evident as early as 130-200 ms at the structural encoding stage.
Abstract:
The objective of this thesis is to propose an algorithm for detecting faces in a digital image with a complex background. A lot of work has already been done in the area of face detection, but a drawback of some face detection algorithms is their inability to detect faces with closed eyes or an open mouth. Facial features therefore form an important basis for detection. This thesis focuses on detecting faces based on facial objects. The procedure is composed of three phases: a segmentation phase, a filtering phase and a localization phase. In the segmentation phase, the algorithm uses color segmentation to isolate human skin color based on its chrominance properties. In the filtering phase, object removal based on Minkowski addition (morphological operations) is used to remove the non-skin regions. In the last phase, image processing and computer vision methods are used to find facial components in the skin regions. This method is effective at detecting a face region with closed eyes, an open mouth or a half-profile face. The experimental results demonstrated that the detection accuracy is around 85.4% and that detection is faster than with the neural network method and other techniques.
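The first two phases described above (chrominance-based skin segmentation, then morphological filtering) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the Cb/Cr thresholds are commonly quoted illustrative values rather than figures from the thesis, the filtering step is assumed to be a 3x3 morphological opening (erosion followed by dilation, the latter being a Minkowski addition), and all function names are hypothetical.

```python
def rgb_to_cbcr(r, g, b):
    """Chrominance (Cb, Cr) of an 8-bit RGB pixel, full-range BT.601 weights."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def skin_mask(image, cb_range=(77, 127), cr_range=(133, 173)):
    """Binary mask (1 = skin-like chrominance) for a 2-D list of RGB tuples.

    The default ranges are assumed, illustrative thresholds.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            cb, cr = rgb_to_cbcr(r, g, b)
            is_skin = (cb_range[0] <= cb <= cb_range[1]
                       and cr_range[0] <= cr <= cr_range[1])
            mask_row.append(1 if is_skin else 0)
        mask.append(mask_row)
    return mask


def _window(mask, y, x):
    """3x3 neighbourhood of (y, x); pixels outside the image count as 0."""
    height, width = len(mask), len(mask[0])
    return [
        mask[j][i] if 0 <= j < height and 0 <= i < width else 0
        for j in (y - 1, y, y + 1)
        for i in (x - 1, x, x + 1)
    ]


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighbourhood is set."""
    return [[1 if all(_window(mask, y, x)) else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is set."""
    return [[1 if any(_window(mask, y, x)) else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]


def opening(mask):
    """Erosion followed by dilation: removes specks smaller than 3x3."""
    return dilate(erode(mask))
```

Calling `opening(skin_mask(image))` on an image given as rows of `(r, g, b)` tuples returns a cleaned mask in which isolated skin-coloured pixels have been removed while larger regions keep their extent. The localization phase (searching for eyes and mouth inside those regions) is not sketched here.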
Abstract:
This paper presents a novel, fast and accurate appearance-based method for infrared face recognition. By introducing the Optimum-Path Forest classifier, our objective is to obtain good recognition rates and effectively reduce the computational effort. Feature extraction is carried out by PCA, and the results are compared with two other well-known supervised learning classifiers: Artificial Neural Networks and Support Vector Machines. The performance achieved supports the promise of the proposed framework. ©2009 IEEE.
Abstract:
Many methods based on biometrics such as fingerprint, face, iris, and retina have been proposed for person identification. However, for deceased individuals, such biometric measurements are not available. In such cases, parts of the human skeleton can be used for identification, such as dental records, thorax, vertebrae, shoulder, and frontal sinus. It has been established in prior investigations that the radiographic pattern of the frontal sinus is highly variable and unique for every individual. This has stimulated the proposal of measurements of the frontal sinus pattern, obtained from x-ray films, for skeletal identification. This paper presents a frontal sinus recognition method for human identification based on the Image Foresting Transform and shape context. Experimental results (ERR = 5.82%) have shown the effectiveness of the proposed method.
Abstract:
In recent years, Deep Learning techniques have been shown to perform well on a large variety of problems in both Computer Vision and Natural Language Processing, reaching and often surpassing the state of the art on many tasks. The rise of deep learning is also revolutionizing the entire field of Machine Learning and Pattern Recognition, pushing forward the concepts of automatic feature extraction and unsupervised learning in general. However, despite its strong success in both science and business, deep learning has its own limitations. It is often questioned whether such techniques are merely brute-force statistical approaches and whether they can only work in the context of High Performance Computing with very large amounts of data. Another important question is whether they are really biologically inspired, as claimed in certain cases, and whether they can scale well in terms of "intelligence". The dissertation focuses on trying to answer these key questions in the context of Computer Vision and, in particular, Object Recognition, a task that has been heavily revolutionized by recent advances in the field. Practically speaking, these answers are based on an exhaustive comparison between two very different deep learning techniques on the aforementioned task: Convolutional Neural Network (CNN) and Hierarchical Temporal Memory (HTM). They represent two different approaches and points of view under the broad umbrella of deep learning, and are well suited to highlighting the strengths and weaknesses of each. CNN is considered one of the most classic and powerful supervised methods used today in machine learning and pattern recognition, especially in object recognition. CNNs are well received and accepted by the scientific community and are already deployed in large corporations like Google and Facebook for solving face recognition and image auto-tagging problems. 
HTM, on the other hand, is known as a new emerging paradigm and a new, mainly unsupervised method that is more biologically inspired. It tries to gain more insights from the computational neuroscience community in order to incorporate concepts like time, context and attention during the learning process, which are typical of the human brain. Ultimately, the thesis aims to show that in certain cases, with a lower quantity of data, HTM can outperform CNN.
Abstract:
The precise role of the fusiform face area (FFA) in face processing remains controversial. In this study, we investigated to what degree FFA activation reflects additional functions beyond face perception. Seven volunteers underwent rapid event-related functional magnetic resonance imaging while they performed a face-encoding and a face-recognition task. During face encoding, activity in the FFA for individual faces predicted whether the individual face was subsequently remembered or forgotten. However, during face recognition, no difference in FFA activity between consciously remembered and forgotten faces was observed, but FFA activity differentiated whether or not a face had been seen previously. This demonstrated a dissociation between overt recognition and unconscious discrimination of stimuli, suggesting that physiological processes of face recognition can take place even if not all of its operations are made available to consciousness.
Abstract:
Image segmentation is an important field in computer vision and one of its most active research areas, with applications in image understanding, object detection, face recognition, video surveillance and medical image processing. Image segmentation is a challenging problem in general, but especially in the biological and medical image fields, where the imaging techniques usually produce cluttered and noisy images and near-perfect accuracy is required in many cases. In this thesis we first review and compare some standard techniques widely used for medical image segmentation. These techniques use pixel-wise classifiers and introduce weak pairwise regularization, which is insufficient in many cases. We study the difficulties they have in capturing high-level structural information about the objects to segment. This deficiency leads to many erroneous detections, ragged boundaries, incorrect topological configurations and wrong shapes. To deal with these problems, we propose a new regularization method that learns shape and topological information from training data in a nonparametric way using high-order potentials. High-order potentials are becoming increasingly popular in computer vision. However, the exact representation of a general high-order potential defined over many variables is computationally infeasible. We use a compact representation of the potentials based on a finite set of patterns learned from training data that, in turn, depends on the observations. Thanks to this representation, high-order potentials can be converted into pairwise potentials with some added auxiliary variables and minimized with tree-reweighted message passing (TRW) and belief propagation (BP) techniques. 
Both synthetic and real experiments confirm that our model fixes the errors of weaker approaches. Even with high-level regularization, perfect accuracy is still unattainable, and human editing of the segmentation results is necessary. Manual editing is tedious and cumbersome, and tools that assist the user are greatly appreciated. These tools need to be precise, but also fast enough to be used interactively. Active contours are a good solution: they are good for precise boundary detection and, instead of finding a global solution, they provide a fine tuning of previously existing results. However, they require an implicit representation to deal with topological changes of the contour, and this leads to PDEs that are computationally costly to solve and may present numerical stability issues. We present a morphological approach to contour evolution based on a new morphological curvature operator valid for surfaces of any dimension. We approximate the numerical solution of the contour evolution PDE by the successive application of a set of morphological operators defined on a binary level set. These operators are very fast, do not suffer from numerical stability issues, and do not degrade the level set function, so there is no need to reinitialize it. Moreover, their implementation is much easier than that of their PDE counterparts, since they do not require the use of sophisticated numerical algorithms. From a theoretical point of view, we delve into the connections between differential and morphological operators, and introduce novel results in this area. We validate the approach by providing a morphological implementation of the geodesic active contours, the active contours without borders, and turbopixels. In the experiments conducted, the morphological implementations converge to solutions equivalent to those achieved by traditional numerical solutions, but with significant gains in simplicity, speed, and stability.
Abstract:
This project deals with one of the most problematic areas of artificial intelligence: facial recognition. Something as simple for humans as recognizing a familiar face translates into complex algorithms and thousands of data points processed in a matter of seconds. 
The project begins with a study of the state of the art in face recognition techniques, from the most widely used and tested, such as PCA and LDA, to experimental techniques that use thermal images instead of the classic visible-light images. Next, an application was implemented in C++ that is able to recognize people stored in its database by reading images directly from a webcam. The application was built with OpenCV, one of the most widely used libraries for image processing and computer vision. Visual Studio 2010, which has a free version for students, was chosen as the IDE. The technique chosen to implement the application is PCA, because it is a basic technique in face recognition and also provides a basis for much more complex solutions. The mathematical foundations of the technique were studied to understand how it processes the information and which data it relies on to perform the recognition. Finally, a testing algorithm was implemented to measure the reliability of the application on several databases of facial images. In this way, the strengths and weaknesses of PCA can be assessed.
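To make the PCA idea concrete, here is a minimal sketch of recognition by projecting faces onto a principal axis and picking the nearest stored projection. It is not the project's code (which was written in C++ with OpenCV): it is plain Python, it keeps only the first principal component for brevity where a real eigenfaces system would keep several, and every name in it (`leading_axis`, `train`, `recognise`) is hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_face(faces):
    """Pixel-wise average of the training faces (each a flat list of pixels)."""
    return [sum(column) / len(faces) for column in zip(*faces)]


def centre(face, mean):
    return [pixel - m for pixel, m in zip(face, mean)]


def leading_axis(centred, iterations=100):
    """First principal axis via power iteration on X^T X.

    The covariance matrix is never formed explicitly; each step computes
    X^T (X v) and renormalises. Assumes the first pixel varies across
    the training set, so the starting vector is not orthogonal to the axis.
    """
    v = [1.0] + [0.0] * (len(centred[0]) - 1)
    for _ in range(iterations):
        scores = [dot(row, v) for row in centred]            # X v
        w = [sum(s * row[i] for s, row in zip(scores, centred))
             for i in range(len(v))]                         # X^T (X v)
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    return v


def train(faces, labels):
    """Return (mean face, principal axis, labelled projections)."""
    mean = mean_face(faces)
    centred = [centre(face, mean) for face in faces]
    axis = leading_axis(centred)
    gallery = [(dot(c, axis), label) for c, label in zip(centred, labels)]
    return mean, axis, gallery


def recognise(face, model):
    """Label of the training face whose projection is closest to the probe's."""
    mean, axis, gallery = model
    score = dot(centre(face, mean), axis)
    return min(gallery, key=lambda entry: abs(entry[0] - score))[1]
```

A model is built once with `train(faces, labels)` and then queried with `recognise(probe, model)`. Because the decision uses distances between projections, the arbitrary sign of the principal axis does not affect the result.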
Abstract:
Because faces and bodies share some abstract perceptual features, we hypothesised that similar recognition processes might be used for both. We investigated whether caricature effects similar to those found in facial identity and expression recognition could be found in the recognition of individual bodies and socially meaningful body positions. Participants were trained to name four body positions (anger, fear, disgust, sadness) and four individuals (in a neutral position). We then tested their recognition of extremely caricatured, moderately caricatured, anti-caricatured, and undistorted images of each stimulus. Consistent with caricature effects found in face recognition, moderately caricatured representations of individuals' bodies were recognised more accurately than undistorted and extremely caricatured representations. No significant difference was found between participants' recognition of extremely caricatured, moderately caricatured, or undistorted body position line-drawings. All anti-caricatured representations were named significantly less accurately than the veridical stimuli. Similar mental representations may be used for both bodies and faces.
Abstract:
Three experiments assessed the development of children's part and configural (part-relational) processing in object recognition during adolescence. In total, 312 school children aged 7-16 years and 80 adults were tested in 3-alternative forced choice (3-AFC) tasks. They judged the correct appearance of upright and inverted presented familiar animals, artifacts, and newly learned multipart objects, which had been manipulated either in terms of individual parts or part relations. Manipulation of part relations was constrained to either metric (animals, artifacts, and multipart objects) or categorical (multipart objects only) changes. For animals and artifacts, even the youngest children were close to adult levels for the correct recognition of an individual part change. By contrast, it was not until 11-12 years of age that they achieved similar levels of performance with regard to altered metric part relations. For the newly learned multipart objects, performance was equivalent throughout the tested age range for upright presented stimuli in the case of categorical part-specific and part-relational changes. In the case of metric manipulations, the results confirmed the data pattern observed for animals and artifacts. Together, the results provide converging evidence, with studies of face recognition, for a surprisingly late consolidation of configural-metric relative to part-based object recognition.
Abstract:
This dissertation develops an image processing framework with unique feature extraction and similarity measurements for human face recognition in the thermal mid-wave infrared portion of the electromagnetic spectrum. The goal of this research is to design specialized algorithms that extract facial vasculature information, create a thermal facial signature and identify the individual. The objective is to use such findings in support of a biometrics system for human identification with a high degree of accuracy and a high degree of reliability. This last assertion is due to the minimal to no risk of alteration of the intrinsic physiological characteristics seen through thermal infrared imaging. The proposed thermal facial signature recognition is fully integrated and consolidates the main and critical steps of feature extraction, registration, matching through similarity measures, and validation through testing our algorithm on a database, referred to as C-X1, provided by the Computer Vision Research Laboratory at the University of Notre Dame. Feature extraction was accomplished by first registering the infrared images to a reference image using the Functional MRI of the Brain’s (FMRIB’s) Linear Image Registration Tool (FLIRT), modified to suit thermal infrared images. This was followed by segmentation of the facial region using an advanced localized contouring algorithm applied to anisotropically diffused thermal images. Thermal feature extraction from facial images was attained by performing morphological operations such as opening and top-hat segmentation to yield thermal signatures for each subject. Four thermal images taken over a period of six months were used to generate thermal signatures and a thermal template for each subject; the thermal template contains only the most prevalent and consistent features. 
Finally, a similarity measure technique was used to match signatures to templates, and Principal Component Analysis (PCA) was used to validate the results of the matching process. Thirteen subjects were used for testing the developed technique on an in-house thermal imaging system. Matching with a Euclidean-based similarity measure showed 88% accuracy for skeletonized signatures and templates, and 90% accuracy for anisotropically diffused signatures and templates. We also employed the Manhattan-based similarity measure and obtained an accuracy of 90.39% for skeletonized and diffused templates and signatures. It was found that an average 18.9% improvement in the similarity measure was obtained when using diffused templates. The Euclidean- and Manhattan-based similarity measures were also applied to skeletonized signatures and templates of 25 subjects in the C-X1 database. The highly accurate results obtained in the matching process, along with the generalized design process, clearly demonstrate the ability of the thermal infrared system to be used on other thermal imaging based systems and related databases. A novel user-initialization registration of thermal facial images has been successfully implemented. Furthermore, the novel approach to developing a thermal signature template using four images taken at various times ensured that unforeseen changes in the vasculature did not affect the biometric matching process, as it relied on consistent thermal features.
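The matching step can be illustrated with a small sketch that compares a signature against per-subject templates using Euclidean and Manhattan distances. It assumes signatures and templates have already been flattened into equal-length numeric vectors (for example, binary skeleton maps); the dissertation's exact similarity formulas are not given in the abstract, so the functions and the `best_match` helper below are hypothetical.

```python
import math


def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_match(signature, templates, distance=euclidean):
    """Return the subject whose template is closest to the signature.

    templates maps a subject identifier to that subject's template vector.
    """
    return min(templates,
               key=lambda subject: distance(signature, templates[subject]))
```

For instance, with `templates = {"s1": [1, 0, 1, 1], "s2": [0, 1, 0, 0]}`, calling `best_match([1, 0, 1, 0], templates, distance=manhattan)` selects the subject whose template differs from the probe in the fewest positions. Swapping the `distance` argument is enough to compare the two measures on the same data.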
Abstract:
The police use both subjective (i.e. police staff) and automated (e.g. face recognition systems) methods for the completion of visual tasks (e.g. person identification). Image quality for police tasks has been defined as image usefulness, or the suitability of the visual material to satisfy a visual task. It is not necessarily affected by artefacts that degrade the visual image quality (i.e. decrease fidelity), as long as these artefacts do not affect the information relevant to the task. The capture of useful information will be affected by the unconstrained conditions commonly encountered by CCTV systems, such as variations in illumination and high compression levels. The main aim of this thesis is to investigate aspects of image quality and video compression that may affect the completion of police visual tasks/applications with respect to CCTV imagery. This is accomplished by investigating three specific police areas/tasks utilising: 1) the human visual system (HVS) for a face recognition task, 2) automated face recognition systems, and 3) automated human detection systems. These systems (HVS and automated) were assessed with defined scene content properties and video compression, i.e. H.264/MPEG-4 AVC. The performance of imaging systems/processes (e.g. subjective investigations, performance of compression algorithms) is affected by scene content properties. No other investigation has been identified that takes scene content properties into consideration to the same extent. Results have shown that the HVS is more sensitive to compression effects than the automated systems. In automated face recognition systems, 'mixed lightness' scenes were the most affected and 'low lightness' scenes were the least affected by compression. In contrast, for the HVS in the face recognition task, 'low lightness' scenes were the most affected and 'medium lightness' scenes the least affected. 
For the automated human detection systems, 'close distance' and 'run approach' are some of the most commonly affected scenes. These findings have the potential to broaden the methods used for testing imaging systems for security applications.