894 resultados para FACE RECOGNITION
Resumo:
Nowadays, a lot of applications use digital images. For example in face recognition to detect and tag persons in photograph, for security control, and a lot of applications that can be found in smart cities, as speed control in roads or highways and cameras in traffic lights to detect drivers ignoring red light. Also in medicine digital images are used, such as x-ray, scanners, etc. These applications depend on the quality of the image obtained. A good camera is expensive, and the image obtained depends also on external factor as light. To make these applications work properly, image enhancement is as important as, for example, a good face detection algorithm. Image enhancement also can be used in normal photograph, for pictures done in bad light conditions, or just to improve the contrast of an image. There are some applications for smartphones that allow users apply filters or change the bright, colour or contrast on the pictures. This project compares four different techniques to use in image enhancement. After applying one of these techniques to an image, it will use better the whole available dynamic range. Some of the algorithms are designed for grey scale images and others for colour images. It is used Matlab software to develop and present the final results. These algorithms are Successive Means Quantization Transform (SMQT), Histogram Equalization, using Matlab function and own implemented function, and V transform. Finally, as conclusions, we can prove that Histogram equalization algorithm is the simplest of all, it has a wide variability of grey levels and it is not suitable for colour images. V transform algorithm is a good option for colour images. The algorithm is linear and requires low computational power. SMQT algorithm is non-linear, insensitive to gain and bias and it can extract structure of the data. RESUMEN. Hoy en día incontable número de aplicaciones usan imágenes digitales. Por ejemplo, para el control de la seguridad se usa el reconocimiento de rostros para detectar y etiquetar personas en fotografías o vídeos, para distintos usos de las ciudades inteligentes, como control de velocidad en carreteras o autopistas, cámaras en los semáforos para detectar a conductores haciendo caso omiso de un semáforo en rojo, etc. También en la medicina se utilizan imágenes digitales, como por ejemplo, rayos X, escáneres, etc. Todas estas aplicaciones dependen de la calidad de la imagen obtenida. Una buena cámara es cara, y la imagen obtenida depende también de factores externos como la luz. Para hacer que estas aplicaciones funciones correctamente, el tratamiento de imagen es tan importante como, por ejemplo, un buen algoritmo de detección de rostros. La mejora de la imagen también se puede utilizar en la fotografía no profesional o de consumo, para las fotos realizadas en malas condiciones de luz, o simplemente para mejorar el contraste de una imagen. Existen aplicaciones para teléfonos móviles que permiten a los usuarios aplicar filtros y cambiar el brillo, el color o el contraste en las imágenes. Este proyecto compara cuatro técnicas diferentes para utilizar el tratamiento de imagen. Se utiliza la herramienta de software matemático Matlab para desarrollar y presentar los resultados finales. Estos algoritmos son Successive Means Quantization Transform (SMQT), Ecualización del histograma, usando la propia función de Matlab y una nueva función que se desarrolla en este proyecto y, por último, una función de transformada V. Finalmente, como conclusión, podemos comprobar que el algoritmo de Ecualización del histograma es el más simple de todos, tiene una amplia variabilidad de niveles de gris y no es adecuado para imágenes en color. El algoritmo de transformada V es una buena opción para imágenes en color, es lineal y requiere baja potencia de cálculo. El algoritmo SMQT no es lineal, insensible a la ganancia y polarización y, gracias a él, se puede extraer la estructura de los datos.
Resumo:
Like faces, body postures are susceptible to an inversion effect in untrained viewers. The inversion effect may be indicative of configural processing, but what kind of configural processing is used for the recognition of body postures must be specified. The information available in the body stimulus was manipulated. The presence and magnitude of inversion effects were compared for body parts, scrambled bodies, and body halves relative to whole bodies and to corresponding conditions for faces and houses. Results suggest that configural body posture recognition relies on the structural hierarchy of body parts, not the parts themselves or a complete template match. Configural recognition of body postures based on information about the structural hierarchy of parts defines an important point on the configural processing continuum, between recognition based on first-order spatial relations and recognition based on holistic undifferentiated template matching.
Resumo:
This dissertation develops an image processing framework with unique feature extraction and similarity measurements for human face recognition in the thermal mid-wave infrared portion of the electromagnetic spectrum. The goals of this research is to design specialized algorithms that would extract facial vasculature information, create a thermal facial signature and identify the individual. The objective is to use such findings in support of a biometrics system for human identification with a high degree of accuracy and a high degree of reliability. This last assertion is due to the minimal to no risk for potential alteration of the intrinsic physiological characteristics seen through thermal infrared imaging. The proposed thermal facial signature recognition is fully integrated and consolidates the main and critical steps of feature extraction, registration, matching through similarity measures, and validation through testing our algorithm on a database, referred to as C-X1, provided by the Computer Vision Research Laboratory at the University of Notre Dame. Feature extraction was accomplished by first registering the infrared images to a reference image using the functional MRI of the Brain’s (FMRIB’s) Linear Image Registration Tool (FLIRT) modified to suit thermal infrared images. This was followed by segmentation of the facial region using an advanced localized contouring algorithm applied on anisotropically diffused thermal images. Thermal feature extraction from facial images was attained by performing morphological operations such as opening and top-hat segmentation to yield thermal signatures for each subject. Four thermal images taken over a period of six months were used to generate thermal signatures and a thermal template for each subject, the thermal template contains only the most prevalent and consistent features. Finally a similarity measure technique was used to match signatures to templates and the Principal Component Analysis (PCA) was used to validate the results of the matching process. Thirteen subjects were used for testing the developed technique on an in-house thermal imaging system. The matching using an Euclidean-based similarity measure showed 88% accuracy in the case of skeletonized signatures and templates, we obtained 90% accuracy for anisotropically diffused signatures and templates. We also employed the Manhattan-based similarity measure and obtained an accuracy of 90.39% for skeletonized and diffused templates and signatures. It was found that an average 18.9% improvement in the similarity measure was obtained when using diffused templates. The Euclidean- and Manhattan-based similarity measure was also applied to skeletonized signatures and templates of 25 subjects in the C-X1 database. The highly accurate results obtained in the matching process along with the generalized design process clearly demonstrate the ability of the thermal infrared system to be used on other thermal imaging based systems and related databases. A novel user-initialization registration of thermal facial images has been successfully implemented. Furthermore, the novel approach at developing a thermal signature template using four images taken at various times ensured that unforeseen changes in the vasculature did not affect the biometric matching process as it relied on consistent thermal features.
Resumo:
Larger lineups could protect innocent suspects from being misidentified; however, they can also decrease correct identifications. Bertrand (2006) investigated whether the decrease in correct identifications could be prevented by adding more cues, in the form of additional views of lineup members’ faces, to the lineup. Adding these cues was successful to an extent. The current series of studies attempted to replicate Bertrand’s (2006) findings while addressing some methodological issues—namely, the inconsistency in image size as lineup size increased. First, I investigated whether image size could affect face recognition (Chapter 2) and found it could, but that it also affected previously-seen (“old”) versus previously-unseen (“new”) faces differently. Specifically, smaller image sizes at exposure lowered accuracy for old faces, while these same image sizes at recognition lowered accuracy for new faces. Although these results indicate that target recognition would be unaffected by image size at recognition (i.e., during a lineup), lineups are also comprised of previously-unseen faces, in the form of fillers and innocent suspects. Because image size could affect lineup decisions, as it could become more difficult to realize fillers are previously-unseen, I decided to replicate Bertrand (2006) while keeping image size constant in Chapters 3 (simultaneous lineups) and 4 (simultaneous-presentation, sequential decisions). In both Chapters, the integral findings were the same: correct identification rates decreased as lineup size increased from 6- to 24-person lineups, but adding cues had no effect. The inability to replicate Bertrand (2006) could mean that the original finding was due to chance, but alternate explanations also exist, such as the overall size of the array, the degree to which additional cues overlap, and the length of the target exposure. These alternate explanations, along with directions for future research, are discussed in the following Chapters.
Resumo:
The police use both subjective (i.e. police staff) and automated (e.g. face recognition systems) methods for the completion of visual tasks (e.g person identification). Image quality for police tasks has been defined as the image usefulness, or image suitability of the visual material to satisfy a visual task. It is not necessarily affected by any artefact that may affect the visual image quality (i.e. decrease fidelity), as long as these artefacts do not affect the relevant useful information for the task. The capture of useful information will be affected by the unconstrained conditions commonly encountered by CCTV systems such as variations in illumination and high compression levels. The main aim of this thesis is to investigate aspects of image quality and video compression that may affect the completion of police visual tasks/applications with respect to CCTV imagery. This is accomplished by investigating 3 specific police areas/tasks utilising: 1) the human visual system (HVS) for a face recognition task, 2) automated face recognition systems, and 3) automated human detection systems. These systems (HVS and automated) were assessed with defined scene content properties, and video compression, i.e. H.264/MPEG-4 AVC. The performance of imaging systems/processes (e.g. subjective investigations, performance of compression algorithms) are affected by scene content properties. No other investigation has been identified that takes into consideration scene content properties to the same extend. Results have shown that the HVS is more sensitive to compression effects in comparison to the automated systems. In automated face recognition systems, `mixed lightness' scenes were the most affected and `low lightness' scenes were the least affected by compression. In contrast the HVS for the face recognition task, `low lightness' scenes were the most affected and `medium lightness' scenes the least affected. For the automated human detection systems, `close distance' and `run approach' are some of the most commonly affected scenes. Findings have the potential to broaden the methods used for testing imaging systems for security applications.
Resumo:
[EN]Most face recognition systems are based on some form of batch learning. Online face recognition is not only more practical, it is also much more biologically plausible. Typical batch learners aim at minimizing both training error and (a measure of) hypothesis complexity. We show that the same minimization can be done incrementally as long as some form of ”scaffolding” is applied throughout the learning process. Scaffolding means: make the system learn from samples that are neither too easy nor too difficult at each step. We note that such learning behavior is also biologically plausible. Experiments using large sequences of facial images support the theoretical claims. The proposed method compares well with other, numerical calculus-based online learners.
Resumo:
Dissertação (mestrado)—Universidade de Brasília, Instituto de Psicologia, Departamento de Processos Psicológicos Básicos, Programa de Pós-Graduação em Ciências do Comportamento, 2016.
Resumo:
Selling devices on retail stores comes with the big challenge of grabbing the customer’s attention. Nowadays people have a lot of offers at their disposal and new marketing techniques must emerge to differentiate the products. When it comes to smartphones and tablets, those devices can make the difference by themselves, if we use their computing power and capabilities to create something unique and interactive. With that in mind, three prototypes were developed during an internship: a face recognition based Customer Detection, a face tracking solution with an Avatar and interactive cross-app Guides. All three revealed to have potential to be differentiating solutions in a retail store, not only raising the chance of a customer taking notice of the device but also of interacting with them to learn more about their features. The results were meant to be only proof of concepts and therefore were not tested in the real world.
Resumo:
People possess different sensory modalities to detect, interpret, and efficiently act upon various events in a complex and dynamic environment (Fetsch, DeAngelis, & Angelaki, 2013). Much empirical work has been done to understand the interplay of modalities (e.g. audio-visual interactions, see Calvert, Spence, & Stein, 2004). On the one hand, integration of multimodal input as a functional principle of the brain enables the versatile and coherent perception of the environment (Lewkowicz & Ghazanfar, 2009). On the other hand, sensory integration does not necessarily mean that input from modalities is always weighted equally (Ernst, 2008). Rather, when two or more modalities are stimulated concurrently, one often finds one modality dominating over another. Study 1 and 2 of the dissertation addressed the developmental trajectory of sensory dominance. In both studies, 6-year-olds, 9-year-olds, and adults were tested in order to examine sensory (audio-visual) dominance across different age groups. In Study 3, sensory dominance was put into an applied context by examining verbal and visual overshadowing effects among 4- to 6-year olds performing a face recognition task. The results of Study 1 and Study 2 support default auditory dominance in young children as proposed by Napolitano and Sloutsky (2004) that persists up to 6 years of age. For 9-year-olds, results on privileged modality processing were inconsistent. Whereas visual dominance was revealed in Study 1, privileged auditory processing was revealed in Study 2. Among adults, a visual dominance was observed in Study 1, which has also been demonstrated in preceding studies (see Spence, Parise, & Chen, 2012). No sensory dominance was revealed in Study 2 for adults. Potential explanations are discussed. Study 3 referred to verbal and visual overshadowing effects in 4- to 6-year-olds. The aim was to examine whether verbalization (i.e., verbally describing a previously seen face), or visualization (i.e., drawing the seen face) might affect later face recognition. No effect of visualization on recognition accuracy was revealed. As opposed to a verbal overshadowing effect, a verbal facilitation effect occurred. Moreover, verbal intelligence was a significant predictor for recognition accuracy in the verbalization group but not in the control group. This suggests that strengthening verbal intelligence in children can pay off in non-verbal domains as well, which might have educational implications.
Resumo:
The low resolution of images has been one of the major limitations in recognising humans from a distance using their biometric traits, such as face and iris. Superresolution has been employed to improve the resolution and the recognition performance simultaneously, however the majority of techniques employed operate in the pixel domain, such that the biometric feature vectors are extracted from a super-resolved input image. Feature-domain superresolution has been proposed for face and iris, and is shown to further improve recognition performance by capitalising on direct super-resolving the features which are used for recognition. However, current feature-domain superresolution approaches are limited to simple linear features such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), which are not the most discriminant features for biometrics. Gabor-based features have been shown to be one of the most discriminant features for biometrics including face and iris. This paper proposes a framework to conduct super-resolution in the non-linear Gabor feature domain to further improve the recognition performance of biometric systems. Experiments have confirmed the validity of the proposed approach, demonstrating superior performance to existing linear approaches for both face and iris biometrics.
Resumo:
Subspace learning is the process of finding a proper feature subspace and then projecting high-dimensional data onto the learned low-dimensional subspace. The projection operation requires many floating-point multiplications and additions, which makes the projection process computationally expensive. To tackle this problem, this paper proposes two simple-but-effective fast subspace learning and image projection methods, fast Haar transform (FHT) based principal component analysis and FHT based spectral regression discriminant analysis. The advantages of these two methods result from employing both the FHT for subspace learning and the integral vector for feature extraction. Experimental results on three face databases demonstrated their effectiveness and efficiency.
Resumo:
Models of visual perception are based on image representations in cortical area V1 and higher areas which contain many cell layers for feature extraction. Basic simple, complex and end-stopped cells provide input for line, edge and keypoint detection. In this paper we present an improved method for multi-scale line/edge detection based on simple and complex cells. We illustrate the line/edge representation for object reconstruction, and we present models for multi-scale face (object) segregation and recognition that can be embedded into feedforward dorsal and ventral data streams (the “what” and “where” subsystems) with feedback streams from higher areas for obtaining translation, rotation and scale invariance.
Resumo:
Several studies investigated the role of featural and configural information when processing facial identity. A lot less is known about their contribution to emotion recognition. In this study, we addressed this issue by inducing either a featural or a configural processing strategy (Experiment 1) and by investigating the attentional strategies in response to emotional expressions (Experiment 2). In Experiment 1, participants identified emotional expressions in faces that were presented in three different versions (intact, blurred, and scrambled) and in two orientations (upright and inverted). Blurred faces contain mainly configural information, and scrambled faces contain mainly featural information. Inversion is known to selectively hinder configural processing. Analyses of the discriminability measure (A′) and response times (RTs) revealed that configural processing plays a more prominent role in expression recognition than featural processing, but their relative contribution varies depending on the emotion. In Experiment 2, we qualified these differences between emotions by investigating the relative importance of specific features by means of eye movements. Participants had to match intact expressions with the emotional cues that preceded the stimulus. The analysis of eye movements confirmed that the recognition of different emotions rely on different types of information. While the mouth is important for the detection of happiness and fear, the eyes are more relevant for anger, fear, and sadness.
Resumo:
Three studies (N=144) investigated how toddlers aged 18 and 24 months pass the surprise-mark test of self-recognition. In Study 1, toddlers were surreptitiously marked in successive conditions on their legs and faces with stickers visible only in a mirror. Rates of sticker touching did not differ significantly between conditions. In Study 2, toddlers failed to touch a sticker on their legs that had been disguised before being marked. In Study 3, having been given 30-s exposure to their disguised legs before testing, toddlers touched the stickers on their legs and faces at equivalent levels. These results suggest that toddlers pass the mark test based on expectations about what they look like, expectations that are not restricted to the face.