956 resultados para Face tracking
Resumo:
Introduction: Difficult tracheal intubation remains a constant and significant source of morbidity and mortality in anaesthetic practice. Insufficient airway assessment in the preoperative period continues to be a major cause of unanticipated difficult intubation. Although many risk factors have already been identified, preoperative airway evaluation is not always regarded as a standard procedure and the respective weight of each risk factor remains unclear. Moreover the predictive scores available are not sensitive, moderately specific and often operator-dependant. In order to improve the preoperative detection of patients at risk for difficult intubation, we developed a system for automated and objective evaluation of morphologic criteria of the face and neck using video recordings and advanced techniques borrowed from face recognition. Method and results: Frontal video sequences were recorded in 5 healthy volunteers. During the video recording, subjects were requested to perform maximal flexion-extension of the neck and to open wide the mouth with tongue pulled out. A robust and real-time face tracking system was then applied, allowing to automatically identify and map a grid of 55 control points on the face, which were tracked during head motion. These points located important features of the face, such as the eyebrows, the nose, the contours of the eyes and mouth, and the external contours, including the chin. Moreover, based on this face tracking, the orientation of the head could also be estimated at each frame of the video sequence. Thus, we could infer for each frame the pitch angle of the head pose (related to the vertical rotation of the head) and obtain the degree of head extension. Morphological criteria used in the most frequent cited predictive scores were also extracted, such as mouth opening, degree of visibility of the uvula or thyreo-mental distance. Discussion and conclusion: Preliminary results suggest the high feasibility of the technique. The next step will be the application of the same automated and objective evaluation to patients who will undergo tracheal intubation. The difficulties related to intubation will be then correlated to the biometric characteristics of the patients. The objective in mind is to analyze the biometrics data with artificial intelligence algorithms to build a highly sensitive and specific predictive test.
Resumo:
Background Children with callous-unemotional (CU) traits, a proposed precursor to adult psychopathy, are characterized by impaired emotion recognition, reduced responsiveness to others’ distress, and a lack of guilt or empathy. Reduced attention to faces, and more specifically to the eye region, has been proposed to underlie these difficulties, although this has never been tested longitudinally from infancy. Attention to faces occurs within the context of dyadic caregiver interactions, and early environment including parenting characteristics has been associated with CU traits. The present study tested whether infants’ preferential tracking of a face with direct gaze and levels of maternal sensitivity predict later CU traits. Methods Data were analyzed from a stratified random sample of 213 participants drawn from a population-based sample of 1233 first-time mothers. Infants’ preferential face tracking at 5 weeks and maternal sensitivity at 29 weeks were entered into a weighted linear regression as predictors of CU traits at 2.5 years. Results Controlling for a range of confounders (e.g., deprivation), lower preferential face tracking predicted higher CU traits (p = .001). Higher maternal sensitivity predicted lower CU traits in girls (p = .009), but not boys. No significant interaction between face tracking and maternal sensitivity was found. Conclusions This is the first study to show that attention to social features during infancy as well as early sensitive parenting predict the subsequent development of CU traits. Identifying such early atypicalities offers the potential for developing parent-mediated interventions in children at risk for developing CU traits.
Resumo:
A deteção e seguimento de pessoas tem uma grande variedade de aplicações em visão computacional. Embora tenha sido alvo de anos de investigação, continua a ser um tópico em aberto, e ainda hoje, um grande desafio a obtenção de uma abordagem que inclua simultaneamente exibilidade e precisão. O trabalho apresentado nesta dissertação desenvolve um caso de estudo sobre deteção e seguimento automático de faces humanas, em ambiente de sala de reuniões, concretizado num sistema flexível de baixo custo. O sistema proposto é baseado no sistema operativo GNU's Not Unix (GNU) linux, e é dividido em quatro etapas, a aquisição de vídeo, a deteção da face, o tracking e reorientação da posição da câmara. A aquisição consiste na captura de frames de vídeo das três câmaras Internet Protocol (IP) Sony SNC-RZ25P, instaladas na sala, através de uma rede Local Area Network (LAN) também ele já existente. Esta etapa fornece os frames de vídeo para processamento à detecção e tracking. A deteção usa o algoritmo proposto por Viola e Jones, para a identificação de objetos, baseando-se nas suas principais características, que permite efetuar a deteção de qualquer tipo de objeto (neste caso faces humanas) de uma forma genérica e em tempo real. As saídas da deteção, quando é identificado com sucesso uma face, são as coordenadas do posicionamento da face, no frame de vídeo. As coordenadas da face detetada são usadas pelo algoritmo de tracking, para a partir desse ponto seguir a face pelos frames de vídeo subsequentes. A etapa de tracking implementa o algoritmo Continuously Adaptive Mean-SHIFT (Camshift) que baseia o seu funcionamento na pesquisa num mapa de densidade de probabilidade, do seu valor máximo, através de iterações sucessivas. O retorno do algoritmo são as coordenadas da posição e orientação da face. Estas coordenadas permitem orientar o posicionamento da câmara de forma que a face esteja sempre o mais próximo possível do centro do campo de visão da câmara. Os resultados obtidos mostraram que o sistema de tracking proposto é capaz de reconhecer e seguir faces em movimento em sequências de frames de vídeo, mostrando adequabilidade para aplicação de monotorização em tempo real.
Resumo:
Recent advances in mobile phone cameras have poised them to take over compact hand-held cameras as the consumer’s preferred camera option. Along with advances in the number of pixels, motion blur removal, face-tracking, and noise reduction algorithms have significant roles in the internal processing of the devices. An undesired effect of severe noise reduction is the loss of texture (i.e. low-contrast fine details) of the original scene. Current established methods for resolution measurement fail to accurately portray the texture loss incurred in a camera system. The development of an accurate objective method to identify the texture preservation or texture reproduction capability of a camera device is important in this regard. The ‘Dead Leaves’ target has been used extensively as a method to measure the modulation transfer function (MTF) of cameras that employ highly non-linear noise-reduction methods. This stochastic model consists of a series of overlapping circles with radii r distributed as r−3, and having uniformly distributed gray level, which gives an accurate model of occlusion in a natural setting and hence mimics a natural scene. This target can be used to model the texture transfer through a camera system when a natural scene is captured. In the first part of our study we identify various factors that affect the MTF measured using the ‘Dead Leaves’ chart. These include variations in illumination, distance, exposure time and ISO sensitivity among others. We discuss the main differences of this method with the existing resolution measurement techniques and identify the advantages. In the second part of this study, we propose an improvement to the current texture MTF measurement algorithm. High frequency residual noise in the processed image contains the same frequency content as fine texture detail, and is sometimes reported as such, thereby leading to inaccurate results. A wavelet thresholding based denoising technique is utilized for modeling the noise present in the final captured image. This updated noise model is then used for calculating an accurate texture MTF. We present comparative results for both algorithms under various image capture conditions.
Resumo:
Selling devices on retail stores comes with the big challenge of grabbing the customer’s attention. Nowadays people have a lot of offers at their disposal and new marketing techniques must emerge to differentiate the products. When it comes to smartphones and tablets, those devices can make the difference by themselves, if we use their computing power and capabilities to create something unique and interactive. With that in mind, three prototypes were developed during an internship: a face recognition based Customer Detection, a face tracking solution with an Avatar and interactive cross-app Guides. All three revealed to have potential to be differentiating solutions in a retail store, not only raising the chance of a customer taking notice of the device but also of interacting with them to learn more about their features. The results were meant to be only proof of concepts and therefore were not tested in the real world.
Resumo:
Os serviços baseados em localização vieram dar um novo alento à criatividade dos programadores de aplicações móveis. A vulgarização de dispositivos com capacidades de localização integradas deu origem ao desenvolvimento de aplicações que gerem e apresentam informação baseada na posição do utilizador. Desde então, o mercado móvel tem assistido ao aparecimento de novas categorias de aplicações que tiram proveito desta capacidade. Entre elas, destaca-se a monitorização remota de dispositivos, que tem vindo a assumir uma importância crescente, tanto no sector particular como no sector empresarial. Esta dissertação começa por apresentar o estado da arte sobre os diferentes sistemas de posicionamento, categorizados pela sua eficácia em ambientes internos ou externos, assim como diferentes protocolos de comunicação em tempo quase-real. É também feita uma análise ao estado actual do mercado móvel. Actualmente o mercado possui diferentes plataformas móveis com características únicas que as fazem rivalizar entre si, com vista a expandirem a sua quota de mercado. É por isso elaborado um breve estudo sobre os sistemas operativos móveis mais relevantes da actualidade. É igualmente feita uma abordagem mais profunda à arquitectura da plataforma móvel da Apple - o iOS – que serviu de base ao desenvolvimento de uma solução optimizada para localização e monitorização de dispositivos móveis. A monitorização implica uma utilização intensiva de recursos energéticos e de largura de banda que os dispositivos móveis da actualidade não estão aptos a suportar. Dado o grande consumo energético do GPS face à precária autonomia destes dispositivos, é apresentado um estudo em que se expõem soluções que permitem gerir de forma optimizada a utilização do GPS. O elevado custo dos planos de dados facultados pelas operadoras móveis é também considerado, pelo que são exploradas soluções que visam minimizar a utilização de largura de banda. Deste trabalho, nasce a aplicação EyeGotcha, que para além de permitir localizar outros utilizadores de dispositivos móveis de forma optimizada, permite também monitorizar as suas acções baseando-se num conjunto de regras pré-definidas. Estas acções são reportadas às entidades monitoras, de modo automatizado e sob a forma de alertas. Visionando-se a comercialização da aplicação, é portanto apresentado um modelo de negócio que permite obter receitas capazes de cobrirem os custos de manutenção de serviços, aos quais o funcionamento da aplicação móvel está subjugado.
Resumo:
Dissertação para obtenção do Grau de Mestre em Engenharia Electrotécnica e de Computadores
Resumo:
In this text, we present two stereo-based head tracking techniques along with a fast 3D model acquisition system. The first tracking technique is a robust implementation of stereo-based head tracking designed for interactive environments with uncontrolled lighting. We integrate fast face detection and drift reduction algorithms with a gradient-based stereo rigid motion tracking technique. Our system can automatically segment and track a user's head under large rotation and illumination variations. Precision and usability of this approach are compared with previous tracking methods for cursor control and target selection in both desktop and interactive room environments. The second tracking technique is designed to improve the robustness of head pose tracking for fast movements. Our iterative hybrid tracker combines constraints from the ICP (Iterative Closest Point) algorithm and normal flow constraint. This new technique is more precise for small movements and noisy depth than ICP alone, and more robust for large movements than the normal flow constraint alone. We present experiments which test the accuracy of our approach on sequences of real and synthetic stereo images. The 3D model acquisition system we present quickly aligns intensity and depth images, and reconstructs a textured 3D mesh. 3D views are registered with shape alignment based on our iterative hybrid tracker. We reconstruct the 3D model using a new Cubic Ray Projection merging algorithm which takes advantage of a novel data structure: the linked voxel space. We present experiments to test the accuracy of our approach on 3D face modelling using real-time stereo images.
Resumo:
This paper describes a trainable system capable of tracking faces and facialsfeatures like eyes and nostrils and estimating basic mouth features such as sdegrees of openness and smile in real time. In developing this system, we have addressed the twin issues of image representation and algorithms for learning. We have used the invariance properties of image representations based on Haar wavelets to robustly capture various facial features. Similarly, unlike previous approaches this system is entirely trained using examples and does not rely on a priori (hand-crafted) models of facial features based on optical flow or facial musculature. The system works in several stages that begin with face detection, followed by localization of facial features and estimation of mouth parameters. Each of these stages is formulated as a problem in supervised learning from examples. We apply the new and robust technique of support vector machines (SVM) for classification in the stage of skin segmentation, face detection and eye detection. Estimation of mouth parameters is modeled as a regression from a sparse subset of coefficients (basis functions) of an overcomplete dictionary of Haar wavelets.
Resumo:
Individuals with fragile X syndrome (FXS) commonly display characteristics of social anxiety, including gaze aversion, increased time to initiate social interaction, and difficulty forming meaningful peer relationships. While neural correlates of face processing, an important component of social interaction, are altered in FXS, studies have not examined whether social anxiety in this population is related to higher cognitive processes, such as memory. This study aimed to determine whether the neural circuitry involved in face encoding was disrupted in individuals with FXS, and whether brain activity during face encoding was related to levels of social anxiety. A group of 11 individuals with FXS (5 M) and 11 age-and gender-matched control participants underwent fMRI scanning while performing a face encoding task with onlineeye-tracking. Results indicate that compared to the control group, individuals with FXS exhibited decreased activation of prefrontal regions associated with complex social cognition, including the medial and superior frontal cortex, during successful face encoding. Further, the FXS and control groups showed significantly different relationships between measures of social anxiety (including gaze-fixation) and brain activity during face encoding. These data indicate that social anxiety in FXS may be related to the inability to successfully recruit higher level social cognition regions during the initial phases of memory formation. (C) 2008 Elsevier Inc. All rights reserved.
Video stimuli reduce object-directed imitation accuracy: a novel two-person motion-tracking approach
Resumo:
Imitation is an important form of social behavior, and research has aimed to discover and explain the neural and kinematic aspects of imitation. However, much of this research has featured single participants imitating in response to pre-recorded video stimuli. This is in spite of findings that show reduced neural activation to video vs. real life movement stimuli, particularly in the motor cortex. We investigated the degree to which video stimuli may affect the imitation process using a novel motion tracking paradigm with high spatial and temporal resolution. We recorded 14 positions on the hands, arms, and heads of two individuals in an imitation experiment. One individual freely moved within given parameters (moving balls across a series of pegs) and a second participant imitated. This task was performed with either simple (one ball) or complex (three balls) movement difficulty, and either face-to-face or via a live video projection. After an exploratory analysis, three dependent variables were chosen for examination: 3D grip position, joint angles in the arm, and grip aperture. A cross-correlation and multivariate analysis revealed that object-directed imitation task accuracy (as represented by grip position) was reduced in video compared to face-to-face feedback, and in complex compared to simple difficulty. This was most prevalent in the left-right and forward-back motions, relevant to the imitator sitting face-to-face with the actor or with a live projected video of the same actor. The results suggest that for tasks which require object-directed imitation, video stimuli may not be an ecologically valid way to present task materials. However, no similar effects were found in the joint angle and grip aperture variables, suggesting that there are limits to the influence of video stimuli on imitation. The implications of these results are discussed with regards to previous findings, and with suggestions for future experimentation.
Resumo:
Pós-graduação em Ciência da Informação - FFC
Resumo:
Speech is often a multimodal process, presented audiovisually through a talking face. One area of speech perception influenced by visual speech is speech segmentation, or the process of breaking a stream of speech into individual words. Mitchel and Weiss (2013) demonstrated that a talking face contains specific cues to word boundaries and that subjects can correctly segment a speech stream when given a silent video of a speaker. The current study expanded upon these results, using an eye tracker to identify highly attended facial features of the audiovisual display used in Mitchel and Weiss (2013). In Experiment 1, subjects were found to spend the most time watching the eyes and mouth, with a trend suggesting that the mouth was viewed more than the eyes. Although subjects displayed significant learning of word boundaries, performance was not correlated with gaze duration on any individual feature, nor was performance correlated with a behavioral measure of autistic-like traits. However, trends suggested that as autistic-like traits increased, gaze duration of the mouth increased and gaze duration of the eyes decreased, similar to significant trends seen in autistic populations (Boratston & Blakemore, 2007). In Experiment 2, the same video was modified so that a black bar covered the eyes or mouth. Both videos elicited learning of word boundaries that was equivalent to that seen in the first experiment. Again, no correlations were found between segmentation performance and SRS scores in either condition. These results, taken with those in Experiment, suggest that neither the eyes nor mouth are critical to speech segmentation and that perhaps more global head movements indicate word boundaries (see Graf, Cosatto, Strom, & Huang, 2002). Future work will elucidate the contribution of individual features relative to global head movements, as well as extend these results to additional types of speech tasks.
Resumo:
Identification and tracking of objects in specific environments such as harbors or security areas is a matter of great importance nowadays. With this purpose, numerous systems based on different technologies have been developed, resulting in a great amount of gathered data displayed through a variety of interfaces. Such amount of information has to be evaluated by human operators in order to take the correct decisions, sometimes under highly critical situations demanding both speed and accuracy. In order to face this problem we describe IDT-3D, a platform for identification and tracking of vessels in a harbour environment able to represent fused information in real time using a Virtual Reality application. The effectiveness of using IDT-3D as an integrated surveillance system is currently under evaluation. Preliminary results point to a significant decrease in the times of reaction and decision making of operators facing up a critical situation. Although the current application focus of IDT-3D is quite specific, the results of this research could be extended to the identification and tracking of targets in other controlled environments of interest as coastlines, borders or even urban areas.
Resumo:
Despite that Critical Infrastructures (CIs) security and surveillance are a growing concern for many countries and companies, Multi Robot Systems (MRSs) have not been yet broadly used in this type of facilities. This dissertation presents a novel study of the challenges arisen by the implementation of this type of systems and proposes solutions to specific problems. First, a comprehensive analysis of different types of CIs has been carried out, emphasizing the influence of the different characteristics of the facilities in the design of a security and surveillance MRS. One of the most important needs for the surveillance of a CI is the detection of intruders. From a technical point of view this problem can be abstracted as equivalent to the Detection and Tracking of Mobile Objects (DATMO). This dissertation proposes algorithms to solve this specific problem in a CI environment. Using 3D range images of the environment as input data, two detection algorithms for ground robots have been developed. These detection algorithms provide a list of moving objects in the robot detection area. Direct image differentiation and computer vision techniques are used when the robot is static. Alternatively, multi-layer ground reconstructions are compared to detect the dynamic objects when the robot is moving. Since CIs usually spread over large areas, it is very useful to incorporate aerial vehicles in the surveillance MRS. Therefore, a moving object detection algorithm for aerial vehicles has been also developed. This algorithm compares the real optical flow obtained from a down-face oriented camera with an artificial optical flow computed using a RANSAC based homography matrix. Two tracking algorithms have been developed to follow the moving objects trajectories. These algorithms can efficiently handle occlusions and crossings, as well as exchange information among robots. The multirobot tracking can be applied to any type of communication structure: centralized, decentralized or a combination of both. Even more, the developed tracking algorithms are independent of the detection algorithms and could be potentially used with other detection procedures or even with static sensors, such as cameras. In addition, using the 3D point clouds available to the robots, a relative localization algorithm has been developed to improve the position estimation of a given robot with observations from other robots. All the developed algorithms have been extensively tested in different simulated CIs using the Webots robotics simulator. Furthermore, the algorithms have also been validated with real robots operating in real scenarios. In conclusion, this dissertation presents a multirobot approach to Critical Infrastructure Surveillance, mainly focusing on Detecting and Tracking Dynamic Objects.