857 results for Facial palsy
Abstract:
Feature extraction and selection are critical processes in developing facial expression recognition (FER) systems. While many algorithms have been proposed for these processes, direct comparison between texture, geometry and their fusion, as well as between multiple selection algorithms has not been found for spontaneous FER. This paper addresses this issue by proposing a unified framework for a comparative study on the widely used texture (LBP, Gabor and SIFT) and geometric (FAP) features, using Adaboost, mRMR and SVM feature selection algorithms. Our experiments on the Feedtum and NVIE databases demonstrate the benefits of fusing geometric and texture features, where SIFT+FAP shows the best performance, while mRMR outperforms Adaboost and SVM. In terms of computational time, LBP and Gabor perform better than SIFT. The optimal combination of SIFT+FAP+mRMR also exhibits a state-of-the-art performance.
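The mRMR (minimum redundancy, maximum relevance) selection named in this abstract can be illustrated with a small greedy sketch. This is not the paper's implementation: it assumes absolute Pearson correlation as a stand-in for mutual information, and the function name `mrmr_select` and the toy data are hypothetical.

```python
import numpy as np

def mrmr_select(X, y, k):
    """Greedy mRMR-style selection of k feature indices from X (n_samples, n_features)."""
    n_features = X.shape[1]
    # Relevance: |corr(feature, target)| (assumption: correlation replaces mutual information).
    relevance = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)])
    # Redundancy: |corr(feature_i, feature_j)| for every feature pair.
    redundancy = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        remaining = [j for j in range(n_features) if j not in selected]
        # Score = relevance minus mean redundancy with the features already chosen.
        scores = [relevance[j] - redundancy[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected

# Toy data built from three mutually orthogonal patterns (hypothetical, not FER features).
a = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
b = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
c = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
y = 2 * a + b                                   # numeric target for the toy example
X = np.column_stack([a, a + 0.1 * c, b, c])     # feature 1 is a near-duplicate of feature 0
chosen = mrmr_select(X, y, 2)
print(chosen)  # -> [0, 2]
```

A ranking by relevance alone would pick features 0 and 1, which carry almost the same information; the redundancy penalty makes the second pick the complementary feature 2 instead.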
Abstract:
Large margin learning approaches, such as support vector machines (SVM), have been successfully applied to numerous classification tasks, especially automatic facial expression recognition. The risk of such approaches, however, is their sensitivity to large margin losses caused by noisy training examples and outliers, a common problem in affective computing (i.e., manual coding at the frame level is tedious, so coarse labels are normally assigned). In this paper, we leverage the relaxation of the parallel-hyperplanes constraint and propose the use of modified correlation filters (MCF). The MCF is similar in spirit to SVMs and correlation filters, but with the key difference of optimizing only a single hyperplane. We demonstrate the superiority of MCF over current techniques on a battery of experiments.
Abstract:
Automated feature extraction and correspondence determination is an extremely important problem in the face recognition community as it often forms the foundation of the normalisation and database construction phases of many recognition and verification systems. This paper presents a completely automatic feature extraction system based upon a modified volume descriptor. These features form a stable descriptor for faces and are utilised in a reversible jump Markov chain Monte Carlo correspondence algorithm to automatically determine correspondences which exist between faces. The developed system is invariant to changes in pose and occlusion and results indicate that it is also robust to minor face deformations which may be present with variations in expression.
Abstract:
Facial expression is an important channel of human social communication. Facial expression recognition (FER) aims to perceive and understand emotional states of humans based on information in the face. Building robust and high performance FER systems that can work in real-world video is still a challenging task, due to the various unpredictable facial variations and complicated exterior environmental conditions, as well as the difficulty of choosing a suitable type of feature descriptor for extracting discriminative facial information. Facial variations caused by factors such as pose, age, gender, race and occlusion can exert a profound influence on robustness, while a suitable feature descriptor largely determines the performance. Most attention in FER to date has been paid to addressing variations in pose and illumination. No approach has been reported on handling face localization errors and relatively few on overcoming facial occlusions, although the significant impact of these two variations on performance has been proved and highlighted in many previous studies. Many texture and geometric features have been previously proposed for FER. However, few comparison studies have been conducted to explore the performance differences between different features and examine the performance improvement arising from fusion of texture and geometry, especially on data with spontaneous emotions. The majority of existing approaches are evaluated on databases with posed or induced facial expressions collected in laboratory environments, whereas little attention has been paid to recognizing naturalistic facial expressions on real-world data. This thesis investigates techniques for building robust and high performance FER systems based on a number of established feature sets. It comprises contributions towards three main objectives: (1) Robustness to face localization errors and facial occlusions. An approach is proposed to handle face localization errors and facial occlusions using Gabor based templates. Template extraction algorithms are designed to collect a pool of local template features and template matching is then performed to convert these templates into distances, which are robust to localization errors and occlusions. (2) Improvement of performance through feature comparison, selection and fusion. A comparative framework is presented to compare the performance between different features and different feature selection algorithms, and examine the performance improvement arising from fusion of texture and geometry. The framework is evaluated for both discrete and dimensional expression recognition on spontaneous data. (3) Evaluation of performance in the context of real-world applications. A system is selected and applied to discriminating posed versus spontaneous expressions and recognizing naturalistic facial expressions. A database is collected from real-world recordings and is used to explore feature differences between standard database images and real-world images, as well as between real-world images and real-world video frames. The performance evaluations are based on the JAFFE, CK, Feedtum, NVIE, Semaine and self-collected QUT databases. The results demonstrate high robustness of the proposed approach to the simulated localization errors and occlusions. Texture and geometry have different contributions to the performance of discrete and dimensional expression recognition, as well as posed versus spontaneous emotion discrimination. These investigations provide useful insights into enhancing robustness and achieving high performance of FER systems, and putting them into real-world applications.
Abstract:
Facial landmarks play an important role in face recognition. They serve different steps of the recognition such as pose estimation, face alignment, and local feature extraction. Recently, cascaded shape regression has been proposed to accurately locate facial landmarks. A large number of weak regressors are cascaded in a sequence to fit face shapes to the correct landmark locations. In this paper, we propose to improve the method by applying gradual training. With this training, the regressors are not aimed directly at the true locations. Instead, the sequence is divided into successive parts, each of which is aimed at intermediate targets between the initial and the true locations. We also investigate the incorporation of pose information in the cascaded model. The aim is to find out whether the model can be directly used to estimate head pose. Experiments on the Annotated Facial Landmarks in the Wild database have shown that the proposed method is able to improve the localization and give accurate estimates of pose.
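The idea of intermediate targets in gradual training can be sketched as follows. The abstract does not say how the intermediate targets are spaced, so this sketch assumes evenly spaced points on the straight line between the initial and the true landmark locations; the function name `intermediate_targets` and the example coordinates are hypothetical.

```python
import numpy as np

def intermediate_targets(initial_shape, true_shape, n_parts):
    """Return one target shape per part of the cascade; the last one equals true_shape."""
    initial_shape = np.asarray(initial_shape, dtype=float)
    true_shape = np.asarray(true_shape, dtype=float)
    # Assumption: targets are evenly spaced between the initial and the true shape.
    fractions = np.arange(1, n_parts + 1) / n_parts
    return [initial_shape + f * (true_shape - initial_shape) for f in fractions]

# Two landmarks as (x, y) rows; values are made up for illustration.
initial = np.array([[0.0, 0.0], [10.0, 0.0]])
truth = np.array([[4.0, 2.0], [14.0, 6.0]])
targets = intermediate_targets(initial, truth, n_parts=4)
for part, target in enumerate(targets, start=1):
    print(part, target.tolist())
```

Under this assumption, the regressors in part 1 would be trained towards the first target, those in part 2 towards the second, and only the final part towards the annotated locations.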
Abstract:
Techniques to improve the automated analysis of natural and spontaneous facial expressions have been developed. The outcome of the research has applications in several fields, including national security (e.g., expression-invariant face recognition), education (e.g., affect-aware interfaces), and mental and physical health (e.g., depression and pain recognition).
Abstract:
Facial expression recognition (FER) systems must ultimately work on real data in uncontrolled environments, although most research studies have been conducted on lab-based data with posed or evoked facial expressions obtained in pre-set laboratory environments. It is very difficult to obtain data in real-world situations because privacy laws prevent unauthorized capture and use of video from events such as funerals, birthday parties, marriages, etc. It is a challenge to acquire such data on a scale large enough for benchmarking algorithms. Although video obtained from TV, movies or postings on the World Wide Web may also contain ‘acted’ emotions and facial expressions, such video may be more ‘realistic’ than the lab-based data currently used by most researchers. But is it? One way of testing this is to compare feature distributions and FER performance. This paper describes a database that has been collected from television broadcasts and the World Wide Web, containing a range of environmental and facial variations expected in real conditions, and uses it to answer this question. A fully automatic system that uses a fusion based approach for FER on such data is introduced for performance evaluation. Performance improvements arising from the fusion of point-based texture and geometry features, and the robustness to image scale variations, are experimentally evaluated on this image and video dataset. Differences in FER performance between lab-based and realistic data, between different feature sets, and between different train-test data splits are investigated.
Abstract:
Robust facial expression recognition (FER) under occluded face conditions is challenging. It requires robust algorithms of feature extraction and investigations into the effects of different types of occlusion on the recognition performance to gain insight. Previous FER studies in this area have been limited. They have spanned recovery strategies for loss of local texture information and testing limited to only a few types of occlusion and predominantly a matched train-test strategy. This paper proposes a robust approach that employs a Monte Carlo algorithm to extract a set of Gabor based part-face templates from gallery images and converts these templates into template match distance features. The resulting feature vectors are robust to occlusion because occluded parts are covered by some but not all of the random templates. The method is evaluated using facial images with occluded regions around the eyes and the mouth, randomly placed occlusion patches of different sizes, and near-realistic occlusion of eyes with clear and solid glasses. Both matched and mismatched train and test strategies are adopted to analyze the effects of such occlusion. Overall recognition performance and the performance for each facial expression are investigated. Experimental results on the Cohn-Kanade and JAFFE databases demonstrate the high robustness and fast processing speed of our approach, and provide useful insight into the effects of occlusion on FER. The results on parameter sensitivity demonstrate a certain level of robustness of the approach to changes in the orientation and scale of Gabor filters, the size of templates, and occlusion ratios. Performance comparisons with previous approaches show that the proposed method is more robust to occlusion, with lower reductions in accuracy from occlusion of eyes or mouth.
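The step of turning randomly sampled templates into distance features can be sketched roughly as below. This is a simplified illustration and not the paper's method: it assumes raw pixel patches in place of Gabor responses, a plain L2 distance, and a small search window around each template location; all function names and the random test images are hypothetical.

```python
import numpy as np

def sample_templates(gallery, n_templates, size, rng):
    """Randomly pick (patch, row, col) templates from a stack of gallery images."""
    n, h, w = gallery.shape
    templates = []
    for _ in range(n_templates):
        i = rng.integers(n)
        r = rng.integers(h - size + 1)
        c = rng.integers(w - size + 1)
        templates.append((gallery[i, r:r + size, c:c + size].copy(), r, c))
    return templates

def distance_features(image, templates, search=1):
    """One feature per template: smallest L2 distance within +/- search pixels."""
    h, w = image.shape
    feats = []
    for patch, r, c in templates:
        size = patch.shape[0]
        best = np.inf
        for dr in range(-search, search + 1):
            for dc in range(-search, search + 1):
                rr, cc = r + dr, c + dc
                # Skip window positions that fall outside the image.
                if 0 <= rr <= h - size and 0 <= cc <= w - size:
                    d = np.linalg.norm(image[rr:rr + size, cc:cc + size] - patch)
                    best = min(best, d)
        feats.append(best)
    return np.array(feats)

rng = np.random.default_rng(0)
gallery = rng.random((5, 16, 16))            # stand-in for gallery face images
templates = sample_templates(gallery, n_templates=20, size=4, rng=rng)
features = distance_features(gallery[0], templates)

occluded = gallery[0].copy()
occluded[:8, :8] = 0.0                       # black out the top-left quadrant
features_occ = distance_features(occluded, templates)
unchanged = int(np.sum(np.isclose(features, features_occ)))
print(f"{unchanged} of {len(templates)} features unaffected by the occlusion")
```

Templates whose search window lies entirely outside the occluded quadrant produce identical features before and after occlusion, which is the intuition behind the robustness claim; the small search window also tolerates a shift of one pixel in face localization.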
Abstract:
Facial expression recognition (FER) has developed dramatically in recent years, thanks to advancements in related fields, especially machine learning, image processing and human recognition. Accordingly, the impact and potential usage of automatic FER have been growing in a wide range of applications, including human-computer interaction, robot control and driver state surveillance. However, to date, robust recognition of facial expressions from images and videos is still a challenging task due to the difficulty in accurately extracting the useful emotional features. These features are often represented in different forms, such as static, dynamic, point-based geometric or region-based appearance. Facial movement features, which include feature position and shape changes, are generally caused by the movements of facial elements and muscles during the course of emotional expression. The facial elements, especially key elements, will constantly change their positions when subjects are expressing emotions. As a consequence, the same feature in different images usually has different positions. In some cases, the shape of the feature may also be distorted due to subtle facial muscle movements. Therefore, for any feature representing a certain emotion, the geometric-based position and appearance-based shape normally change from one image to another in image databases, as well as in videos. Movement features of this kind represent a rich pool of both static and dynamic characteristics of expressions, which play a critical role in FER. The vast majority of past work on FER does not take the dynamics of facial expressions into account. Some efforts have been made on capturing and utilizing facial movement features, and almost all of them are static based. These efforts try to adopt either geometric features of the tracked facial points, or appearance differences between holistic facial regions in consecutive frames, or texture and motion changes in local facial regions. Although they have achieved promising results, these approaches often require accurate location and tracking of facial points, which remains problematic.
Abstract:
To evaluate the validity of the ActiGraph accelerometer for the measurement of physical activity intensity in children and adolescents with cerebral palsy (CP) using oxygen uptake (VO2) as the criterion measure. Thirty children and adolescents with CP (mean age 12.6 ± 2.0 years) wore an ActiGraph 7164 and a Cosmed K4b2 portable indirect calorimeter during four activities: quiet sitting, comfortable paced walking, brisk paced walking and fast paced walking. VO2 was converted to METs and activity energy expenditure and classified as sedentary, light or moderate-to-vigorous intensity according to the conventions for children. Mean ActiGraph counts min⁻¹ were classified as sedentary, light or moderate-to-vigorous (MVPA) intensity using four different sets of cut-points. VO2 and counts min⁻¹ increased significantly with increases in walking speed (P < 0.001). Receiver operating characteristic (ROC) curve analysis indicated that, of the four sets of cut-points evaluated, the Evenson et al. (J Sports Sci 26(14):1557-1565, 2008) cut-points had the highest classification accuracy for sedentary (92%) and MVPA (91%), as well as the second highest classification accuracy for light intensity physical activity (67%). A ROC curve analysis of data from our participants yielded a CP-specific cut-point for MVPA that was lower than the Evenson cut-point (2,012 vs. 2,296 counts min⁻¹); however, the difference in classification accuracy was not statistically significant: 94% (95% CI = 88.2-97.7%) vs. 91% (95% CI = 83.5-96.5%). In conclusion, among children and adolescents with CP, the ActiGraph is able to differentiate between different intensities of walking. The use of the Evenson cut-points will permit the estimation of time spent in MVPA and allows comparisons to be made between activity measured in typically developing adolescents and adolescents with CP. © 2011 Springer-Verlag.
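How a cut-point turns accelerometer counts into an intensity label can be shown with a short sketch. The two MVPA thresholds (2,296 and 2,012 counts per minute) are taken from the abstract; the sedentary threshold of 100 counts per minute is an assumption that the abstract does not state, and the function name and sample values are hypothetical.

```python
EVENSON_MVPA = 2296       # Evenson MVPA cut-point, from the abstract
CP_SPECIFIC_MVPA = 2012   # CP-specific MVPA cut-point, from the abstract

def classify_counts(counts_per_min, mvpa_cut, sedentary_cut=100):
    """Map ActiGraph counts/min to an intensity label using two thresholds.

    sedentary_cut=100 is an assumed value, not one given in the abstract.
    """
    if counts_per_min <= sedentary_cut:
        return "sedentary"
    if counts_per_min < mvpa_cut:
        return "light"
    return "MVPA"

# Hypothetical minute-level values, classified under both MVPA cut-points.
for counts in (50, 1500, 2100, 3000):
    print(counts,
          classify_counts(counts, EVENSON_MVPA),
          classify_counts(counts, CP_SPECIFIC_MVPA))
```

Only values between the two thresholds (such as 2,100 here) are labelled differently by the two cut-points, which fits the abstract's finding that their classification accuracies did not differ significantly.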
Abstract:
Background: Promoting participation in physical activity (PA) is an important means of promoting healthy growth and development in children with cerebral palsy (CP). The ActiGraph is a uniaxial accelerometer that provides a real-time measure of PA intensity, duration and frequency. Its small, lightweight design makes it a promising measure of activity in children with CP. To date, no study has validated the use of accelerometry as a measure of PA in ambulant adolescents with CP. Objectives: To evaluate the validity of the ActiGraph accelerometer for measuring PA intensity in adolescents with CP, using oxygen consumption (VO2), measured using portable indirect calorimetry (Cosmed K4b2), as the criterion measure. Design: Validation study. Participants/Setting: Ambulant adolescents with CP aged 10–16 years, GMFCS rating of I-III. The recruitment target is 30 (10 in each GMFCS level). Materials/Methods: Participants wore the ActiGraph (counts/min) and a Cosmed K4b2 indirect calorimeter (mL/kg/min) during six activity trials: quiet sitting (QS), comfortable paced walking (CPW), brisk paced walking (BPW), fast paced walking (FPW), a ball-kicking protocol (KP) and a ball-throwing protocol (TP). MET levels (multiples of resting metabolism) for each activity were predicted from ActiGraph counts using the Freedson age-specific equation (Freedson et al. 2005) and compared with actual MET levels measured by the Cosmed. Predicted and measured METs for each activity trial were classified as light (> 1.5 METs and < 4.6 METs) or moderate to vigorous intensity (≥ 4.6 METs). Results: To date, 36 bouts of activity have been completed (6 participants x 6 activities). Mean VO2 increased linearly as the intensity of the walking activity increased (CPW=9.47±2.16, BPW=14.06±4.38, FPW=19.21±5.68 ml/kg/min) and ActiGraph counts reflected this pattern (CPW=1099±574, BPW=2233±797, FPW=4707±1013 counts/min). The throwing protocol recorded the lowest VO2 (TP=7.50±3.86 ml/kg/min) and lowest overall counts/min (TP=31±27 counts/min). When each of the 36 bouts was classified as either light or moderate to vigorous intensity using measured VO2 as the criterion measure, the Freedson equation correctly classified 28 of 36 bouts (78%). Conclusion/Clinical Implications: These preliminary findings suggest that there is a relationship between the intensity of PA and direct measurement of oxygen consumption, and that the ActiGraph may therefore be a promising tool for accurately measuring free-living PA in the community. Further data collection of the complete sample will enable secondary analysis of the relationship between PA and severity of CP (GMFCS level).
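The classification rule and the reported agreement figure can be reproduced with a few lines. The thresholds (light above 1.5 and below 4.6 METs, moderate to vigorous at 4.6 METs or more) and the 28 of 36 count come from the abstract; the label for values at or below 1.5 METs, the function names and the sample MET pairs are assumptions made for illustration.

```python
def classify_mets(mets):
    """Label a MET value using the thresholds stated in the abstract."""
    if mets >= 4.6:
        return "moderate-to-vigorous"
    if mets > 1.5:
        return "light"
    # The abstract does not name this category; "sedentary" is an assumed label.
    return "sedentary"

def agreement(predicted, measured):
    """Fraction of bouts where predicted and measured METs fall in the same class."""
    matches = sum(classify_mets(p) == classify_mets(m)
                  for p, m in zip(predicted, measured))
    return matches / len(measured)

# Hypothetical predicted (accelerometer) and measured (calorimeter) MET values.
predicted = [2.0, 5.0, 4.0, 6.1]
measured = [2.5, 4.8, 4.9, 5.5]
print(agreement(predicted, measured))   # 0.75

# Agreement reported in the abstract: 28 of 36 bouts.
print(f"{28 / 36:.0%}")                 # 78%
```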
Abstract:
The proliferation of news reports published on websites and news sharing among social media users necessitates effective techniques for analysing the image, text and video data related to news topics. This paper presents the first study to classify affective facial images on emerging news topics. The proposed system dynamically monitors and selects the current hot (of great interest) news topics with strong affective interestingness using textual keywords in news articles and social media discussions. Images from the selected hot topics are extracted and classified into three emotion categories (positive, neutral and negative) based on the facial expressions of subjects in the images. Performance evaluations on two facial image datasets collected from real-world resources demonstrate the applicability and effectiveness of the proposed system in affective classification of facial images in news reports. Facial expression shows high consistency with the affective textual content in news reports for positive emotion, while only low correlation has been observed for neutral and negative emotion. The system can be directly used for applications such as assisting editors in choosing photos with a proper affective semantic for a certain topic during news report preparation.
Abstract:
We employed a novel cueing paradigm to assess whether dynamically versus statically presented facial expressions differentially engaged predictive visual mechanisms. Participants were presented with a cueing stimulus that was either the static depiction of a low intensity expressed emotion or a dynamic sequence evolving from a neutral expression to the low intensity expressed emotion. Following this cue and a backward mask, participants were presented with a probe face that displayed either the same emotion (congruent) or a different emotion (incongruent) with respect to that displayed by the cue, although expressed at a high intensity. The probe face had either the same or a different identity from the cued face. The participants' task was to indicate whether or not the probe face showed the same emotion as the cue. Dynamic cues and same identity cues both led to a greater tendency towards congruent responding, although these factors did not interact. Facial motion also led to faster responding when the probe face was emotionally congruent to the cue. We interpret these results as indicating that dynamic facial displays preferentially invoke predictive visual mechanisms, and suggest that motoric simulation may provide an important basis for the generation of predictions in the visual system.
Abstract:
Emotionally arousing events can distort our sense of time. We used a mixed block/event-related fMRI design to establish the neural basis for this effect. Nineteen participants were asked to judge whether angry, happy and neutral facial expressions that varied in duration (from 400 to 1,600 ms) were closer in duration to either a short or long duration they had learnt previously. Time was overestimated for both angry and happy expressions compared to neutral expressions. For faces presented for 700 ms, facial emotion modulated activity in regions of the timing network (Wiener et al., NeuroImage 49(2):1728–1740, 2010), namely the right supplementary motor area (SMA) and the junction of the right inferior frontal gyrus and anterior insula (IFG/AI). Reaction times were slowest when faces were displayed for 700 ms, indicating increased decision making difficulty. Taken together with existing electrophysiological evidence (Ng et al., Neuroscience, doi: 10.3389/fnint.2011.00077, 2011), the effects are consistent with the idea that facial emotion moderates temporal decision making and that the right SMA and right IFG/AI are key neural structures responsible for this effect.
Abstract:
Because moving depictions of face emotion have greater ecological validity than their static counterparts, it has been suggested that still photographs may not engage ‘authentic’ mechanisms used to recognize facial expressions in everyday life. To date, however, no neuroimaging studies have adequately addressed the question of whether the processing of static and dynamic expressions relies upon different brain substrates. To address this, we performed a functional magnetic resonance imaging (fMRI) experiment wherein participants made emotional expression discrimination and Sex discrimination judgements to static and moving face images. Compared to Sex discrimination, Emotion discrimination was associated with widespread increased activation in regions of occipito-temporal, parietal and frontal cortex. These regions were activated both by moving and by static emotional stimuli, indicating a general role in the interpretation of emotion. However, portions of the inferior frontal gyri and supplementary/pre-supplementary motor area showed a task by motion interaction. These regions were most active during emotion judgements to static faces. Our results demonstrate a common neural substrate for recognizing static and moving facial expressions, but suggest a role for the inferior frontal gyrus in supporting simulation processes that are invoked more strongly to disambiguate static emotional cues.