118 results for SIFT keypoints


Relevance:

10.00%

Publisher:

Abstract:

Spontaneous facial expressions differ from posed ones in appearance, timing and accompanying head movements. Still images cannot provide timing or head movement information directly. However, the distances between key points on a face, extracted from a still image using active shape models, can indirectly capture some movement and pose changes. This information is superposed on information about non-rigid facial movement that is also part of the expression. Does geometric information improve the discrimination between spontaneous and posed facial expressions arising from discrete emotions? We investigate the performance of a machine vision system for discriminating between posed and spontaneous versions of six basic emotions that uses SIFT appearance-based features and FAP geometric features. Experimental results on the NVIE database demonstrate that fusion of geometric information leads only to marginal improvement over appearance features. Using fused features, surprise is the easiest emotion to distinguish (83.4% accuracy), while disgust is the most difficult (76.1%). Our results show that the facial regions that matter for discriminating the posed from the spontaneous version of an emotion differ from those that matter for classifying that emotion against the others. The distribution of the selected SIFT features shows that the mouth is more important for sadness and the nose for surprise, whereas both the nose and mouth are important for disgust, fear, and happiness. The eyebrows, eyes, nose and mouth are all important for anger.
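
As a rough illustration of this kind of appearance-geometry fusion (not the authors' exact pipeline), the sketch below computes SIFT descriptors at given facial landmarks with OpenCV, approximates the FAP-style geometric cues by inter-landmark distances, and feeds the concatenation to a linear SVM; landmarks, labelled_faces and the patch size are hypothetical placeholders.

    import cv2
    import numpy as np
    from itertools import combinations
    from sklearn.svm import LinearSVC

    def fused_features(gray_face, landmarks, patch_size=16.0):
        # SIFT appearance descriptors computed at the supplied landmark locations
        sift = cv2.SIFT_create()
        keypoints = [cv2.KeyPoint(float(x), float(y), patch_size) for (x, y) in landmarks]
        _, desc = sift.compute(gray_face, keypoints)    # one 128-D descriptor per landmark
        # crude stand-in for FAP geometric features: all pairwise landmark distances
        dists = [np.hypot(x1 - x2, y1 - y2) for (x1, y1), (x2, y2) in combinations(landmarks, 2)]
        return np.concatenate([desc.ravel(), dists])

    # Hypothetical usage: labelled_faces is a list of (grayscale image, landmark list) pairs
    # and y holds 0/1 labels for posed versus spontaneous.
    # X = np.array([fused_features(img, pts) for img, pts in labelled_faces])
    # clf = LinearSVC().fit(X, y)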

Relevance:

10.00%

Publisher:

Abstract:

Facial expression recognition (FER) algorithms mainly focus on classification into a small discrete set of emotions or on representation of emotions using facial action units (AUs). Dimensional representation of emotions as continuous values in an arousal-valence space is relatively less investigated. It is not fully known whether fusion of geometric and texture features results in better dimensional representation of spontaneous emotions. Moreover, the performance of many previously proposed approaches to dimensional representation has not been evaluated thoroughly on publicly available databases. To address these limitations, this paper presents an evaluation framework for dimensional representation of spontaneous facial expressions using texture and geometric features. SIFT, Gabor and LBP features are extracted around facial fiducial points and fused with FAP distance features. The CFS algorithm is adopted for discriminative texture feature selection. Experimental results on the publicly accessible NVIE database demonstrate that fusion of texture and geometry does not lead to much better performance than texture alone, but does yield a significant improvement over geometry alone. LBP features perform best when fused with geometric features. Distributions of arousal and valence for different emotions obtained via the feature extraction process are compared with those obtained from subjective ground-truth values assigned by viewers. Predicted valence is found to have a distribution more similar to the ground truth than arousal in terms of covariance and Bhattacharyya distance, but it shows a greater distance between the means.
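
For reference, when the predicted and ground-truth arousal-valence scatters are each summarised by a Gaussian, the Bhattacharyya distance used for this kind of comparison takes the standard form (our formula, not quoted from the paper):

    \[
    D_B \;=\; \tfrac{1}{8}\,(\mu_1-\mu_2)^{\top}\,\Sigma^{-1}\,(\mu_1-\mu_2)
          \;+\; \tfrac{1}{2}\,\ln\!\frac{\det\Sigma}{\sqrt{\det\Sigma_1\,\det\Sigma_2}},
    \qquad \Sigma = \tfrac{1}{2}(\Sigma_1+\Sigma_2).
    \]

The first term captures the separation of the means and the second the mismatch of the covariances, which is consistent with the observation above that predicted valence matches the ground truth in covariance terms but not in the means.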

Relevance:

10.00%

Publisher:

Abstract:

Feature extraction and selection are critical processes in developing facial expression recognition (FER) systems. While many algorithms have been proposed for these processes, direct comparisons between texture, geometry and their fusion, and between multiple feature selection algorithms, are lacking for spontaneous FER. This paper addresses this issue by proposing a unified framework for a comparative study of the widely used texture (LBP, Gabor and SIFT) and geometric (FAP) features, using the Adaboost, mRMR and SVM feature selection algorithms. Our experiments on the Feedtum and NVIE databases demonstrate the benefits of fusing geometric and texture features, where SIFT+FAP shows the best performance, while mRMR outperforms Adaboost and SVM. In terms of computational time, LBP and Gabor perform better than SIFT. The optimal combination of SIFT+FAP+mRMR also exhibits state-of-the-art performance.
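
The mRMR criterion referred to here can be sketched in a few lines; the greedy difference form below (relevance minus average redundancy, both measured by mutual information) is one common variant and only an approximation of whatever configuration the paper used.

    import numpy as np
    from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

    def mrmr(X, y, k):
        # Relevance of every feature to the class labels
        relevance = mutual_info_classif(X, y)
        selected = [int(np.argmax(relevance))]
        while len(selected) < k:
            best_j, best_score = None, -np.inf
            for j in range(X.shape[1]):
                if j in selected:
                    continue
                # Average redundancy of candidate j with the already selected features
                redundancy = np.mean([mutual_info_regression(X[:, [j]], X[:, s])[0]
                                      for s in selected])
                score = relevance[j] - redundancy
                if score > best_score:
                    best_j, best_score = j, score
            selected.append(best_j)
        return selected

    # Hypothetical usage: keep the 50 highest-scoring fused SIFT+FAP features
    # idx = mrmr(X_train, y_train, k=50); X_selected = X_train[:, idx]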

Relevance:

10.00%

Publisher:

Abstract:

Image representations derived from simplified models of the primary visual cortex (V1), such as HOG and SIFT, achieve good performance in a myriad of visual classification tasks, including object recognition/detection, pedestrian detection and facial expression classification. A central question in the vision, learning and neuroscience communities is why these architectures perform so well. In this paper, we offer a unique perspective on this question by subsuming the role of V1-inspired features directly within a linear support vector machine (SVM). We demonstrate that a specific class of such features, in conjunction with a linear SVM, can be reinterpreted as inducing a weighted margin on the Kronecker basis expansion of an image. This new viewpoint on the role of V1-inspired features allows us to answer fundamental questions about their uniqueness and redundancies, and to offer substantial improvements in computational and storage efficiency.
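
A back-of-the-envelope way to see the "features inside the SVM" claim, as a simplification of the paper's Kronecker-basis argument: if the feature map is linear, Phi(x) = Mx (a bank of linear filters applied to the image), then a linear SVM on Phi(x) is just a linear SVM on the pixels with a re-weighted margin,

    \[
    f(x) \;=\; w^{\top}\Phi(x) + b \;=\; v^{\top}x + b, \qquad v = M^{\top}w,
    \]
    \[
    \min_{w \,:\, M^{\top}w = v} \|w\|^{2} \;=\; v^{\top}\,(M^{\top}M)^{+}\,v,
    \]

so the filter bank M never has to be applied explicitly; it only changes the norm in which the margin is measured, which hints at where the computational and storage savings can come from.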

Relevance:

10.00%

Publisher:

Abstract:

Automated crowd counting has become an active field of computer vision research in recent years. Existing approaches are scene-specific, as they are designed to operate in the single camera viewpoint that was used to train the system. Real world camera networks often span multiple viewpoints within a facility, including many regions of overlap. This paper proposes a novel scene invariant crowd counting algorithm that is designed to operate across multiple cameras. The approach uses camera calibration to normalise features between viewpoints and to compensate for regions of overlap. This compensation is performed by constructing an 'overlap map' which provides a measure of how much an object at one location is visible within other viewpoints. An investigation into the suitability of various feature types and regression models for scene invariant crowd counting is also conducted. The features investigated include object size, shape, edges and keypoints. The regression models evaluated include neural networks, K-nearest neighbours, linear and Gaussian process regression. Our experiments demonstrate that accurate crowd counting was achieved across seven benchmark datasets, with optimal performance observed when all features were used and when Gaussian process regression was used. The combination of scene invariance and multi-camera crowd counting is evaluated by training the system on footage obtained from the QUT camera network and testing it on three cameras from the PETS 2009 database. Highly accurate crowd counting was observed, with a mean relative error of less than 10%. Our approach enables a pre-trained system to be deployed in a new environment without any additional training, bringing the field one step closer to a 'plug and play' system.
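
A toy version of the 'overlap map' idea (not the paper's calibration pipeline) is sketched below: each ground-plane cell is projected into every camera with a hypothetical ground-plane-to-image homography, and the count of views that see it can then be used to down-weight features in overlapping regions.

    import numpy as np

    def overlap_map(grid_xy, cameras):
        # grid_xy: (N, 2) array of ground-plane cell centres.
        # cameras: list of (H, width, height) tuples, where H is a 3x3
        #          ground-plane-to-image homography (hypothetical placeholders
        #          standing in for a calibrated camera network).
        counts = np.zeros(len(grid_xy))
        pts = np.hstack([grid_xy, np.ones((len(grid_xy), 1))])   # homogeneous coordinates
        for H, w, h in cameras:
            proj = pts @ H.T
            u = proj[:, 0] / proj[:, 2]
            v = proj[:, 1] / proj[:, 2]
            counts += (u >= 0) & (u < w) & (v >= 0) & (v < h)    # visible in this view?
        return counts

    # Features accumulated in a cell can be divided by max(count, 1)
    # so that people standing in overlapping regions are not counted twice.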

Relevance:

10.00%

Publisher:

Abstract:

This work aims to contribute to the reliability and integrity of the perceptual systems of unmanned ground vehicles (UGVs). A method is proposed to evaluate the quality of sensor data prior to its use in a perception system, by applying a quality metric to heterogeneous sensor data such as visual and infrared camera images. The concept is illustrated specifically with sensor data that is evaluated prior to its use in a standard SIFT feature extraction and matching technique. The method is then evaluated using various experimental data sets collected from a UGV in challenging environmental conditions, represented by the presence of airborne dust and smoke. In the first series of experiments, a motionless vehicle observes a 'reference' scene; the method is then extended to the case of a moving vehicle by compensating for its motion. This paper shows that it is possible to anticipate degradation of a perception algorithm by evaluating the input data prior to any actual execution of the algorithm.
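
The abstract does not specify the quality metric, so the sketch below uses a purely hypothetical stand-in, the entropy of the gradient-magnitude histogram (which tends to collapse when dust or smoke washes out texture), to decide whether SIFT matching is worth running at all.

    import cv2
    import numpy as np

    def contrast_entropy(gray):
        # Hypothetical quality proxy (not the paper's metric): entropy of the
        # gradient-magnitude histogram of the image.
        gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
        mag = cv2.magnitude(gx, gy)
        hist, _ = np.histogram(mag, bins=64)
        p = hist / max(hist.sum(), 1)
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    def match_if_usable(img_a, img_b, min_quality=3.0):    # threshold is an arbitrary assumption
        if min(contrast_entropy(img_a), contrast_entropy(img_b)) < min_quality:
            return None                                    # data judged too degraded for SIFT
        sift = cv2.SIFT_create()
        _, des_a = sift.detectAndCompute(img_a, None)
        _, des_b = sift.detectAndCompute(img_b, None)
        return cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des_a, des_b)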

Relevance:

10.00%

Publisher:

Abstract:

We present a determination of ΔfH298(HOO) based upon a negative-ion thermodynamic cycle. The photoelectron spectra of HOO⁻ and DOO⁻ were used to measure the molecular electron affinities (EAs). In a separate experiment, a tandem flowing afterglow-selected ion flow tube (FA-SIFT) was used to measure the forward and reverse rate constants for HOO⁻ + HC≡CH ⇌ HOOH + HC≡C⁻ at 298 K, which gave a value for ΔacidH298(HOO–H). The experiments yield the following values: EA(HOO) = 1.078 ± 0.006 eV; T0(X̃ HOO – Ã HOO) = 0.872 ± 0.007 eV; EA(DOO) = 1.077 ± 0.005 eV; T0(X̃ DOO – Ã DOO) = 0.874 ± 0.007 eV; ΔacidG298(HOO–H) = 369.5 ± 0.4 kcal mol⁻¹; and ΔacidH298(HOO–H) = 376.5 ± 0.4 kcal mol⁻¹. The acidity/EA thermochemical cycle yields values for the bond enthalpies of DH298(HOO–H) = 87.8 ± 0.5 kcal mol⁻¹ and D0(HOO–H) = 86.6 ± 0.5 kcal mol⁻¹. We recommend the following values for the heats of formation of the hydroperoxyl radical: ΔfH298(HOO) = 3.2 ± 0.5 kcal mol⁻¹ and ΔfH0(HOO) = 3.9 ± 0.5 kcal mol⁻¹; we recommend that these values supersede those listed in the current NIST-JANAF thermochemical tables.
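
The negative-ion cycle behind these numbers can be written out explicitly; using the conventional ionization energy of hydrogen, IE(H) ≈ 313.6 kcal mol⁻¹, and 1 eV ≈ 23.06 kcal mol⁻¹, the quoted quantities are mutually consistent:

    \[
    \mathrm{DH_{298}(HOO\!-\!H)} \;=\; \Delta_{\mathrm{acid}}H_{298}\mathrm{(HOO\!-\!H)} + \mathrm{EA(HOO)} - \mathrm{IE(H)}
    \]
    \[
    \approx\; 376.5 + (1.078 \times 23.06) - 313.6 \;\approx\; 87.8\ \mathrm{kcal\ mol^{-1}}.
    \]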

Relevance:

10.00%

Publisher:

Abstract:

Methyl, methyl-d₃, and ethyl hydroperoxide anions (CH₃OO⁻, CD₃OO⁻, and CH₃CH₂OO⁻) have been prepared by deprotonation of their respective hydroperoxides in a stream of helium buffer gas. Photodetachment with 364 nm (3.408 eV) radiation was used to measure the adiabatic electron affinities: EA[CH₃OO, X̃ ²A″] = 1.161 ± 0.005 eV, EA[CD₃OO, X̃ ²A″] = 1.154 ± 0.004 eV, and EA[CH₃CH₂OO, X̃ ²A″] = 1.186 ± 0.004 eV. The photoelectron spectra yield values for the term energies: ΔE(X̃ ²A″ – Ã ²A′)[CH₃OO] = 0.914 ± 0.005 eV, ΔE(X̃ ²A″ – Ã ²A′)[CD₃OO] = 0.913 ± 0.004 eV, and ΔE(X̃ ²A″ – Ã ²A′)[CH₃CH₂OO] = 0.938 ± 0.004 eV. A localized RO–O stretching mode was observed near 1100 cm⁻¹ for the ground state of all three radicals, and low-frequency R–O–O bending modes are also reported. Proton-transfer kinetics of the hydroperoxides have been measured in a tandem flowing afterglow-selected ion flow tube (FA-SIFT) to determine the gas-phase acidities of the parent hydroperoxides: ΔacidG298(CH₃OOH) = 367.6 ± 0.7 kcal mol⁻¹, ΔacidG298(CD₃OOH) = 367.9 ± 0.9 kcal mol⁻¹, and ΔacidG298(CH₃CH₂OOH) = 363.9 ± 2.0 kcal mol⁻¹. From these acidities we have derived the enthalpies of deprotonation: ΔacidH298(CH₃OOH) = 374.6 ± 1.0 kcal mol⁻¹, ΔacidH298(CD₃OOH) = 374.9 ± 1.1 kcal mol⁻¹, and ΔacidH298(CH₃CH₂OOH) = 371.0 ± 2.2 kcal mol⁻¹. Use of the negative-ion acidity/EA cycle provides the ROO–H bond enthalpies: DH298(CH₃OO–H) = 87.8 ± 1.0 kcal mol⁻¹, DH298(CD₃OO–H) = 87.9 ± 1.1 kcal mol⁻¹, and DH298(CH₃CH₂OO–H) = 84.8 ± 2.2 kcal mol⁻¹. We review the thermochemistry of the peroxyl radicals CH₃OO and CH₃CH₂OO. Using experimental bond enthalpies, DH298(ROO–H), and CBS/APNO ab initio electronic structure calculations for the energies of the corresponding hydroperoxides, we derive the heats of formation of the peroxyl radicals. The "electron affinity/acidity/CBS" cycle yields ΔfH298[CH₃OO] = 4.8 ± 1.2 kcal mol⁻¹ and ΔfH298[CH₃CH₂OO] = −6.8 ± 2.3 kcal mol⁻¹.
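
As we read it, the "electron affinity/acidity/CBS" cycle combines the bond enthalpy obtained from the acidity/EA measurements with a computed heat of formation of the parent hydroperoxide; schematically, with the standard ΔfH298(H) ≈ 52.1 kcal mol⁻¹,

    \[
    \Delta_f H_{298}(\mathrm{ROO}) \;=\; \Delta_f H_{298}(\mathrm{ROOH}) \;+\; \mathrm{DH_{298}(ROO\!-\!H)} \;-\; \Delta_f H_{298}(\mathrm{H}),
    \]

where DH298(ROO–H) comes from the experiments above and ΔfH298(ROOH) from the CBS/APNO calculations.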

Relevance:

10.00%

Publisher:

Abstract:

The collision-induced dissociation (CID) mass spectra of the [M–H]⁻ anions of methyl, ethyl, and tert-butyl hydroperoxides have been measured over a range of collision energies in a flowing afterglow-selected ion flow tube (FA-SIFT) mass spectrometer. Activation of the CH₃OO⁻ anion is found to give predominantly HO⁻ fragment anions, whilst CH₃CH₂OO⁻ and (CH₃)₃COO⁻ produce HOO⁻ as the major ionic fragment. These results, and other minor fragmentation pathways, can be rationalized in terms of unimolecular rearrangement of the activated anions with subsequent decomposition. The rearrangement reactions occur via initial abstraction of a proton from the α-carbon in the case of CH₃OO⁻, or from the β-carbon for CH₃CH₂OO⁻ and (CH₃)₃COO⁻. Electronic structure calculations suggest that for the CH₃CH₂OO⁻ anion, which can theoretically undergo both α- and β-proton abstraction, the latter pathway, resulting in HOO⁻ + CH₂=CH₂, is energetically preferred.
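
Schematically, the two rearrangement channels described above are as follows; the neutral co-product of the methyl channel is our inference and is not stated in the abstract:

    \[
    \mathrm{CH_3OO^-} \;\xrightarrow{\ \alpha\text{-H transfer}\ }\; {}^{-}\mathrm{CH_2OOH} \;\longrightarrow\; \mathrm{HO^-} + \mathrm{CH_2O},
    \]
    \[
    \mathrm{CH_3CH_2OO^-} \;\xrightarrow{\ \beta\text{-H transfer}\ }\; \mathrm{HOO^-} + \mathrm{CH_2{=}CH_2}.
    \]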

Relevance:

10.00%

Publisher:

Abstract:

Sparse optical flow algorithms, such as the Lucas-Kanade approach, provide more robustness to noise than dense optical flow algorithms and are the preferred approach in many scenarios. Sparse optical flow algorithms estimate the displacement for a selected number of pixels in the image. These pixels can be chosen randomly; however, pixels in regions with more variance between neighbours produce more reliable displacement estimates, so the pixel locations should be chosen wisely. In this study, the suitability of Harris corners, Shi-Tomasi's "Good features to track", SIFT and SURF interest point extractors, Canny edges, and random pixel selection for the purpose of frame-by-frame tracking with a pyramidal Lucas-Kanade algorithm is investigated. The evaluation considers the important factors of processing time, feature count, and feature trackability in indoor and outdoor scenarios, using ground vehicles and unmanned aerial vehicles, with visual odometry estimation as the target application.
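
As a concrete reference point for the comparison described above, the snippet below tracks Shi-Tomasi corners frame-to-frame with OpenCV's pyramidal Lucas-Kanade implementation; the file names and parameter values are placeholders, not the study's settings.

    import cv2
    import numpy as np

    prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
    curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

    # Select well-textured pixels ("Good features to track"), then track them
    pts = cv2.goodFeaturesToTrack(prev, maxCorners=500, qualityLevel=0.01, minDistance=7)
    nxt, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, pts, None,
                                                winSize=(21, 21), maxLevel=3)

    good_old = pts[status.ravel() == 1]
    good_new = nxt[status.ravel() == 1]
    flow = (good_new - good_old).reshape(-1, 2)        # per-feature displacement vectors
    print(len(good_new), "features tracked, median motion:",
          np.median(np.linalg.norm(flow, axis=1)), "px")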

Relevance:

10.00%

Publisher:

Abstract:

Existing crowd counting algorithms rely on holistic, local or histogram-based features to capture crowd properties. Regression is then employed to estimate the crowd size. Insufficient testing across multiple datasets has made it difficult to compare and contrast different methodologies. This paper presents an evaluation across multiple datasets to compare holistic, local and histogram-based methods, and to compare various image features and regression models. A K-fold cross-validation protocol is followed to evaluate the performance across five public datasets: UCSD, PETS 2009, Fudan, Mall and Grand Central. Image features are categorised into five types: size, shape, edges, keypoints and textures. The regression models evaluated are: Gaussian process regression (GPR), linear regression, K-nearest neighbours (KNN) and neural networks (NN). The results demonstrate that local features outperform equivalent holistic and histogram-based features; that optimal performance is observed using all image features except textures; and that GPR outperforms linear, KNN and NN regression.
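
A minimal sketch of the evaluation protocol described above, K-fold cross-validation of the four regression models on per-frame feature vectors, might look as follows; the data is synthetic filler so the sketch runs end to end, and the model settings are assumptions rather than the paper's.

    import numpy as np
    from sklearn.model_selection import KFold, cross_val_score
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.neural_network import MLPRegressor

    # X: per-frame feature vectors (size/shape/edge/keypoint statistics), y: crowd counts.
    rng = np.random.default_rng(0)
    X = rng.random((200, 10))
    y = X.sum(axis=1) * 5 + rng.normal(0, 0.5, 200)

    models = {"GPR": GaussianProcessRegressor(),
              "Linear": LinearRegression(),
              "KNN": KNeighborsRegressor(n_neighbors=5),
              "NN": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000)}

    cv = KFold(n_splits=5, shuffle=True, random_state=0)
    for name, model in models.items():
        mae = -cross_val_score(model, X, y, cv=cv, scoring="neg_mean_absolute_error")
        print(f"{name}: MAE = {mae.mean():.2f} +/- {mae.std():.2f}")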

Relevance:

10.00%

Publisher:

Abstract:

Representation of facial expressions using continuous dimensions has been shown to be inherently more expressive and psychologically meaningful than using categorized emotions, and has thus gained increasing attention over recent years. Many sub-problems have arisen in this new field that remain only partially understood. A comparison of the regression performance of different texture and geometric features, and an investigation of the correlations between continuous dimensional axes and basic categorized emotions, are two such problems. This paper presents empirical studies addressing these problems, and it reports results from an evaluation of different methods for detecting spontaneous facial expressions within the arousal-valence (AV) dimensional space. The evaluation compares the performance of texture features (SIFT, Gabor, LBP) against geometric features (FAP-based distances), and the fusion of the two. It also compares the predictions of arousal and valence, obtained using the best fusion method, with the corresponding ground truths. Spatial distribution, shift, similarity, and correlation are considered for the six basic categorized emotions (i.e. anger, disgust, fear, happiness, sadness, surprise). Using the NVIE database, results show that the fusion of LBP and FAP features performs the best. The results from the NVIE and FEEDTUM databases reveal novel findings about the correlations of the arousal and valence dimensions to each of the six basic emotion categories.
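
A rough sketch of the best-performing combination reported here, LBP texture around fiducial points fused with distance-based geometric features and regressed per dimension, is given below; the fiducial points, patch size and the use of SVR are placeholders rather than the paper's exact configuration.

    import numpy as np
    from itertools import combinations
    from skimage.feature import local_binary_pattern
    from sklearn.svm import SVR

    def lbp_fap_features(gray_face, points, half=16, P=8, R=1.0):
        # Uniform-LBP histograms in patches around fiducial points, fused with
        # inter-point distances (a rough stand-in for FAP distance features).
        codes = local_binary_pattern(gray_face, P, R, method="uniform")
        hists = []
        for (x, y) in points:                       # points are integer pixel coordinates
            x, y = int(x), int(y)
            patch = codes[max(0, y - half):y + half, max(0, x - half):x + half]
            hist, _ = np.histogram(patch, bins=P + 2, range=(0, P + 2), density=True)
            hists.append(hist)
        dists = [np.hypot(x1 - x2, y1 - y2) for (x1, y1), (x2, y2) in combinations(points, 2)]
        return np.concatenate(hists + [dists])

    # Hypothetical usage: one SVR per dimension of the AV space
    # arousal_model = SVR().fit(X_train, arousal_train)
    # valence_model = SVR().fit(X_train, valence_train)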

Relevance:

10.00%

Publisher:

Abstract:

This paper presents a new metric, which we call the lighting variance ratio, for quantifying descriptors in terms of their variance under illumination changes. In many applications it is desirable to have descriptors that are robust to changes in illumination, especially in outdoor environments. The lighting variance ratio is useful for comparing descriptors and for determining whether a descriptor is sufficiently lighting invariant for a given environment. The metric is analysed across a number of datasets, cameras and descriptors. The results show that the upright SIFT descriptor is typically the most lighting invariant descriptor.
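
The abstract does not define the ratio, so purely as an illustration of the kind of quantity involved, one could compare a descriptor's variance across lighting conditions of the same patch with its variance across different patches; lower values would then indicate more lighting invariance. This is a hypothetical reading, not the paper's definition.

    import numpy as np

    def lighting_variance_ratio(descs):
        # Hypothetical illustration only. descs[i, j] is the descriptor of the same
        # physical patch i observed under lighting condition j
        # (shape: patches x conditions x descriptor_dims).
        descs = np.asarray(descs, dtype=float)
        within = descs.var(axis=1).mean()                  # variance across lighting conditions
        across = descs.mean(axis=1).var(axis=0).mean()     # variance across different patches
        return within / across                             # lower = more lighting invariant

    # ratio = lighting_variance_ratio(matched_descriptors)
    # e.g. upright-SIFT descriptors of day/night observations of the same points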

Relevance:

10.00%

Publisher:

Abstract:

BACKGROUND: For engineering graduates to be work-ready with marketable skills, they must not only be well-versed in engineering science and its applications, but also able to adapt to commercial software that is widely used in engineering practice. Hydrological/hydraulic modelling is one aspect of engineering practice which demands the ability to translate fundamentals into design and construction using software. The user manuals for such software are usually tailored for the experienced engineer, not for undergraduates, who are typically novices to modelling concepts and software tools. As the focus of a course such as Advanced Water Engineering is on the wider aspects of the engineering application of hydrological and hydraulic concepts, it is ineffective for lecturers to direct students to user manuals: students have neither the time nor the desire to sift through numerous pages of a manual. An alternative and efficient way to demonstrate the use of the software is to have students develop a model that simulates a real-world scenario using the tools of the software, directing them to make informed decisions based on the outcomes.

PURPOSE: Past experience of the lecturer showed that the resources available to the students left a knowledge gap, leading to numerous student queries outside contact hours. The purpose of this study is to assess how effective purpose-built video resources can be in supplementing traditional learning resources to enhance student learning.

APPROACH: Short animated video clips comprising guided step-by-step instructions were prepared using screen capture software and later edited to focus on specific features using pop-up annotations; vocal narration was purposely excluded to avoid disturbance from noise and to accommodate the different learning paces of individual students. The video clips were made available to the students alongside traditional resources and approaches such as in-class demonstrations, guideline notes, and tips for efficient and error-free procedures. The number of queries the lecturer received from the student cohort outside lecture times was recorded, and an anonymous survey was conducted to assess the usefulness and adequacy of the courseware.

OUTCOMES: While a significant decline in the number of student queries was noted, an overwhelming majority of the survey respondents confirmed the usefulness of the purpose-developed courseware.

CONCLUSIONS/RECOMMENDATIONS/SUMMARY: The survey and the lecturer's experience indicate that animated demonstration video clips illustrating the steps involved in developing hydrologic and hydraulic models and simulating design scenarios are an effective supplement to traditional learning resources. Among the many advantages of the custom-made video clips as a learning resource are that they (1) highlight aspects that are important to undergraduate learning but not available in the software manuals, which are designed for more mature users/learners; (2) provide short, to-the-point communication in a step-by-step manner; (3) allow students the flexibility to self-learn at their own pace; (4) enhance student learning; and (5) save the lecturer time in the long term by avoiding queries of a repetitive nature. It is expected that these newly developed resources will be improved to incorporate students' suggestions before being offered to future cohorts of students. The concept can also be expanded to other relevant courses where animated demonstrations of key modelling steps are beneficial to student learning.

Relevance:

10.00%

Publisher:

Abstract:

This paper describes a vision-only system for place recognition in environments that are traversed at different times of day, when changing conditions drastically affect visual appearance, and at different speeds, where places aren't visited at a consistent linear rate. The major contribution is the removal of wheel-based odometry from the previously presented algorithm (SMART), allowing the technique to operate on any camera-based device; in our case a mobile phone. While we show that the direct application of visual odometry to our night-time datasets does not achieve the level of performance typically needed, the VO requirements of SMART are orthogonal to typical usage: firstly, only the magnitude of the velocity is required, and secondly, the calculated velocity signal only needs to be repeatable in any one part of the environment over day and night cycles, not necessarily globally consistent. Our results show that the smoothing effect of motion constraints is highly beneficial for achieving a locally consistent, lighting-independent velocity estimate. We also show that the advantage of our patch-based technique used previously for frame recognition, surprisingly, does not transfer to VO, where SIFT demonstrates equally good performance. Nevertheless, we present the SMART system using only vision, which performs sequence-based place recognition in extreme low-light conditions where standard 6-DOF VO fails, and which improves place recognition performance over odometry-less benchmarks, approaching that of wheel odometry.
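
Because SMART only needs a repeatable speed magnitude rather than a globally consistent 6-DOF pose, even a very crude SIFT-based proxy illustrates the idea; the sketch below is our simplification, not the SMART implementation, and takes the median displacement of cross-checked SIFT matches between consecutive frames as a per-frame speed signal.

    import cv2
    import numpy as np

    def speed_proxy(prev_gray, curr_gray, fps=25.0):
        # Crude per-frame speed magnitude from SIFT matches, in pixels per second.
        sift = cv2.SIFT_create()
        kp1, d1 = sift.detectAndCompute(prev_gray, None)
        kp2, d2 = sift.detectAndCompute(curr_gray, None)
        if d1 is None or d2 is None:
            return 0.0
        matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(d1, d2)
        if not matches:
            return 0.0
        disp = [np.hypot(kp2[m.trainIdx].pt[0] - kp1[m.queryIdx].pt[0],
                         kp2[m.trainIdx].pt[1] - kp1[m.queryIdx].pt[1]) for m in matches]
        return float(np.median(disp)) * fps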