16 results for Sift

in Queensland University of Technology - ePrints Archive


Relevance: 10.00%

Abstract:

Wide-angle images exhibit significant distortion for which existing scale-space detectors such as the scale-invariant feature transform (SIFT) are inappropriate. The required scale-space images for feature detection are correctly obtained through the convolution of the image, mapped to the sphere, with the spherical Gaussian. A new visual key-point detector, based on this principle, is developed and several computational approaches to the convolution are investigated in both the spatial and frequency domain. In particular, a close approximation is developed that has comparable computation time to conventional SIFT but with improved matching performance. Results are presented for monocular wide-angle outdoor image sequences obtained using fisheye and equiangular catadioptric cameras. We evaluate the overall matching performance (recall versus 1-precision) of these methods compared to conventional SIFT. We also demonstrate the use of the technique for variable frame-rate visual odometry and its application to place recognition.
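The key step is mapping image coordinates to the viewing sphere before building the scale space. A minimal sketch of that mapping, assuming an equidistant fisheye model (r = f·θ) rather than whichever calibration the paper actually uses; all names are illustrative:

```python
import numpy as np

def fisheye_to_sphere(u, v, cx, cy, f):
    """Map pixel coordinates of an equidistant fisheye image (r = f * theta)
    to unit vectors on the viewing sphere. cx, cy: principal point;
    f: focal length in pixels. Hypothetical model, not the paper's calibration."""
    x, y = u - cx, v - cy
    theta = np.hypot(x, y) / f          # angle from the optical axis
    phi = np.arctan2(y, x)              # azimuth about the optical axis
    return np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=-1)
```

The scale-space images would then be obtained by convolving the spherically mapped intensities with the spherical Gaussian; the paper's contribution lies in approximating that convolution efficiently.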

Relevance: 10.00%

Abstract:

This thesis addresses the problem of detecting and describing the same scene points in different wide-angle images taken by the same camera at different viewpoints. This is a core competency of many vision-based localisation tasks, including visual odometry and visual place recognition. Wide-angle cameras have a large field of view that can exceed a full hemisphere, and the images they produce contain severe radial distortion. Compared to traditional narrow-field-of-view perspective cameras, more accurate estimates of camera egomotion can be obtained from wide-angle images. The ability to accurately estimate camera egomotion is a fundamental primitive of visual odometry, and this is one reason for the increasing popularity of wide-angle cameras for this task. Their large field of view also enables them to capture images of the same regions of a scene from very different viewpoints, which makes them well suited to visual place recognition. However, the ability to estimate camera egomotion and to recognise the same scene in two different images depends on the ability to reliably detect and describe the same scene points, or 'keypoints', in the images. Most algorithms used for this purpose are designed almost exclusively for perspective images, and applying them directly to wide-angle images is problematic because no account is taken of the image distortion. The primary contribution of this thesis is the development of two novel keypoint detectors, and a method of keypoint description, designed for wide-angle images. Both reformulate the Scale-Invariant Feature Transform (SIFT) as an image processing operation on the sphere. As the image captured by any central-projection wide-angle camera can be mapped to the sphere, applying these variants to an image on the sphere enables keypoints to be detected in a manner that is invariant to image distortion. Each variant must find the scale-space representation of an image on the sphere, and they differ in the approaches they use to do this. Extensive experiments using real and synthetically generated wide-angle images validate the two new keypoint detectors and the method of keypoint description. The better of the two new keypoint detectors is applied to vision-based localisation tasks, including visual odometry and visual place recognition, using outdoor wide-angle image sequences. As part of this work, the effect of keypoint coordinate selection on the accuracy of egomotion estimates using the Direct Linear Transform (DLT) is investigated, and a simple weighting scheme is proposed that attempts to account for the uncertainty of keypoint positions during detection. A word reliability metric is also developed for use within a visual 'bag of words' approach to place recognition.
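To illustrate the kind of weighting the thesis describes for the DLT, here is a sketch of a weighted eight-point estimate of the fundamental matrix, where each correspondence's DLT row is scaled by a confidence weight. This is a generic illustration of the idea, not the thesis's exact scheme, and it omits the customary coordinate normalisation for brevity:

```python
import numpy as np

def weighted_eight_point(x1, x2, w):
    """Fundamental matrix from weighted correspondences.
    x1, x2: (N, 2) matched keypoint coordinates; w: (N,) confidence weights
    (e.g. inversely related to keypoint position uncertainty)."""
    u1, v1 = x1[:, 0], x1[:, 1]
    u2, v2 = x2[:, 0], x2[:, 1]
    A = np.column_stack([u2*u1, u2*v1, u2, v2*u1, v2*v1, v2,
                         u1, v1, np.ones_like(u1)])
    A *= w[:, None]                      # uncertain keypoints influence the fit less
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)             # null vector of the weighted system
    U, S, Vt = np.linalg.svd(F)          # enforce the rank-2 constraint
    S[2] = 0.0
    return U @ np.diag(S) @ Vt
```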

Relevance: 10.00%

Abstract:

Spontaneous facial expressions differ from posed ones in appearance, timing and accompanying head movements. Still images cannot provide timing or head movement information directly. However, the distances between key points on a face, extracted from a still image using active shape models, can indirectly capture some movement and pose changes. This information is superposed on information about the non-rigid facial movement that is also part of the expression. Does geometric information improve the discrimination between spontaneous and posed facial expressions arising from discrete emotions? We investigate the performance of a machine vision system that uses SIFT appearance-based features and FAP geometric features to discriminate between posed and spontaneous versions of six basic emotions. Experimental results on the NVIE database demonstrate that fusing in geometric information leads to only marginal improvement over appearance features alone. Using the fused features, surprise is the easiest emotion to distinguish (83.4% accuracy), while disgust is the most difficult (76.1%). Our results show that the facial regions that matter for discriminating the posed from the spontaneous version of an emotion differ from those that matter for classifying that emotion against other emotions. The distribution of the selected SIFT features shows that the mouth is more important for sadness and the nose for surprise, while both the nose and mouth are important for disgust, fear, and happiness; eyebrows, eyes, nose and mouth are all important for anger.
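A minimal sketch of the feature-level fusion and classification step, with stand-in random data in place of real SIFT and landmark-distance features (all array shapes and the linear kernel are assumptions, not values from the paper):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_sift = rng.normal(size=(200, 128))   # stand-in SIFT descriptors at facial key points
X_geom = rng.normal(size=(200, 20))    # stand-in inter-keypoint distance features
y = rng.integers(0, 2, size=200)       # 0 = posed, 1 = spontaneous

X = np.hstack([X_sift, X_geom])        # feature-level fusion of appearance and geometry
clf = make_pipeline(StandardScaler(), SVC(kernel="linear")).fit(X, y)
```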

Relevance: 10.00%

Abstract:

Facial expression recognition (FER) algorithms mainly focus on classification into a small discrete set of emotions or on representation of emotions using facial action units (AUs). Dimensional representation of emotions as continuous values in an arousal-valence space is comparatively less investigated. It is not fully known whether fusing geometric and texture features yields a better dimensional representation of spontaneous emotions. Moreover, the performance of many previously proposed approaches to dimensional representation has not been evaluated thoroughly on publicly available databases. To address these limitations, this paper presents an evaluation framework for dimensional representation of spontaneous facial expressions using texture and geometric features. SIFT, Gabor and LBP features are extracted around facial fiducial points and fused with FAP distance features. The correlation-based feature selection (CFS) algorithm is adopted for discriminative texture feature selection. Experimental results on the publicly accessible NVIE database demonstrate that fusing texture and geometry does not perform much better than texture alone, but does give a significant improvement over geometry alone. LBP features perform best when fused with geometric features. Distributions of arousal and valence for different emotions obtained via the feature extraction process are compared with those obtained from subjective ground-truth values assigned by viewers. Predicted valence is found to have a distribution more similar to the ground truth than arousal in terms of covariance or Bhattacharyya distance, but it shows a greater distance between the means.
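The Bhattacharyya distance used to compare the predicted and ground-truth arousal-valence distributions has a closed form when both are modelled as Gaussians. A sketch (the Gaussian modelling assumption is mine; the abstract does not say how the distributions were represented):

```python
import numpy as np

def bhattacharyya_gaussian(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two multivariate Gaussians:
    (1/8)(mu1-mu2)^T S^-1 (mu1-mu2) + (1/2) ln(det S / sqrt(det S1 det S2)),
    with S = (S1 + S2) / 2."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    mean_term = 0.125 * diff @ np.linalg.solve(cov, diff)
    cov_term = 0.5 * np.log(np.linalg.det(cov) /
                            np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return mean_term + cov_term
```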

Relevance: 10.00%

Abstract:

Feature extraction and selection are critical processes in developing facial expression recognition (FER) systems. While many algorithms have been proposed for these processes, no direct comparison between texture, geometry and their fusion, or between multiple selection algorithms, has been reported for spontaneous FER. This paper addresses this issue by proposing a unified framework for a comparative study of the widely used texture (LBP, Gabor and SIFT) and geometric (FAP) features, using the AdaBoost, mRMR and SVM feature selection algorithms. Our experiments on the Feedtum and NVIE databases demonstrate the benefits of fusing geometric and texture features: SIFT+FAP shows the best performance, while mRMR outperforms AdaBoost and SVM. In terms of computational time, LBP and Gabor perform better than SIFT. The optimal combination of SIFT+FAP+mRMR also exhibits state-of-the-art performance.
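mRMR ranks features by their mutual information with the label while penalising mutual information with features already selected. A greedy sketch using scikit-learn's MI estimators, slow but illustrative; the paper's exact mRMR variant may differ:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr_select(X, y, k):
    """Greedy mRMR: maximise MI(feature; label) minus mean MI(feature; selected)."""
    relevance = mutual_info_classif(X, y, random_state=0)
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        scores = np.full(X.shape[1], -np.inf)
        for j in set(range(X.shape[1])) - set(selected):
            redundancy = np.mean([mutual_info_regression(X[:, [s]], X[:, j],
                                                         random_state=0)[0]
                                  for s in selected])
            scores[j] = relevance[j] - redundancy
        selected.append(int(np.argmax(scores)))
    return selected
```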

Relevance: 10.00%

Abstract:

Image representations derived from simplified models of the primary visual cortex (V1), such as HOG and SIFT, elicit good performance in a myriad of visual classification tasks including object recognition/detection, pedestrian detection and facial expression classification. A central question in the vision, learning and neuroscience communities is why these architectures perform so well. In this paper, we offer a unique perspective on this question by subsuming the role of V1-inspired features directly within a linear support vector machine (SVM). We demonstrate that a specific class of such features in conjunction with a linear SVM can be reinterpreted as inducing a weighted margin on the Kronecker basis expansion of an image. This new viewpoint on the role of V1-inspired features allows us to answer fundamental questions on the uniqueness and redundancies of these features, and offers substantial improvements in computational and storage efficiency.
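The core observation can be seen with a few lines of linear algebra: if the V1-inspired features are a linear map F of the image, then a linear SVM's score on the features equals the score of a reweighted linear classifier on the raw pixels. A toy demonstration of that linearity (the paper's Kronecker-basis analysis goes further than this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.normal(size=(256, 64))     # stand-in linear filter bank over a 64-pixel image
x = rng.normal(size=64)            # vectorised image
w = rng.normal(size=256)           # stand-in weights of a linear SVM on filtered features

score_features = w @ (F @ x)       # SVM score computed in feature space
score_pixels = (F.T @ w) @ x       # identical score computed directly on pixels
assert np.isclose(score_features, score_pixels)
```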

Relevance: 10.00%

Abstract:

Robust descriptor matching across varying lighting conditions is important for vision-based robotics. We present a novel strategy for quantifying the lighting variance of descriptors. The strategy uses low-dimensional mappings recovered by Isomap, together with our measure of the lighting variance of each mapping. The resulting metric allows different descriptors to be compared for a given dataset and set of keypoints. We demonstrate that the SIFT descriptor typically has lower lighting variance than other descriptors, although the result depends on semantic class and lighting conditions.
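A sketch of how such a measure might be computed: embed the descriptors of the same physical keypoints observed under several lighting conditions with Isomap, then compare each keypoint's spread across lighting to the overall spread. The data layout and the ratio are my assumptions; the abstract does not specify the exact computation:

```python
import numpy as np
from sklearn.manifold import Isomap

def lighting_variance(descriptors, n_components=2):
    """descriptors: (n_keypoints, n_conditions, d) array holding descriptors of
    the same scene points under different illumination (hypothetical layout)."""
    n_kp, n_light, d = descriptors.shape
    emb = Isomap(n_components=n_components).fit_transform(descriptors.reshape(-1, d))
    emb = emb.reshape(n_kp, n_light, n_components)
    within = emb.var(axis=1).mean()     # spread of one keypoint across lighting
    total = emb.reshape(-1, n_components).var(axis=0).mean()
    return within / total               # lower = more lighting invariant
```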

Relevance: 10.00%

Abstract:

This work aims to contribute to the reliability and integrity of the perceptual systems of unmanned ground vehicles (UGVs). A method is proposed to evaluate the quality of sensor data prior to its use in a perception system, by applying a quality metric to heterogeneous sensor data such as visual and infrared camera images. The concept is illustrated with sensor data that is evaluated prior to its use in a standard SIFT feature extraction and matching technique. The method is then evaluated using experimental data sets collected from a UGV in challenging environmental conditions, represented by the presence of airborne dust and smoke. In the first series of experiments a motionless vehicle observes a 'reference' scene; the method is then extended to the case of a moving vehicle by compensating for its motion. This paper shows that it is possible to anticipate the degradation of a perception algorithm by evaluating the input data prior to any actual execution of the algorithm.
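The gating idea in miniature: score the incoming frame with a cheap quality metric and run SIFT extraction only when the score clears a threshold. The contrast-based score and the threshold below are placeholders; the paper's metric for dust and smoke degradation is more elaborate:

```python
import cv2

def image_quality(gray):
    """Placeholder quality score: global contrast via intensity standard deviation."""
    return float(gray.std())

def guarded_sift(gray, threshold=20.0):
    """Skip feature extraction when the frame looks too degraded to be useful."""
    if image_quality(gray) < threshold:
        return None                     # anticipate SIFT failure; flag the frame
    sift = cv2.SIFT_create()
    return sift.detectAndCompute(gray, None)
```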

Relevance: 10.00%

Abstract:

We present a determination of ΔfH298(HOO) based upon a negative-ion thermodynamic cycle. The photoelectron spectra of HOO⁻ and DOO⁻ were used to measure the molecular electron affinities (EAs). In a separate experiment, a tandem flowing afterglow-selected ion flow tube (FA-SIFT) was used to measure the forward and reverse rate constants for HOO⁻ + HC≡CH ⇌ HOOH + HC≡C⁻ at 298 K, which gave a value for ΔacidH298(HOO-H). The experiments yield the following values: EA(HOO) = 1.078 ± 0.006 eV; T0(X̃ HOO - Ã HOO) = 0.872 ± 0.007 eV; EA(DOO) = 1.077 ± 0.005 eV; T0(X̃ DOO - Ã DOO) = 0.874 ± 0.007 eV; ΔacidG298(HOO-H) = 369.5 ± 0.4 kcal mol⁻¹; and ΔacidH298(HOO-H) = 376.5 ± 0.4 kcal mol⁻¹. The acidity/EA thermochemical cycle yields values for the bond enthalpies of DH298(HOO-H) = 87.8 ± 0.5 kcal mol⁻¹ and D0(HOO-H) = 86.6 ± 0.5 kcal mol⁻¹. We recommend the following values for the heats of formation of the hydroperoxyl radical: ΔfH298(HOO) = 3.2 ± 0.5 kcal mol⁻¹ and ΔfH0(HOO) = 3.9 ± 0.5 kcal mol⁻¹; we recommend that these values supersede those listed in the current NIST-JANAF thermochemical tables.
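The quoted bond enthalpy follows from the negative-ion cycle D(ROO-H) = ΔacidH(ROO-H) + EA(ROO) - IE(H). As a check, using IE(H) = 313.6 kcal mol⁻¹ and 1 eV ≈ 23.06 kcal mol⁻¹ (standard values not quoted in the abstract):

```latex
\begin{aligned}
D_{298}(\mathrm{HOO{-}H}) &= \Delta_{\mathrm{acid}}H_{298}(\mathrm{HOO{-}H})
                             + \mathrm{EA}(\mathrm{HOO}) - \mathrm{IE}(\mathrm{H}) \\
&= 376.5 + (1.078)(23.06) - 313.6 \\
&\approx 376.5 + 24.9 - 313.6 = 87.8\ \mathrm{kcal\,mol^{-1}},
\end{aligned}
```

in agreement with the reported DH298(HOO-H).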

Relevance: 10.00%

Abstract:

Methyl, methyl-d3, and ethyl hydroperoxide anions (CH3OO⁻, CD3OO⁻, and CH3CH2OO⁻) have been prepared by deprotonation of their respective hydroperoxides in a stream of helium buffer gas. Photodetachment with 364 nm (3.408 eV) radiation was used to measure the adiabatic electron affinities: EA[CH3OO, X̃ ²A″] = 1.161 ± 0.005 eV, EA[CD3OO, X̃ ²A″] = 1.154 ± 0.004 eV, and EA[CH3CH2OO, X̃ ²A″] = 1.186 ± 0.004 eV. The photoelectron spectra yield values for the term energies: ΔE(X̃ ²A″ - Ã ²A′)[CH3OO] = 0.914 ± 0.005 eV, ΔE(X̃ ²A″ - Ã ²A′)[CD3OO] = 0.913 ± 0.004 eV, and ΔE(X̃ ²A″ - Ã ²A′)[CH3CH2OO] = 0.938 ± 0.004 eV. A localized RO-O stretching mode was observed near 1100 cm⁻¹ for the ground state of all three radicals, and low-frequency R-O-O bending modes are also reported. Proton-transfer kinetics of the hydroperoxides have been measured in a tandem flowing afterglow-selected ion flow tube (FA-SIFT) to determine the gas-phase acidity of the parent hydroperoxides: ΔacidG298(CH3OOH) = 367.6 ± 0.7 kcal mol⁻¹, ΔacidG298(CD3OOH) = 367.9 ± 0.9 kcal mol⁻¹, and ΔacidG298(CH3CH2OOH) = 363.9 ± 2.0 kcal mol⁻¹. From these acidities we have derived the enthalpies of deprotonation: ΔacidH298(CH3OOH) = 374.6 ± 1.0 kcal mol⁻¹, ΔacidH298(CD3OOH) = 374.9 ± 1.1 kcal mol⁻¹, and ΔacidH298(CH3CH2OOH) = 371.0 ± 2.2 kcal mol⁻¹. Use of the negative-ion acidity/EA cycle provides the ROO-H bond enthalpies: DH298(CH3OO-H) = 87.8 ± 1.0 kcal mol⁻¹, DH298(CD3OO-H) = 87.9 ± 1.1 kcal mol⁻¹, and DH298(CH3CH2OO-H) = 84.8 ± 2.2 kcal mol⁻¹. We review the thermochemistry of the peroxyl radicals CH3OO and CH3CH2OO. Using the experimental bond enthalpies DH298(ROO-H) and CBS/APNO ab initio electronic structure calculations for the energies of the corresponding hydroperoxides, we derive the heats of formation of the peroxyl radicals. The "electron affinity/acidity/CBS" cycle yields ΔfH298[CH3OO] = 4.8 ± 1.2 kcal mol⁻¹ and ΔfH298[CH3CH2OO] = -6.8 ± 2.3 kcal mol⁻¹.

Relevance: 10.00%

Abstract:

The collision-induced dissociation (CID) mass spectra of the [M-H]⁻ anions of methyl, ethyl, and tert-butyl hydroperoxides have been measured over a range of collision energies in a flowing afterglow-selected ion flow tube (FA-SIFT) mass spectrometer. Activation of the CH3OO⁻ anion is found to give predominantly HO⁻ fragment anions, whilst CH3CH2OO⁻ and (CH3)3COO⁻ produce HOO⁻ as the major ionic fragment. These results, and other minor fragmentation pathways, can be rationalized in terms of unimolecular rearrangement of the activated anions with subsequent decomposition. The rearrangement reactions occur via initial abstraction of a proton from the α-carbon in the case of CH3OO⁻, or the β-carbon for CH3CH2OO⁻ and (CH3)3COO⁻. Electronic structure calculations suggest that for the CH3CH2OO⁻ anion, which can theoretically undergo both α- and β-proton abstraction, the latter pathway, resulting in HOO⁻ + CH2=CH2, is energetically preferred.

Relevance: 10.00%

Abstract:

Sparse optical flow algorithms, such as the Lucas-Kanade approach, provide more robustness to noise than dense optical flow algorithms and are the preferred approach in many scenarios. Sparse optical flow algorithms estimate the displacement for a selected number of pixels in the image. These pixels can be chosen randomly, but pixels in regions with more variance between neighbours produce more reliable displacement estimates, so the pixel locations should be chosen wisely. In this study, the suitability of Harris corners, Shi-Tomasi's “Good features to track”, SIFT and SURF interest point extractors, Canny edges, and random pixel selection for frame-by-frame tracking with a pyramidal Lucas-Kanade algorithm is investigated. The evaluation considers the important factors of processing time, feature count, and feature trackability in indoor and outdoor scenarios, using ground vehicles and unmanned aerial vehicles, for the purpose of visual odometry estimation.
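A minimal sketch of one of the evaluated pairings, Shi-Tomasi corners fed to pyramidal Lucas-Kanade, using OpenCV (file names and parameter values are placeholders, not the study's settings):

```python
import cv2

prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)   # hypothetical frame pair
curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# Shi-Tomasi "good features to track": pixels with strong local gradient variation
pts = cv2.goodFeaturesToTrack(prev, maxCorners=500, qualityLevel=0.01, minDistance=7)

# pyramidal Lucas-Kanade: sparse displacement estimates for the selected pixels
next_pts, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, pts, None,
                                                 winSize=(21, 21), maxLevel=3)
tracked = next_pts[status.ravel() == 1]                  # keep successfully tracked points
```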

Relevance: 10.00%

Abstract:

Representation of facial expressions using continuous dimensions has been shown to be inherently more expressive and psychologically meaningful than using categorized emotions, and has thus gained increasing attention over recent years. Many sub-problems have arisen in this new field and remain only partially understood; two of these are comparing the regression performance of different texture and geometric features, and investigating the correlations between the continuous dimensional axes and the basic categorized emotions. This paper presents empirical studies addressing these problems, and reports results from an evaluation of different methods for detecting spontaneous facial expressions within the arousal-valence (AV) dimensional space. The evaluation compares the performance of texture features (SIFT, Gabor, LBP) against geometric features (FAP-based distances), and the fusion of the two. It also compares the predictions of arousal and valence, obtained using the best fusion method, to the corresponding ground truths. Spatial distribution, shift, similarity, and correlation are considered for the six basic categorized emotions (i.e. anger, disgust, fear, happiness, sadness, surprise). Using the NVIE database, results show that the fusion of LBP and FAP features performs best. The results from the NVIE and FEEDTUM databases reveal novel findings about the correlations of the arousal and valence dimensions to each of the six basic emotion categories.
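A sketch of the kind of dimensional-regression setup being compared, with random stand-ins for the fused LBP+FAP features and the viewer-assigned valence ratings; the feature dimensions and the SVR regressor are my assumptions, not the paper's configuration:

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 148))        # stand-in fused LBP + FAP feature vectors
valence = rng.uniform(-1, 1, 300)      # stand-in ground-truth valence ratings

reg = SVR(kernel="rbf").fit(X[:200], valence[:200])
pred = reg.predict(X[200:])
r, _ = pearsonr(pred, valence[200:])   # correlation of predictions with ground truth
```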

Relevance: 10.00%

Abstract:

This paper presents a new metric, which we call the lighting variance ratio, for quantifying descriptors in terms of their variance under illumination changes. In many applications it is desirable to have descriptors that are robust to changes in illumination, especially in outdoor environments. The lighting variance ratio is useful for comparing descriptors and for determining whether a descriptor is sufficiently lighting invariant for a given environment. The metric is analysed across a number of datasets, cameras and descriptors. The results show that the upright SIFT descriptor is typically the most lighting-invariant descriptor.
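In the spirit of the metric's name, a minimal ratio one could compute: the spread of a keypoint's descriptor across lighting conditions relative to the spread across all keypoints, shown here without the manifold embedding used in the related work above. The exact definition in the paper may differ:

```python
import numpy as np

def lighting_variance_ratio(desc):
    """desc: (n_keypoints, n_conditions, d) descriptors of the same scene points
    under different illumination (hypothetical data layout)."""
    within = desc.var(axis=1).mean()                    # same point, varying light
    between = desc.reshape(-1, desc.shape[-1]).var(axis=0).mean()
    return within / between                             # lower = more invariant
```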

Relevance: 10.00%

Abstract:

BACKGROUND: For engineering graduates to be work-ready with marketable skills, they must not only be well versed in engineering science and its applications but also able to adapt to the commercial software widely used in engineering practice. Hydrological/hydraulic modelling is one aspect of engineering practice that demands the ability to carry fundamentals through to design and construction using software. The user manuals for such software are usually tailored to the experienced engineer, not to undergraduates, who are typically novices in modelling concepts and software tools. As the focus of a course such as Advanced Water Engineering is on the wider aspects of the engineering application of hydrological and hydraulic concepts, it is ineffective for lecturers to direct students to user manuals: students have neither the time nor the desire to sift through the numerous pages of a manual. An alternative and efficient way to demonstrate the use of the software is to enable students to develop a model that simulates a real-world scenario using the tools of the software, directing them to make informed decisions based on the outcomes.

PURPOSE: The lecturer's past experience showed that the available resources left a knowledge gap, leading to numerous student queries outside contact hours. The purpose of this study is to assess how effectively purpose-built video resources can supplement traditional learning resources to enhance student learning.

APPROACH: Short animated video clips comprising guided step-by-step instructions were prepared with screen-capture software and later edited to focus on specific features using pop-up annotations. Vocal narration was purposely excluded to avoid disturbance due to noise and to accommodate the different learning paces of individual students. The video clips were made available to the students alongside traditional resources and approaches such as in-class demonstrations, guideline notes, and tips for efficient and error-free procedures. The number of queries the lecturer received from the student cohort outside lecture times was recorded, and an anonymous survey was conducted to assess the usefulness and adequacy of the courseware.

OUTCOMES: A significant decline in the number of student queries was noted, and an overwhelming majority of the survey respondents confirmed the usefulness of the purpose-developed courseware.

CONCLUSIONS/RECOMMENDATIONS/SUMMARY: The survey and the lecturer's experience indicated that animated demonstration video clips illustrating the steps involved in developing hydrologic and hydraulic models and simulating design scenarios are an effective supplement to traditional learning resources. Among the many advantages of the custom-made video clips as a learning resource are that they (1) highlight aspects that are important to undergraduate learning but absent from the software manuals, which are designed for more mature users; (2) provide short, to-the-point communication in a step-by-step manner; (3) give students the flexibility to self-learn at their own pace; (4) enhance student learning; and (5) save the lecturer time in the long term by avoiding queries of a repetitive nature. It is expected that these newly developed resources will be improved to incorporate students' suggestions before being offered to future cohorts. The concept can also be extended to other relevant courses where animated demonstrations of key modelling steps would benefit student learning.