914 results for robust speech recognition
Abstract:
In this paper we present a novel algorithm for localization during navigation that performs matching over local image sequences. Instead of calculating the single location most likely to correspond to a current visual scene, the approach finds candidate matching locations within every section (subroute) of all learned routes. Through this approach, we reduce the demands upon the image processing front-end, requiring it only to pick the best matching image correctly from within a short local image sequence, rather than globally. We applied this algorithm to a challenging downhill mountain biking visual dataset where there was significant perceptual or environmental change between repeated traverses of the environment, and compared its performance to that of the feature-based algorithm FAB-MAP. The results demonstrate the potential for localization using visual sequences, even when there are no visual features that can be reliably detected.
Abstract:
This paper presents a novel technique for performing SLAM along a continuous trajectory of appearance. Derived from components of FastSLAM and FAB-MAP, the new system, dubbed Continuous Appearance-based Trajectory SLAM (CAT-SLAM), augments appearance-based place recognition with particle-filter-based ‘pose filtering’ within a probabilistic framework, without calculating global feature geometry or performing 3D map construction. For loop closure detection, CAT-SLAM updates in constant time regardless of map size. We evaluate the effectiveness of CAT-SLAM on a 16 km outdoor road network and determine its loop closure performance relative to FAB-MAP. CAT-SLAM recognizes 3 times the number of loop closures for the case where no false positives occur, demonstrating its potential use for robust loop closure detection in large environments.
Abstract:
This study investigated the ability of primary school teachers to recognise and refer children with anxiety symptoms. Two hundred and ninety-nine primary school teachers completed a questionnaire exploring their recognition and referral responses to five hypothetical vignettes that described boys and girls with varying severity of anxiety symptoms. Results revealed that teachers were generally able to recognise and make the decision to refer children with severe levels of anxiety. However, they had difficulty distinguishing between children with moderate anxiety symptoms and a severe anxiety disorder. Female teachers were more likely to refer children than were male teachers. The implications and future research are discussed.
Abstract:
Feature extraction and selection are critical processes in developing facial expression recognition (FER) systems. While many algorithms have been proposed for these processes, no direct comparison between texture features, geometric features and their fusion, or between multiple selection algorithms, has been reported for spontaneous FER. This paper addresses this issue by proposing a unified framework for a comparative study on the widely used texture (LBP, Gabor and SIFT) and geometric (FAP) features, using Adaboost, mRMR and SVM feature selection algorithms. Our experiments on the Feedtum and NVIE databases demonstrate the benefits of fusing geometric and texture features, where SIFT+FAP shows the best performance, while mRMR outperforms Adaboost and SVM. In terms of computational time, LBP and Gabor perform better than SIFT. The optimal combination of SIFT+FAP+mRMR also achieves state-of-the-art performance.
Abstract:
The low resolution of images has been one of the major limitations in recognising humans from a distance using their biometric traits, such as face and iris. Super-resolution has been employed to improve the resolution and the recognition performance simultaneously; however, the majority of techniques employed operate in the pixel domain, such that the biometric feature vectors are extracted from a super-resolved input image. Feature-domain super-resolution has been proposed for face and iris, and has been shown to further improve recognition performance by directly super-resolving the features that are used for recognition. However, current feature-domain super-resolution approaches are limited to simple linear features such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), which are not the most discriminant features for biometrics. Gabor-based features have been shown to be among the most discriminant features for biometrics, including face and iris. This paper proposes a framework to conduct super-resolution in the non-linear Gabor feature domain to further improve the recognition performance of biometric systems. Experiments have confirmed the validity of the proposed approach, demonstrating superior performance to existing linear approaches for both face and iris biometrics.
Abstract:
Facial expression is one of the main issues of face recognition in uncontrolled environments. In this paper, we apply the probabilistic linear discriminant analysis (PLDA) method to recognize faces across expressions. Several PLDA approaches are tested and cross-evaluated on the Cohn-Kanade and JAFFE databases. With fewer samples per gallery subject, high recognition rates comparable to previous work have been achieved, indicating the robustness of the approaches. Among the approaches, the mixture of PLDAs has demonstrated better performance. The experimental results also indicate that facial regions around the cheeks, eyes, and eyebrows are more discriminative than regions around the mouth, jaw, chin, and nose.
Abstract:
The paper investigates a detailed Active Shock Control Bump (SCB) design optimisation on a Natural Laminar Flow (NLF) aerofoil, the RAE 5243, to reduce cruise drag at transonic flow conditions using Evolutionary Algorithms (EAs) coupled to a robust design approach. For the uncertainty design parameters, the positions of boundary layer transition (xtr) and the coefficient of lift (Cl) are considered (250 stochastic samples in total). In this paper, two robust design methods are considered: the first approach uses a standard robust design method, which evaluates one design model at 250 stochastic conditions for uncertainty. The second approach is the combination of a standard robust design method and the concept of hierarchical (multi-population) sampling (250, 50, 15) for uncertainty. Numerical results show that the evolutionary optimization method coupled to uncertainty design techniques produces useful and reliable Pareto optimal SCB shapes which have low sensitivity and high aerodynamic performance while achieving significant total drag reduction. In addition, it also shows the benefit of using a hierarchical robust method for detailed uncertainty design optimization.
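The "standard robust design method" in the abstract above, which evaluates one design at many stochastic conditions, can be illustrated with a minimal sketch. It assumes the two robust objectives are the mean and the standard deviation of drag over the samples, a common choice that the abstract does not confirm. The function name `robust_objectives`, the sampling ranges for xtr and Cl, and the toy drag model are all hypothetical placeholders for a real flow solver.

```python
import random
import statistics


def robust_objectives(drag_fn, design, n_samples=250, seed=0):
    """Evaluate one design at n_samples stochastic conditions.

    Returns (mean drag, standard deviation of drag). A low mean indicates
    good performance; a low standard deviation indicates low sensitivity.
    """
    rng = random.Random(seed)
    drags = []
    for _ in range(n_samples):
        # Assumed uncertainty ranges, chosen only for illustration.
        x_tr = rng.uniform(0.0, 0.5)   # boundary layer transition position
        c_l = rng.gauss(0.5, 0.05)     # lift coefficient
        drags.append(drag_fn(design, x_tr, c_l))
    return statistics.mean(drags), statistics.pstdev(drags)


def toy_drag(design, x_tr, c_l):
    """Hypothetical stand-in for a CFD evaluation; not a physical model."""
    bump_height = design
    return 0.01 + 0.1 * (bump_height - 0.3) ** 2 + 0.02 * abs(c_l - 0.5) * (1.0 - x_tr)


mean_drag, drag_spread = robust_objectives(toy_drag, design=0.3)
```

An evolutionary optimiser would then minimise both returned values at once, which yields a Pareto front rather than a single best design. Calling the function with smaller `n_samples` values (for example 50 or 15) gives cheaper, coarser estimates of the kind a hierarchical scheme could use at lower levels.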
Abstract:
Large margin learning approaches, such as support vector machines (SVM), have been successfully applied to numerous classification tasks, especially for automatic facial expression recognition. The risk of such approaches, however, is their sensitivity to large margin losses due to the influence of noisy training examples and outliers, which are a common problem in the area of affective computing (i.e., manual coding at the frame level is tedious, so coarse labels are normally assigned). In this paper, we leverage the relaxation of the parallel-hyperplanes constraint and propose the use of modified correlation filters (MCF). The MCF is similar in spirit to SVMs and correlation filters, but with the key difference of optimizing only a single hyperplane. We demonstrate the superiority of MCF over current techniques in a battery of experiments.
Abstract:
This paper introduces the Weighted Linear Discriminant Analysis (WLDA) technique, based upon the weighted pairwise Fisher criterion, for the purposes of improving i-vector speaker verification in the presence of high intersession variability. By taking advantage of the speaker discriminative information that is available in the distances between pairs of speakers clustered in the development i-vector space, the WLDA technique is shown to provide an improvement in speaker verification performance over traditional Linear Discriminant Analysis (LDA) approaches. A similar approach is also taken to extend the recently developed Source Normalised LDA (SNLDA) into Weighted SNLDA (WSNLDA) which, similarly, shows an improvement in speaker verification performance in both matched and mismatched enrolment/verification conditions. Based upon the results presented within this paper using the NIST 2008 Speaker Recognition Evaluation dataset, we believe that both WLDA and WSNLDA are viable as replacement techniques to improve the performance of LDA and SNLDA-based i-vector speaker verification.
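As a rough illustration of the weighted pairwise Fisher criterion mentioned in the abstract above, the between-speaker scatter can be written as a sum over speaker pairs. The notation is assumed here and not taken from the paper: $S$ speakers, $n_i$ sessions and mean i-vector $m_i$ for speaker $i$, $N = \sum_i n_i$, $d_{ij}$ a distance between speakers $i$ and $j$, and $\omega(\cdot)$ a weighting function whose exact form the abstract does not specify.

```latex
S_b^{w} = \frac{1}{N} \sum_{i=1}^{S-1} \sum_{j=i+1}^{S}
          \omega(d_{ij})\, n_i\, n_j\, (m_i - m_j)(m_i - m_j)^{T}
```

With $\omega \equiv 1$ this reduces to the usual LDA between-class scatter $\sum_{i} n_i (m_i - m)(m_i - m)^{T}$, where $m$ is the global mean, so the weighting only changes how much each speaker pair contributes. A weighting that decreases with distance would emphasise pairs of speakers that lie close together and are therefore easier to confuse.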
Abstract:
The importance of actively managing and analyzing business processes is acknowledged more than ever in organizations nowadays. Business processes form an essential part of an organization and their application areas are manifold. Most organizations keep records of various activities that have been carried out for auditing purposes, but they are rarely used for analysis purposes. This paper describes the design and implementation of a process analysis tool that replays, analyzes and visualizes a variety of performance metrics using a process definition and its execution logs. Performance analysis of existing and planned process models offers organizations a way to detect bottlenecks within their processes and allows them to make more effective process improvement decisions. Our technique is applied to processes modeled in the YAWL language. Execution logs of process instances are compared against the corresponding YAWL process model and replayed in a robust manner, taking into account any noise in the logs. Finally, performance characteristics, obtained from replaying the log in the model, are projected onto the model.
Abstract:
This paper illustrates robust fixed-order power oscillation damper design for mitigating power system oscillations. From an implementation and tuning point of view, such a low-order, fixed structure is common practice for most practical applications, including power systems. However, conventional techniques of optimal and robust control theory cannot handle the fixed-order constraint because it is, in general, impossible to ensure a target closed-loop transfer function with a controller of any given order. This paper deals with the problem of synthesizing a feedback controller of fixed dynamic order for a linear time-invariant plant, both for a fixed plant and for an uncertain family of plants containing parameter uncertainty, so that stability, robust stability and robust performance are attained. The desired closed-loop specifications considered here are given in terms of a target performance vector representing a desired closed-loop design. The performance of the designed controller is validated through non-linear simulations for a range of contingencies.
Abstract:
While researchers strive to improve automatic face recognition performance, the relationship between image resolution and face recognition performance has not received much attention. This relationship is examined systematically and a framework is developed such that results from super-resolution techniques can be compared. Three super-resolution techniques are compared with the Eigenface and Elastic Bunch Graph Matching face recognition engines. Parameter ranges over which these techniques provide better recognition performance than interpolated images are determined.
Abstract:
In this paper we use a sequence-based visual localization algorithm to reveal surprising answers to the question, how much visual information is actually needed to conduct effective navigation? The algorithm actively searches for the best local image matches within a sliding window of short route segments or 'sub-routes', and matches sub-routes by searching for coherent sequences of local image matches. In contrast to many existing techniques, the technique requires no pre-training or camera parameter calibration. We compare the algorithm's performance to the state-of-the-art FAB-MAP 2.0 algorithm on a 70 km benchmark dataset. Performance matches or exceeds that of the state-of-the-art feature-based localization technique using images as small as 4 pixels, fields of view reduced by a factor of 250, and pixel bit depths reduced to 2 bits. We present further results demonstrating the system localizing in an office environment with near 100% precision using two 7-bit Lego light sensors, as well as using 16- and 32-pixel images from a motorbike race and a mountain rally car stage. By demonstrating how little image information is required to achieve localization along a route, we hope to stimulate future 'low fidelity' approaches to visual navigation that complement probabilistic feature-based techniques.
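The idea of matching coherent sequences of very low-fidelity images can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it assumes the query and reference routes are traversed at the same speed (so a match is a straight diagonal through the difference matrix), uses mean absolute difference as the image comparison, and all function names and pixel values are hypothetical.

```python
def quantize(pixels, bits):
    """Reduce 8-bit pixel values to the given bit depth."""
    return [p >> (8 - bits) for p in pixels]


def image_difference(a, b):
    """Mean absolute difference between two equally sized images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def difference_matrix(query, reference):
    """diff[i][j] compares query frame i with reference frame j."""
    return [[image_difference(q, r) for r in reference] for q in query]


def best_sequence_match(diff, length):
    """Find the reference start index whose run of `length` consecutive
    frames best matches the first `length` query frames."""
    n_ref = len(diff[0])
    best_start, best_score = None, float("inf")
    for start in range(n_ref - length + 1):
        score = sum(diff[k][start + k] for k in range(length))
        if score < best_score:
            best_start, best_score = start, score
    return best_start, best_score


# Six reference frames and three noisy query frames, each only 4 pixels.
reference_8bit = [
    [0, 0, 0, 0], [10, 200, 30, 90], [250, 20, 128, 64],
    [40, 40, 220, 10], [5, 180, 90, 240], [130, 60, 15, 200],
]
query_8bit = [[240, 30, 120, 70], [50, 35, 210, 20], [10, 170, 100, 230]]

reference = [quantize(f, 2) for f in reference_8bit]
query = [quantize(f, 2) for f in query_8bit]
match_start, match_score = best_sequence_match(
    difference_matrix(query, reference), length=len(query)
)
```

Summing differences over a run of frames is what lets such tiny images work: a single 4-pixel frame is ambiguous, but a consistent sequence of them is much less so. A fuller version would also search over a range of relative speeds rather than assuming one.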