914 results for robust speech recognition
Abstract:
Human facial expression is a complex process characterized by dynamic, subtle and regional emotional features. State-of-the-art approaches to facial expression recognition (FER) have not fully utilized these features to improve recognition performance. This paper proposes an approach to overcome this limitation using patch-based ‘salient’ Gabor features. A set of 3D patches is extracted to represent the subtle and regional features, and then input into patch matching operations to capture the dynamic features. Experimental results show a significant performance improvement of the proposed approach due to the use of the dynamic features. Performance comparison with previous work also confirms that the proposed approach achieves the highest CRR reported to date on the JAFFE database and a top-level performance on the Cohn-Kanade (CK) database.
Abstract:
The present article, which is abstracted from a larger study into the acquisition and exercise of nephrology nursing expertise, aims to explore the concept of recognition of expertise. The study used grounded theory methodology and involved 17 registered nurses who were practising in a metropolitan renal unit in New South Wales, Australia. Concurrent data collection and analysis was undertaken, incorporating participant observations and interviews. According to nurses in this study, patients, doctors and other nurses recognized that some nurses were experts while others were not. In addition, being trusted, being a role model and teaching others were important components of being recognized as an expert nephrology nurse. Of importance for nursing, the results of the present study indicate that knowledge and experience are not sufficient to ensure expert practice; recognition of expertise by others is an important function of expertise acquisition.
Abstract:
The use of adaptive wing/aerofoil designs is being considered, as they are promising techniques in aeronautics/aerospace: they can reduce aircraft emissions and improve the aerodynamic performance of manned or unmanned aircraft. This paper investigates the robust design and optimization of one type of adaptive technique: an active flow control bump at transonic flow conditions on a natural laminar flow aerofoil. The concept of the shock control bump is to control supersonic flow on the suction/pressure side of a natural laminar flow aerofoil, which leads to delaying shock occurrence (weakening its strength) or boundary-layer separation. Such an active flow control technique reduces total drag at transonic speeds due to reduction of wave drag. The location of boundary-layer transition can influence the position and structure of the supersonic shock on the suction/pressure side of the aerofoil. The boundary-layer transition position is treated as an uncertain design parameter in aerodynamic design due to many factors, such as surface contamination or surface erosion. This paper studies shock-control-bump shape design optimization using robust evolutionary algorithms with uncertainty in boundary-layer transition locations. The optimization method is based on a canonical evolution strategy and incorporates the concepts of hierarchical topology, parallel computing, and asynchronous evaluation. Two test cases are conducted: the first test assumes the boundary-layer transition position is at 45% of chord from the leading edge, and the second test considers robust design optimization for the shock control bump under variability of boundary-layer transition positions. The numerical results show that the optimization method coupled to uncertainty design techniques produces Pareto optimal shock-control-bump shapes, which have low sensitivity and high aerodynamic performance while having significant total drag reduction.
Abstract:
We consider the problem of how to construct robust designs for Poisson regression models. An analytical expression is derived for robust designs for first-order Poisson regression models where uncertainty exists in the prior parameter estimates. Given certain constraints in the methodology, it may be necessary to extend the robust designs for implementation in practical experiments. With these extensions, our methodology constructs designs which perform similarly, in terms of estimation, to current techniques, and offers the solution in a more timely manner. We further apply this analytic result to cases where uncertainty exists in the linear predictor. The application of this methodology to practical design problems such as screening experiments is explored. Given the minimal prior knowledge that is usually available when conducting such experiments, it is recommended to derive designs robust across a variety of systems. However, incorporating such uncertainty into the design process can be a computationally intense exercise. Hence, our analytic approach is explored as an alternative.
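To make the design problem concrete, here is a minimal numerical sketch. It is not the analytic expression derived in the paper. It assumes a single covariate x restricted to [-10, 0], a log link, a two-point design with equal weights and one support point fixed at x = 0, and a small discrete prior on the slope. Robustness is approximated by averaging the log-determinant of the Fisher information over that prior, which is one common criterion and an assumption here. All function names are hypothetical.

```python
import math

def log_det_information(x1, x2, beta0, beta1):
    """Log-determinant of the Fisher information of a two-point, equal-weight
    design under the Poisson model log(mu) = beta0 + beta1 * x.

    For weights 1/2 and 1/2, det(M) = (1/4) * mu1 * mu2 * (x1 - x2)^2.
    """
    log_mu1 = beta0 + beta1 * x1
    log_mu2 = beta0 + beta1 * x2
    return math.log(0.25) + log_mu1 + log_mu2 + 2.0 * math.log(abs(x1 - x2))

def robust_second_point(prior_slopes, beta0=0.0, x1=0.0, lower=-10.0, step=0.001):
    """Grid search for the second support point that maximises the
    prior-averaged log-determinant (first point fixed at x1)."""
    best_x, best_value = None, -math.inf
    n_steps = int(round((x1 - lower) / step))
    for i in range(1, n_steps + 1):
        x2 = x1 - i * step
        value = sum(log_det_information(x1, x2, beta0, b)
                    for b in prior_slopes) / len(prior_slopes)
        if value > best_value:
            best_x, best_value = x2, value
    return best_x

# A single prior value gives a locally optimal design; several values
# give a design that hedges across the assumed slopes.
local_point = robust_second_point([1.0])
robust_point = robust_second_point([0.5, 1.0, 2.0])
```

Under these assumptions the averaged criterion reduces to E[beta1] * x2 + 2 * log|x2| plus a constant, so the search should land near x2 = -2 / E[beta1]. A grid search like this is the kind of computation that an analytic result would avoid.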
Abstract:
Facial expression recognition (FER) algorithms mainly focus on classification into a small discrete set of emotions or representation of emotions using facial action units (AUs). Dimensional representation of emotions as continuous values in an arousal-valence space is relatively less investigated. It is not fully known whether fusion of geometric and texture features will result in better dimensional representation of spontaneous emotions. Moreover, the performance of many previously proposed approaches to dimensional representation has not been evaluated thoroughly on publicly available databases. To address these limitations, this paper presents an evaluation framework for dimensional representation of spontaneous facial expressions using texture and geometric features. SIFT, Gabor and LBP features are extracted around facial fiducial points and fused with FAP distance features. The CFS algorithm is adopted for discriminative texture feature selection. Experimental results evaluated on the publicly accessible NVIE database demonstrate that fusion of texture and geometry does not lead to a much better performance than using texture alone, but does result in a significant performance improvement over geometry alone. LBP features perform the best when fused with geometric features. Distributions of arousal and valence for different emotions obtained via the feature extraction process are compared with those obtained from subjective ground truth values assigned by viewers. Predicted valence is found to have a more similar distribution to ground truth than arousal in terms of covariance or Bhattacharyya distance, but it shows a greater distance between the means.
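As an illustration of the distribution comparison mentioned above, the sketch below computes the Bhattacharyya distance between two univariate Gaussians fitted to samples. Treating the arousal or valence values as Gaussian is an assumption made here for simplicity, and the abstract does not say which form the authors used. The function names are hypothetical.

```python
import math

def gaussian_fit(samples):
    """Maximum-likelihood mean and variance of a list of numbers."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, var

def bhattacharyya_gaussian(mean1, var1, mean2, var2):
    """Bhattacharyya distance between N(mean1, var1) and N(mean2, var2)."""
    # Term driven by the separation of the means.
    mean_term = 0.25 * (mean1 - mean2) ** 2 / (var1 + var2)
    # Term driven by the mismatch of the variances (zero when they agree).
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term

# Hypothetical predicted and ground-truth valence values.
predicted = [0.1, 0.3, 0.2, 0.4]
ground_truth = [0.2, 0.5, 0.3, 0.6]
distance = bhattacharyya_gaussian(*gaussian_fit(predicted), *gaussian_fit(ground_truth))
```

The two terms separate the effect of the means from the effect of the spreads, which matches the abstract's observation that a distribution can be close in shape while its mean is further away.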
Abstract:
Objective: To examine whether health professionals who commonly deal with mental disorder are able to identify co-occurring alcohol misuse in young people presenting with depression. Method: Between September 2006 and January 2007, a survey examining beliefs regarding appropriate interventions for mental disorder in youth was sent to 1710 psychiatrists, 2000 general practitioners (GPs), 1628 mental health nurses, and 2000 psychologists in Australia. Participants within each professional group were randomly given one of four vignettes describing a young person with a DSM-IV mental disorder. Data from the depression and depression with alcohol misuse vignettes are reported here. Results: A total of 305 psychiatrists, 258 GPs, 292 mental health nurses and 375 psychologists completed one of the depression vignettes. A diagnosis of mood disorder was identified by at least 83.8% of professionals, with no significant differences noted between professional groups. Rates of reported co-occurring substance use disorders were substantially lower, particularly among older professionals and psychologists. Conclusions: GPs, psychologists and mental health professionals do not readily identify co-occurring alcohol misuse in young people with depression. Given the substantially negative impact of co-occurring disorders, it is imperative that health-care professionals are appropriately trained to detect such disorders promptly, to ensure young people have access to effective, early intervention.
Abstract:
A new algorithm for extracting features from images for object recognition is described. The algorithm uses higher order spectra to provide desirable invariance properties, to provide noise immunity, and to incorporate nonlinearity into the feature extraction procedure, thereby allowing the use of simple classifiers. An image can be reduced to a set of 1D functions via the Radon transform, or alternatively, the Fourier transform of each 1D projection can be obtained from a radial slice of the 2D Fourier transform of the image according to the Fourier slice theorem. A triple product of Fourier coefficients, referred to as the deterministic bispectrum, is computed for each 1D function and is integrated along radial lines in bifrequency space. Phases of the integrated bispectra are shown to be translation- and scale-invariant. Rotation invariance is achieved by a regrouping of these invariants at a constant radius followed by a second stage of invariant extraction. Rotation invariance is thus converted to translation invariance in the second step. Results using synthetic and actual images show that isolated, compact clusters are formed in feature space. These clusters are linearly separable, indicating that the nonlinearity required in the mapping from the input space to the classification space is incorporated well into the feature extraction stage. The use of higher order spectra results in good noise immunity, as verified with synthetic and real images. Classification of images using the higher order spectra-based algorithm compares favorably to classification using the method of moment invariants.
Abstract:
An approach to pattern recognition using invariant parameters based on higher-order spectra is presented. In particular, bispectral invariants are used to classify one-dimensional shapes. The bispectrum, which is translation invariant, is integrated along straight lines passing through the origin in bifrequency space. The phase of the integrated bispectrum is shown to be scale- and amplification-invariant. A minimal set of these invariants is selected as the feature vector for pattern classification. Pattern recognition using higher-order spectral invariants is fast, suited for parallel implementation, and works for signals corrupted by Gaussian noise. The classification technique is shown to distinguish two similar but different bolts given their one-dimensional profiles.
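The invariance properties described above can be checked on a toy signal. The sketch below is a simplified discrete illustration, not the authors' implementation. It forms the deterministic bispectrum as the triple product X(k1) X(k2) X*(k1 + k2) of DFT coefficients, sums it along the line k2 = a * k1 for an integer slope a (an assumption that avoids interpolation), and returns the phase of the sum. The function names are hypothetical.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform, adequate for short profiles."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def integrated_bispectrum_phase(x, a=1):
    """Phase of the bispectrum summed along the line k2 = a * k1."""
    n = len(x)
    spectrum = dft(x)
    total = 0j
    k = 1
    # Keep k1 + k2 at or below the Nyquist index n // 2.
    while k * (1 + a) <= n // 2:
        total += spectrum[k] * spectrum[a * k] * spectrum[k * (1 + a)].conjugate()
        k += 1
    return cmath.phase(total)

profile = [0.0, 1.0, 3.0, 2.0, 1.0, 0.0, 0.0, 0.0]
shifted = profile[-2:] + profile[:-2]        # circular translation by 2 samples
amplified = [2.5 * v for v in profile]       # amplitude scaling

phase_original = integrated_bispectrum_phase(profile)
phase_shifted = integrated_bispectrum_phase(shifted)
phase_amplified = integrated_bispectrum_phase(amplified)
```

A circular shift multiplies X(k) by a linear phase that cancels in the triple product, and a positive gain only scales the magnitude of the sum, so the three phases should agree up to rounding. Scale invariance is not demonstrated here because it requires resampling the signal.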
Abstract:
Gaussian mixture models (GMMs) have become an established means of modeling feature distributions in speaker recognition systems. It is useful for experimentation and practical implementation purposes to develop and test these models in an efficient manner, particularly when computational resources are limited. A method of combining vector quantization (VQ) with single multi-dimensional Gaussians is proposed to rapidly generate a robust model approximation to the Gaussian mixture model. A fast method of testing these systems is also proposed and implemented. Results on the NIST 1996 Speaker Recognition Database suggest comparable and in some cases improved verification performance relative to the traditional GMM-based analysis scheme. In addition, previous research for the task of speaker identification indicated similar system performance between the VQ Gaussian-based technique and GMMs.
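One way to read the VQ-plus-Gaussian idea is sketched below: partition the feature vectors with k-means, then fit a single Gaussian to each partition and weight it by the fraction of vectors it holds. This is an illustrative sketch, not the paper's procedure. Diagonal covariances, the deterministic centroid initialisation, the variance floor and all function names are assumptions.

```python
import math

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; centroids start from evenly spaced samples."""
    step = max(1, len(points) // k)
    centroids = [list(points[i * step]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def fit_vq_gaussians(points, k, var_floor=1e-6):
    """One diagonal Gaussian per VQ cell, returned as (weight, mean, variance)."""
    labels = kmeans(points, k)
    model = []
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        if not members:
            continue
        mean = [sum(col) / len(members) for col in zip(*members)]
        var = [max(sum((v - m) ** 2 for v in col) / len(members), var_floor)
               for col, m in zip(zip(*members), mean)]
        model.append((len(members) / len(points), mean, var))
    return model

def log_likelihood(model, x):
    """Log of the weighted sum of component densities at x (log-sum-exp)."""
    terms = []
    for weight, mean, var in model:
        ll = math.log(weight)
        for xi, m, v in zip(x, mean, var):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        terms.append(ll)
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

features = [(0.0, 0.0), (0.1, 0.2), (-0.1, 0.1), (0.2, -0.1),
            (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1)]
model = fit_vq_gaussians(features, k=2)
```

Unlike expectation-maximisation training of a GMM, each vector contributes to exactly one component here, which is what makes this kind of approximation cheap to build.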
Abstract:
Characteristics of surveillance video generally include low resolution and poor quality due to environmental, storage and processing limitations. It is extremely difficult for computers and human operators to identify individuals from these videos. To overcome this problem, super-resolution can be used in conjunction with an automated face recognition system to enhance the spatial resolution of video frames containing the subject and narrow down the number of manual verifications performed by the human operator by presenting a list of most likely candidates from the database. As the super-resolution reconstruction process is ill-posed, visual artifacts are often generated as a result. These artifacts can be visually distracting to humans and/or affect machine recognition algorithms. While it is intuitive that higher resolution should lead to improved recognition accuracy, the effects of super-resolution and such artifacts on face recognition performance have not been systematically studied. This paper aims to address this gap while illustrating that super-resolution allows more accurate identification of individuals from low-resolution surveillance footage. The proposed optical flow-based super-resolution method is benchmarked against Baker et al.’s hallucination and Schultz et al.’s super-resolution techniques on images from the Terrascope and XM2VTS databases. Ground truth and interpolated images were also tested to provide a baseline for comparison. Results show that a suitable super-resolution system can improve the discriminability of surveillance video and enhance face recognition accuracy. The experiments also show that Schultz et al.’s method fails when dealing with surveillance footage due to its assumption of rigid objects in the scene. The hallucination and optical flow-based methods performed comparably, with the optical flow-based method producing fewer visually distracting artifacts that interfered with human recognition.
Abstract:
This paper argues that teachers’ recognition of children’s cultural practices is an important positive step in helping socio-economically disadvantaged children engage with school literacies. Based on twenty-one longitudinal case studies of children’s literacy development over a three-year period, the authors demonstrate that when children’s knowledges and practices assembled in home and community spheres are treated as valuable material for school learning, children are more likely to invest in the work of acquiring school literacies. However, they also show that whilst some children benefit greatly from being allowed to draw on their knowledge of popular culture, sports and the outdoors, other children’s interests may be ignored or excluded. Some differences in teachers’ valuing of home and community cultures appeared to relate to gender dimensions.
Abstract:
In automatic facial expression recognition, an increasing number of techniques have been proposed in the literature that exploit the temporal nature of facial expressions. As all facial expressions are known to evolve over time, it is crucially important for a classifier to be capable of modelling their dynamics. We establish that the method of sparse representation (SR) classifiers proves to be a suitable candidate for this purpose, and subsequently propose a framework for expression dynamics to be efficiently incorporated into its current formulation. We additionally show that for the SR method to be applied effectively, a certain threshold on image dimensionality must be enforced (unlike in facial recognition problems). Thirdly, we determine that recognition rates may be significantly influenced by the size of the projection matrix Φ. To demonstrate these findings, a battery of experiments was conducted on the CK+ dataset for the recognition of the seven prototypic expressions - anger, contempt, disgust, fear, happiness, sadness and surprise - and comparisons were made between the proposed temporal-SR framework and both the static-SR framework and a state-of-the-art support vector machine.
Abstract:
Gait recognition approaches continue to struggle with challenges including view-invariance, low-resolution data, robustness to unconstrained environments, and fluctuating gait patterns due to subjects carrying goods or wearing different clothes. Although computationally expensive, model-based techniques offer promise over appearance-based techniques for these challenges as they gather gait features and interpret gait dynamics in skeleton form. In this paper, we propose a fast 3D ellipsoidal-based gait recognition algorithm using a 3D voxel model derived from multi-view silhouette images. This approach directly addresses the limitations of view dependency and self-occlusion in existing ellipse fitting model-based approaches. Voxel models are segmented into four components (left and right legs, above and below the knee), and ellipsoids are fitted to each region using eigenvalue decomposition. Features derived from the ellipsoid parameters are modeled using a Fourier representation to retain the temporal dynamic pattern for classification. We demonstrate the proposed approach using the CMU MoBo database and show that an improvement of 15-20% can be achieved over a 2D ellipse fitting baseline.
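Fitting an ellipsoid by eigenvalue decomposition can be sketched as follows. This assumes the fit is taken from the covariance matrix of the voxel centres in one segment, which the abstract does not state explicitly. A small Jacobi rotation routine stands in for a library eigen-solver so the example needs only the standard library. The function names are hypothetical, and the factor sqrt(5 * variance) for the semi-axis length holds only for a uniformly filled solid ellipsoid.

```python
import math

def covariance(points):
    """Mean and 3x3 covariance matrix of a list of (x, y, z) voxel centres."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
            for j in range(3)] for i in range(3)]
    return mean, cov

def jacobi_eigen(matrix, sweeps=50):
    """Eigenvalues and eigenvectors (as columns) of a small symmetric matrix."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4, a[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # column update: A <- A J, V <- V J
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):  # row update: A <- J^T A
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    return [a[i][i] for i in range(n)], v

def fit_ellipsoid(points):
    """Centre, principal axes, semi-axis lengths and variances, largest first."""
    mean, cov = covariance(points)
    values, vectors = jacobi_eigen(cov)
    order = sorted(range(3), key=lambda i: values[i], reverse=True)
    axes = [[vectors[r][i] for r in range(3)] for i in order]
    variances = [values[i] for i in order]
    lengths = [math.sqrt(5.0 * max(val, 0.0)) for val in variances]
    return mean, axes, lengths, variances
```

The axis directions and lengths returned by `fit_ellipsoid` are the kind of per-frame parameters that could then be tracked over a gait cycle.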