941 resultados para automatic speech recognition


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Inspection of solder joints has been a critical process in the electronic manufacturing industry to reduce manufacturing cost, improve yield, and ensure product quality and reliability. The solder joint inspection problem is more challenging than many other visual inspections because of the variability in the appearance of solder joints. Although many research works and various techniques have been developed to classify defect in solder joints, these methods have complex systems of illumination for image acquisition and complicated classification algorithms. An important stage of the analysis is to select the right method for the classification. Better inspection technologies are needed to fill the gap between available inspection capabilities and industry systems. This dissertation aims to provide a solution that can overcome some of the limitations of current inspection techniques. This research proposes two inspection steps for automatic solder joint classification system. The “front-end” inspection system includes illumination normalisation, localization and segmentation. The illumination normalisation approach can effectively and efficiently eliminate the effect of uneven illumination while keeping the properties of the processed image. The “back-end” inspection involves the classification of solder joints by using Log Gabor filter and classifier fusion. Five different levels of solder quality with respect to the amount of solder paste have been defined. Log Gabor filter has been demonstrated to achieve high recognition rates and is resistant to misalignment. Further testing demonstrates the advantage of Log Gabor filter over both Discrete Wavelet Transform and Discrete Cosine Transform. Classifier score fusion is analysed for improving recognition rate. Experimental results demonstrate that the proposed system improves performance and robustness in terms of classification rates. This proposed system does not need any special illumination system, and the images are acquired by an ordinary digital camera. In fact, the choice of suitable features allows one to overcome the problem given by the use of non complex illumination systems. The new system proposed in this research can be incorporated in the development of an automated non-contact, non-destructive and low cost solder joint quality inspection system.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Probabilistic robotics, most often applied to the problem of simultaneous localisation and mapping (SLAM), requires measures of uncertainly to accompany observations of the environment. This paper describes how uncertainly can be characterised for a vision system that locates coloured landmark in a typical laboratory environment. The paper describes a model of the uncertainly in segmentation, the internal camera model and the mounting of the camera on the robot. It =plains the implementation of the system on a laboratory robot, and provides experimental results that show the coherence of the uncertainly model,

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The task addressed in this thesis is the automatic alignment of an ensemble of misaligned images in an unsupervised manner. This application is especially useful in computer vision applications where annotations of the shape of an object of interest present in a collection of images is required. Performing this task manually is a slow, tedious, expensive and error prone process which hinders the progress of research laboratories and businesses. Most recently, the unsupervised removal of geometric variation present in a collection of images has been referred to as congealing based on the seminal work of Learned-Miller [21]. The only assumption made in congealing is that the parametric nature of the misalignment is known a priori (e.g. translation, similarity, a�ne, etc) and that the object of interest is guaranteed to be present in each image. The capability to congeal an ensemble of misaligned images stemming from the same object class has numerous applications in object recognition, detection and tracking. This thesis concerns itself with the construction of a congealing algorithm titled, least-squares congealing, which is inspired by the well known image to image alignment algorithm developed by Lucas and Kanade [24]. The algorithm is shown to have superior performance characteristics when compared to previously established methods: canonical congealing by Learned-Miller [21] and stochastic congealing by Z�ollei [39].

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Uncooperative iris identification systems at a distance and on the move often suffer from poor resolution and poor focus of the captured iris images. The lack of pixel resolution and well-focused images significantly degrades the iris recognition performance. This paper proposes a new approach to incorporate the focus score into a reconstruction-based super-resolution process to generate a high resolution iris image from a low resolution and focus inconsistent video sequence of an eye. A reconstruction-based technique, which can incorporate middle and high frequency components from multiple low resolution frames into one desired super-resolved frame without introducing false high frequency components, is used. A new focus assessment approach is proposed for uncooperative iris at a distance and on the move to improve performance for variations in lighting, size and occlusion. A novel fusion scheme is then proposed to incorporate the proposed focus score into the super-resolution process. The experiments conducted on the The Multiple Biometric Grand Challenge portal database shows that our proposed approach achieves an EER of 2.1%, outperforming the existing state-of-the-art averaging signal-level fusion approach by 19.2% and the robust mean super-resolution approach by 8.7%.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we present a microphone array beamforming approach to blind speech separation. Unlike previous beamforming approaches, our system does not require a-priori knowledge of the microphone placement and speaker location, making the system directly comparable other blind source separation methods which require no prior knowledge of recording conditions. Microphone location is automatically estimated using an assumed noise field model, and speaker locations are estimated using cross correlation based methods. The system is evaluated on the data provided for the PASCAL Speech Separation Challenge 2 (SSC2), achieving a word error rate of 58% on the evaluation set.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Type unions, pointer variables and function pointers are a long standing source of subtle security bugs in C program code. Their use can lead to hard-to-diagnose crashes or exploitable vulnerabilities that allow an attacker to attain privileged access over classified data. This paper describes an automatable framework for detecting such weaknesses in C programs statically, where possible, and for generating assertions that will detect them dynamically, in other cases. Exclusively based on analysis of the source code, it identifies required assertions using a type inference system supported by a custom made symbol table. In our preliminary findings, our type system was able to infer the correct type of unions in different scopes, without manual code annotations or rewriting. Whenever an evaluation is not possible or is difficult to resolve, appropriate runtime assertions are formed and inserted into the source code. The approach is demonstrated via a prototype C analysis tool.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This document outlines the system submitted by the Speech and Audio Research Laboratory at the Queensland University of Technology (QUT) for the Speaker Identity Verification: Application task of EVALITA 2009. This competitive submission consisted of a score-level fusion of three component systems; a joint-factor analysis GMM system and two SVM systems using GLDS and GMM supervector kernels. Development evaluation and post-submission results are presented in this study, demonstrating the effectiveness of this fused system approach. This study highlights the challenges associated with system calibration from limited development data and that mismatch between training and testing conditions continues to be a major source of error in speaker verification technology.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In automatic facial expression detection, very accurate registration is desired which can be achieved via a deformable model approach where a dense mesh of 60-70 points on the face is used, such as an active appearance model (AAM). However, for applications where manually labeling frames is prohibitive, AAMs do not work well as they do not generalize well to unseen subjects. As such, a more coarse approach is taken for person-independent facial expression detection, where just a couple of key features (such as face and eyes) are tracked using a Viola-Jones type approach. The tracked image is normally post-processed to encode for shift and illumination invariance using a linear bank of filters. Recently, it was shown that this preprocessing step is of no benefit when close to ideal registration has been obtained. In this paper, we present a system based on the Constrained Local Model (CLM) which is a generic or person-independent face alignment algorithm which gains high accuracy. We show these results against the LBP feature extraction on the CK+ and GEMEP datasets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a method for calculating the in-bucket payload volume on a dragline for the purpose of estimating the material’s bulk density in real-time. Knowledge of the bulk density can provide instant feedback to mine planning and scheduling to improve blasting and in turn provide a more uniform bulk density across the excavation site. Furthermore costs and emissions in dragline operation, maintenance and downstream material processing can be reduced. The main challenge is to determine an accurate position and orientation of the bucket with the constraint of real-time performance. The proposed solution uses a range bearing and tilt sensor to locate and scan the bucket between the lift and dump stages of the dragline cycle. Various scanning strategies are investigated for their benefits in this real-time application. The bucket is segmented from the scene using cluster analysis while the pose of the bucket is calculated using the iterative closest point (ICP) algorithm. Payload points are segmented from the bucket by a fixed distance neighbour clustering method to preserve boundary points and exclude low density clusters introduced by overhead chains and the spreader bar. A height grid is then used to represent the payload from which the volume can be calculated by summing over the grid cells. We show volume calculated on a scaled system with an accuracy of greater than 95 per cent.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a robust place recognition algorithm for mobile robots. The framework proposed combines nonlinear dimensionality reduction, nonlinear regression under noise, and variational Bayesian learning to create consistent probabilistic representations of places from images. These generative models are learnt from a few images and used for multi-class place recognition where classification is computed from a set of feature-vectors. Recognition can be performed in near real-time and accounts for complexity such as changes in illumination, occlusions and blurring. The algorithm was tested with a mobile robot in indoor and outdoor environments with sequences of 1579 and 3820 images respectively. This framework has several potential applications such as map building, autonomous navigation, search-rescue tasks and context recognition.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Two archaeal Holliday junction resolving enzymes, Holliday junction cleavage (Hjc) and Holliday junction endonuclease (Hje), have been characterized. Both are members of a nuclease superfamily that includes the type II restriction enzymes, although their DNA cleaving activity is highly specific for four-way junction structure and not nucleic acid sequence. Despite 28% sequence identity, Hje and Hjc cleave junctions with distinct cutting patterns—they cut different strands of a four-way junction, at different distances from the junction centre. We report the high-resolution crystal structure of Hje from Sulfolobus solfataricus. The structure provides a basis to explain the differences in substrate specificity of Hje and Hjc, which result from changes in dimer organization, and suggests a viral origin for the Hje gene. Structural and biochemical data support the modelling of an Hje:DNA junction complex, highlighting a flexible loop that interacts intimately with the junction centre. A highly conserved serine residue on this loop is shown to be essential for the enzyme's activity, suggesting a novel variation of the nuclease active site. The loop may act as a conformational switch, ensuring that the active site is completed only on binding a four-way junction, thus explaining the exquisite specificity of these enzymes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Segmentation of novel or dynamic objects in a scene, often referred to as background sub- traction or foreground segmentation, is critical for robust high level computer vision applica- tions such as object tracking, object classifca- tion and recognition. However, automatic real- time segmentation for robotics still poses chal- lenges including global illumination changes, shadows, inter-re ections, colour similarity of foreground to background, and cluttered back- grounds. This paper introduces depth cues provided by structure from motion (SFM) for interactive segmentation to alleviate some of these challenges. In this paper, two prevailing interactive segmentation algorithms are com- pared; Lazysnapping [Li et al., 2004] and Grab- cut [Rother et al., 2004], both based on graph- cut optimisation [Boykov and Jolly, 2001]. The algorithms are extended to include depth cues rather than colour only as in the original pa- pers. Results show interactive segmentation based on colour and depth cues enhances the performance of segmentation with a lower er- ror with respect to ground truth.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Occlusion is a big challenge for facial expression recognition (FER) in real-world situations. Previous FER efforts to address occlusion suffer from loss of appearance features and are largely limited to a few occlusion types and single testing strategy. This paper presents a robust approach for FER in occluded images and addresses these issues. A set of Gabor based templates is extracted from images in the gallery using a Monte Carlo algorithm. These templates are converted into distance features using template matching. The resulting feature vectors are robust to occlusion. Occluded eyes and mouth regions and randomly places occlusion patches are used for testing. Two testing strategies analyze the effects of these occlusions on the overall recognition performance as well as each facial expression. Experimental results on the Cohn-Kanade database confirm the high robustness of our approach and provide useful insights about the effects of occlusion on FER. Performance is also compared with previous approaches.