914 resultados para robust speech recognition
Resumo:
This paper demonstrates the application of a robust form of pose estimation and scene reconstruction using data from camera images. We demonstrate results that suggest the ability of the algorithm to rival methods of RANSAC based pose estimation polished by bundle adjustment in terms of solution robustness, speed and accuracy, even when given poor initialisations. Our simulated results show the behaviour of the algorithm in a number of novel simulated scenarios reflective of real world cases that show the ability of the algorithm to handle large observation noise and difficult reconstruction scenes. These results have a number of implications for the vision and robotics community, and show that the application of visual motion estimation on robotic platforms in an online fashion is approaching real-world feasibility.
Resumo:
A special transmit polarization signalling scheme is presented to alleviate the power reduction as a result of polarization mismatch from random antenna orientations. This is particularly useful for hand held mobile terminals typically equipped with only a single linearly polarized antenna, since the average signal power is desensitized against receiver orientations. Numerical simulations also show adequate robustness against incorrect channel estimations.
Resumo:
An autonomous underwater vehicle (AUV) is expected to operate in an ocean in the presence of poorly known disturbance forces and moments. The uncertainties of the environment makes it difficult to apply open-loop control scheme for the motion planning of the vehicle. The objective of this paper is to develop a robust feedback trajectory tracking control scheme for an AUV that can track a prescribed trajectory amidst such disturbances. We solve a general problem of feedback trajectory tracking of an AUV in SE(3). The feedback control scheme is derived using Lyapunov-type analysis. The results obtained from numerical simulations confirm the asymptotic tracking properties of the feedback control law. We apply the feedback control scheme to different mission scenarios, with the disturbances being initial errors in the state of the AUV.
Resumo:
This document outlines the system submitted by the Speech and Audio Research Laboratory at the Queensland University of Technology (QUT) for the Speaker Identity Verification: Application task of EVALITA 2009. This competitive submission consisted of a score-level fusion of three component systems; a joint-factor analysis GMM system and two SVM systems using GLDS and GMM supervector kernels. Development evaluation and post-submission results are presented in this study, demonstrating the effectiveness of this fused system approach. This study highlights the challenges associated with system calibration from limited development data and that mismatch between training and testing conditions continues to be a major source of error in speaker verification technology.
Resumo:
Two archaeal Holliday junction resolving enzymes, Holliday junction cleavage (Hjc) and Holliday junction endonuclease (Hje), have been characterized. Both are members of a nuclease superfamily that includes the type II restriction enzymes, although their DNA cleaving activity is highly specific for four-way junction structure and not nucleic acid sequence. Despite 28% sequence identity, Hje and Hjc cleave junctions with distinct cutting patterns—they cut different strands of a four-way junction, at different distances from the junction centre. We report the high-resolution crystal structure of Hje from Sulfolobus solfataricus. The structure provides a basis to explain the differences in substrate specificity of Hje and Hjc, which result from changes in dimer organization, and suggests a viral origin for the Hje gene. Structural and biochemical data support the modelling of an Hje:DNA junction complex, highlighting a flexible loop that interacts intimately with the junction centre. A highly conserved serine residue on this loop is shown to be essential for the enzyme's activity, suggesting a novel variation of the nuclease active site. The loop may act as a conformational switch, ensuring that the active site is completed only on binding a four-way junction, thus explaining the exquisite specificity of these enzymes.
Resumo:
Segmentation of novel or dynamic objects in a scene, often referred to as background sub- traction or foreground segmentation, is critical for robust high level computer vision applica- tions such as object tracking, object classifca- tion and recognition. However, automatic real- time segmentation for robotics still poses chal- lenges including global illumination changes, shadows, inter-re ections, colour similarity of foreground to background, and cluttered back- grounds. This paper introduces depth cues provided by structure from motion (SFM) for interactive segmentation to alleviate some of these challenges. In this paper, two prevailing interactive segmentation algorithms are com- pared; Lazysnapping [Li et al., 2004] and Grab- cut [Rother et al., 2004], both based on graph- cut optimisation [Boykov and Jolly, 2001]. The algorithms are extended to include depth cues rather than colour only as in the original pa- pers. Results show interactive segmentation based on colour and depth cues enhances the performance of segmentation with a lower er- ror with respect to ground truth.
Resumo:
Cell based therapies for bone regeneration are an exciting emerging technology, but the availability of osteogenic cells is limited and an ideal cell source has not been identified. Amniotic fluid-derived stem (AFS) cells and bone-marrow derived mesenchymal stem cells (MSCs) were compared to determine their osteogenic differentiation capacity in both 2D and 3D environments. In 2D culture, the AFS cells produced more mineralized matrix but delayed peaks in osteogenic markers. Cells were also cultured on 3D scaffolds constructed of poly-e-caprolactone for 15 weeks. MSCs differentiated more quickly than AFS cells on 3D scaffolds, but mineralized matrix production slowed considerably after 5 weeks. In contrast, the rate of AFS cell mineralization continued to increase out to 15 weeks, at which time AFS constructs contained 5-fold more mineralized matrix than MSC constructs. Therefore, cell source should be taken into consideration when used for cell therapy, as the MSCs would be a good choice for immediate matrix production, but the AFS cells would continue robust mineralization for an extended period of time. This study demonstrates that stem cell source can dramatically influence the magnitude and rate of osteogenic differentiation in vitro.
Resumo:
This paper proposes the use of the Bayes Factor as a distance metric for speaker segmentation within a speaker diarization system. The proposed approach uses a pair of constant sized, sliding windows to compute the value of the Bayes Factor between the adjacent windows over the entire audio. Results obtained on the 2002 Rich Transcription Evaluation dataset show an improved segmentation performance compared to previous approaches reported in literature using the Generalized Likelihood Ratio. When applied in a speaker diarization system, this approach results in a 5.1% relative improvement in the overall Diarization Error Rate compared to the baseline.
Resumo:
Video surveillance technology, based on Closed Circuit Television (CCTV) cameras, is one of the fastest growing markets in the field of security technologies. However, the existing video surveillance systems are still not at a stage where they can be used for crime prevention. The systems rely heavily on human observers and are therefore limited by factors such as fatigue and monitoring capabilities over long periods of time. To overcome this limitation, it is necessary to have “intelligent” processes which are able to highlight the salient data and filter out normal conditions that do not pose a threat to security. In order to create such intelligent systems, an understanding of human behaviour, specifically, suspicious behaviour is required. One of the challenges in achieving this is that human behaviour can only be understood correctly in the context in which it appears. Although context has been exploited in the general computer vision domain, it has not been widely used in the automatic suspicious behaviour detection domain. So, it is essential that context has to be formulated, stored and used by the system in order to understand human behaviour. Finally, since surveillance systems could be modeled as largescale data stream systems, it is difficult to have a complete knowledge base. In this case, the systems need to not only continuously update their knowledge but also be able to retrieve the extracted information which is related to the given context. To address these issues, a context-based approach for detecting suspicious behaviour is proposed. In this approach, contextual information is exploited in order to make a better detection. The proposed approach utilises a data stream clustering algorithm in order to discover the behaviour classes and their frequency of occurrences from the incoming behaviour instances. Contextual information is then used in addition to the above information to detect suspicious behaviour. The proposed approach is able to detect observed, unobserved and contextual suspicious behaviour. Two case studies using video feeds taken from CAVIAR dataset and Z-block building, Queensland University of Technology are presented in order to test the proposed approach. From these experiments, it is shown that by using information about context, the proposed system is able to make a more accurate detection, especially those behaviours which are only suspicious in some contexts while being normal in the others. Moreover, this information give critical feedback to the system designers to refine the system. Finally, the proposed modified Clustream algorithm enables the system to both continuously update the system’s knowledge and to effectively retrieve the information learned in a given context. The outcomes from this research are: (a) A context-based framework for automatic detecting suspicious behaviour which can be used by an intelligent video surveillance in making decisions; (b) A modified Clustream data stream clustering algorithm which continuously updates the system knowledge and is able to retrieve contextually related information effectively; and (c) An update-describe approach which extends the capability of the existing human local motion features called interest points based features to the data stream environment.
Resumo:
Increasing awareness of the benefits of stimulating entrepreneurial behaviour in small and medium enterprises has fostered strong interest in innovation programs. Recently many western countries have invested in design innovation for better firm performance. This research presents some early findings from a study of companies which participated in an holistic approach to design innovation, where the outcomes include better business performance and better market positioning in global markets. Preliminary findings from in-depth semi-structured interviews indicate the importance of firm openness to new ways of working and developing new processes of strategic entrepreneurship. Implications for theory and practice are discussed.
Resumo:
Uncooperative iris identification systems at a distance suffer from poor resolution of the captured iris images, which significantly degrades iris recognition performance. Superresolution techniques have been employed to enhance the resolution of iris images and improve the recognition performance. However, all existing super-resolution approaches proposed for the iris biometric super-resolve pixel intensity values. This paper considers transferring super-resolution of iris images from the intensity domain to the feature domain. By directly super-resolving only the features essential for recognition, and by incorporating domain specific information from iris models, improved recognition performance compared to pixel domain super-resolution can be achieved. This is the first paper to investigate the possibility of feature domain super-resolution for iris recognition, and experiments confirm the validity of the proposed approach.
Resumo:
The Electrocardiogram (ECG) is an important bio-signal representing the sum total of millions of cardiac cell depolarization potentials. It contains important insight into the state of health and nature of the disease afflicting the heart. Heart rate variability (HRV) refers to the regulation of the sinoatrial node, the natural pacemaker of the heart by the sympathetic and parasympathetic branches of the autonomic nervous system. The HRV signal can be used as a base signal to observe the heart's functioning. These signals are non-linear and non-stationary in nature. So, higher order spectral (HOS) analysis, which is more suitable for non-linear systems and is robust to noise, was used. An automated intelligent system for the identification of cardiac health is very useful in healthcare technology. In this work, we have extracted seven features from the heart rate signals using HOS and fed them to a support vector machine (SVM) for classification. Our performance evaluation protocol uses 330 subjects consisting of five different kinds of cardiac disease conditions. We demonstrate a sensitivity of 90% for the classifier with a specificity of 87.93%. Our system is ready to run on larger data sets.
Resumo:
Stem cells have attracted tremendous interest in recent times due to their promise in providing innovative new treatments for a great range of currently debilitating diseases. This is due to their potential ability to regenerate and repair damaged tissue, and hence restore lost body function, in a manner beyond the body's usual healing process. Bone marrow-derived mesenchymal stem cells or bone marrow stromal cells are one type of adult stem cells that are of particular interest. Since they are derived from a living human adult donor, they do not have the ethical issues associated with the use of human embryonic stem cells. They are also able to be taken from a patient or other donors with relative ease and then grown readily in the laboratory for clinical application. Despite the attractive properties of bone marrow stromal cells, there is presently no quick and easy way to determine the quality of a sample of such cells. Presently, a sample must be grown for weeks and subject to various time-consuming assays, under the direction of an expert cell biologist, to determine whether it will be useful. Hence there is a great need for innovative new ways to assess the quality of cell cultures for research and potential clinical application. The research presented in this thesis investigates the use of computerised image processing and pattern recognition techniques to provide a quicker and simpler method for the quality assessment of bone marrow stromal cell cultures. In particular, aim of this work is to find out whether it is possible, through the use of image processing and pattern recognition techniques, to predict the growth potential of a culture of human bone marrow stromal cells at early stages, before it is readily apparent to a human observer. With the above aim in mind, a computerised system was developed to classify the quality of bone marrow stromal cell cultures based on phase contrast microscopy images. Our system was trained and tested on mixed images of both healthy and unhealthy bone marrow stromal cell samples taken from three different patients. This system, when presented with 44 previously unseen bone marrow stromal cell culture images, outperformed human experts in the ability to correctly classify healthy and unhealthy cultures. The system correctly classified the health status of an image 88% of the time compared to an average of 72% of the time for human experts. Extensive training and testing of the system on a set of 139 normal sized images and 567 smaller image tiles showed an average performance of 86% and 85% correct classifications, respectively. The contributions of this thesis include demonstrating the applicability and potential of computerised image processing and pattern recognition techniques to the task of quality assessment of bone marrow stromal cell cultures. As part of this system, an image normalisation method has been suggested and a new segmentation algorithm has been developed for locating cell regions of irregularly shaped cells in phase contrast images. Importantly, we have validated the efficacy of both the normalisation and segmentation method, by demonstrating that both methods quantitatively improve the classification performance of subsequent pattern recognition algorithms, in discriminating between cell cultures of differing health status. We have shown that the quality of a cell culture of bone marrow stromal cells may be assessed without the need to either segment individual cells or to use time-lapse imaging. Finally, we have proposed a set of features, that when extracted from the cell regions of segmented input images, can be used to train current state of the art pattern recognition systems to predict the quality of bone marrow stromal cell cultures earlier and more consistently than human experts.
Resumo:
Facial expression is an important channel for human communication and can be applied in many real applications. One critical step for facial expression recognition (FER) is to accurately extract emotional features. Current approaches on FER in static images have not fully considered and utilized the features of facial element and muscle movements, which represent static and dynamic, as well as geometric and appearance characteristics of facial expressions. This paper proposes an approach to solve this limitation using ‘salient’ distance features, which are obtained by extracting patch-based 3D Gabor features, selecting the ‘salient’ patches, and performing patch matching operations. The experimental results demonstrate high correct recognition rate (CRR), significant performance improvements due to the consideration of facial element and muscle movements, promising results under face registration errors, and fast processing time. The comparison with the state-of-the-art performance confirms that the proposed approach achieves the highest CRR on the JAFFE database and is among the top performers on the Cohn-Kanade (CK) database.