988 resultados para audio processing
Resumo:
Sigma-delta modulated systems have a number of very appealing properties and are, therefore, heavily used in analog to digital converters, amplifiers, and modulators. This paper presents new results which indicate that they may also have significant potential for general purpose arithmetic processing.
Resumo:
In this paper we extend the concept of speaker annotation within a single-recording, or speaker diarization, to a collection wide approach we call speaker attribution. Accordingly, speaker attribution is the task of clustering expectantly homogenous intersession clusters obtained using diarization according to common cross-recording identities. The result of attribution is a collection of spoken audio across multiple recordings attributed to speaker identities. In this paper, an attribution system is proposed using mean-only MAP adaptation of a combined-gender UBM to model clusters from a perfect diarization system, as well as a JFA-based system with session variability compensation. The normalized cross-likelihood ratio is calculated for each pair of clusters to construct an attribution matrix and the complete linkage algorithm is employed to conduct clustering of the inter-session clusters. A matched cluster purity and coverage of 87.1% was obtained on the NIST 2008 SRE corpus.
Resumo:
To detect and annotate the key events of live sports videos, we need to tackle the semantic gaps of audio-visual information. Previous work has successfully extracted semantic from the time-stamped web match reports, which are synchronized with the video contents. However, web and social media articles with no time-stamps have not been fully leveraged, despite they are increasingly used to complement the coverage of major sporting tournaments. This paper aims to address this limitation using a novel multimodal summarization framework that is based on sentiment analysis and players' popularity. It uses audiovisual contents, web articles, blogs, and commentators' speech to automatically annotate and visualize the key events and key players in a sports tournament coverage. The experimental results demonstrate that the automatically generated video summaries are aligned with the events identified from the official website match reports.
Resumo:
Stem cells have attracted tremendous interest in recent times due to their promise in providing innovative new treatments for a great range of currently debilitating diseases. This is due to their potential ability to regenerate and repair damaged tissue, and hence restore lost body function, in a manner beyond the body's usual healing process. Bone marrow-derived mesenchymal stem cells or bone marrow stromal cells are one type of adult stem cells that are of particular interest. Since they are derived from a living human adult donor, they do not have the ethical issues associated with the use of human embryonic stem cells. They are also able to be taken from a patient or other donors with relative ease and then grown readily in the laboratory for clinical application. Despite the attractive properties of bone marrow stromal cells, there is presently no quick and easy way to determine the quality of a sample of such cells. Presently, a sample must be grown for weeks and subject to various time-consuming assays, under the direction of an expert cell biologist, to determine whether it will be useful. Hence there is a great need for innovative new ways to assess the quality of cell cultures for research and potential clinical application. The research presented in this thesis investigates the use of computerised image processing and pattern recognition techniques to provide a quicker and simpler method for the quality assessment of bone marrow stromal cell cultures. In particular, aim of this work is to find out whether it is possible, through the use of image processing and pattern recognition techniques, to predict the growth potential of a culture of human bone marrow stromal cells at early stages, before it is readily apparent to a human observer. With the above aim in mind, a computerised system was developed to classify the quality of bone marrow stromal cell cultures based on phase contrast microscopy images. Our system was trained and tested on mixed images of both healthy and unhealthy bone marrow stromal cell samples taken from three different patients. This system, when presented with 44 previously unseen bone marrow stromal cell culture images, outperformed human experts in the ability to correctly classify healthy and unhealthy cultures. The system correctly classified the health status of an image 88% of the time compared to an average of 72% of the time for human experts. Extensive training and testing of the system on a set of 139 normal sized images and 567 smaller image tiles showed an average performance of 86% and 85% correct classifications, respectively. The contributions of this thesis include demonstrating the applicability and potential of computerised image processing and pattern recognition techniques to the task of quality assessment of bone marrow stromal cell cultures. As part of this system, an image normalisation method has been suggested and a new segmentation algorithm has been developed for locating cell regions of irregularly shaped cells in phase contrast images. Importantly, we have validated the efficacy of both the normalisation and segmentation method, by demonstrating that both methods quantitatively improve the classification performance of subsequent pattern recognition algorithms, in discriminating between cell cultures of differing health status. We have shown that the quality of a cell culture of bone marrow stromal cells may be assessed without the need to either segment individual cells or to use time-lapse imaging. Finally, we have proposed a set of features, that when extracted from the cell regions of segmented input images, can be used to train current state of the art pattern recognition systems to predict the quality of bone marrow stromal cell cultures earlier and more consistently than human experts.
Resumo:
This paper describes the design and implementation of a unique undergraduate program in signal processing at the Queensland University of Technology (QUT). The criteria that influenced the choice of the subjects and the laboratories developed to support them are presented. A recently established Signal Processing Research Centre (SPRC) has played an important role in the development of the signal processing teaching program. The SPRC also provides training opportunities for postgraduate studies and research.
Resumo:
The School of Electrical and Electronic Systems Engineering at Queensland University of Technology, Brisbane, Australia (QUT), offers three bachelor degree courses in electrical and computer engineering. In all its courses there is a strong emphasis on signal processing. A newly established Signal Processing Research Centre (SPRC) has played an important role in the development of the signal processing units in these courses. This paper describes the unique design of the undergraduate program in signal processing at QUT, the laboratories developed to support it, and the criteria that influenced the design.
Resumo:
The School of Electrical and Electronic Systems Engineering of Queensland University of Technology (like many other universities around the world) has recognised the importance of complementing the teaching of signal processing with computer based experiments. A laboratory has been developed to provide a "hands-on" approach to the teaching of signal processing techniques. The motivation for the development of this laboratory was the cliche "What I hear I remember but what I do I understand." The laboratory has been named as the "Signal Computing and Real-time DSP Laboratory" and provides practical training to approximately 150 final year undergraduate students each year. The paper describes the novel features of the laboratory, techniques used in the laboratory based teaching, interesting aspects of the experiments that have been developed and student evaluation of the teaching techniques
Resumo:
This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. It has been previously shown in our own work, and in the work of others, that features extracted from a speaker's moving lips hold speaker dependencies which are complementary with speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms the performance of either sub-system. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise
Resumo:
Investigates the use of temporal lip information, in conjunction with speech information, for robust, text-dependent speaker identification. We propose that significant speaker-dependent information can be obtained from moving lips, enabling speaker recognition systems to be highly robust in the presence of noise. The fusion structure for the audio and visual information is based around the use of multi-stream hidden Markov models (MSHMM), with audio and visual features forming two independent data streams. Recent work with multi-modal MSHMMs has been performed successfully for the task of speech recognition. The use of temporal lip information for speaker identification has been performed previously (T.J. Wark et al., 1998), however this has been restricted to output fusion via single-stream HMMs. We present an extension to this previous work, and show that a MSHMM is a valid structure for multi-modal speaker identification
Resumo:
We have developed digital image registration program for a MC 68000 based fundus image processing system (FIPS). FIPS not only is capable of executing typical image processing algorithms in spatial as well as Fourier domain, the execution time for many operations has been made much quicker by using a hybrid of "C", Fortran and MC6000 assembly languages.