199 results for Speech and Audio Research Laboratory
Abstract:
I must admit that I approached the European Union-supported educational research 1995-2003: Briefing papers for policy makers with a sense of trepidation. As a researcher who defines himself as socially critical, I wondered about the dynamics of a policy document published by a bureaucracy that has, in some form, a vested interest in the structure and operation of education in its various guises. In writing this review, I decided to focus on the third guiding question, which holds that education and training "are strongly interconnected with concerns that include citizenship and democratic participation, inequalities and social justice, cultural diversity and quality of life" (Millei, 2005). The Briefing Papers include recommendations on democracy and citizenship, social exclusion and equality, gender and dealing with mental illness in schools...
Abstract:
Public submission # 247 to the McKeon Review. The submission addresses the terms of reference on: How can we optimise translation of health and medical research into better health and wellbeing? (Terms of Reference 4, 8, 9, 10 and 11)
Abstract:
We propose a conceptual model based on person–environment interaction, job performance, and motivational theories to structure a multilevel review of the employee green behavior (EGB) literature and agenda for future research. We differentiate between required EGB prescribed by the organization and voluntary EGB performed at the employees’ discretion. The review investigates institutional-, organizational-, leader-, team-, and employee-level antecedents and outcomes of EGB and factors that mediate and moderate these relationships. We offer suggestions to facilitate the development of the field, and call for future research to adopt a multilevel perspective and to investigate the outcomes of EGB.
Abstract:
There have been sweeping changes in policy and practice on violence against intimate partners over the past several decades. New laws, policies, programs, and research funding have shaped the literature on this topic as well as the contours of violence itself. A substantial portion of the contemporary research literature is devoted to the policies and interventions that affect intimate partner violence. This chapter will first review key policy changes that have shaped interventions in violence against intimate partners. Second, it will map major areas of research on policy and intervention in violence and abuse. Finally, it will propose directions for future research.
Abstract:
An overview of the Centre for Accident Research and Road Safety - Queensland and its main areas of research.
Background to CARRS-Q
Established in 1996 as a joint initiative of:
- Queensland University of Technology
- Queensland Motor Accident Insurance Commission
Based in the Faculty of Health
Our vision is for: a safer world in which injury-related harm is uncommon and unacceptable
Abstract:
We propose a novel technique for conducting robust voice activity detection (VAD) in high-noise recordings. We use Gaussian mixture modeling (GMM) to train two generic models: speech and non-speech. We then score smaller segments of a given (unseen) recording against each of these GMMs to obtain two respective likelihood scores for each segment. These scores are used to compute a dissimilarity measure between pairs of segments and to carry out complete-linkage clustering of the segments into speech and non-speech clusters. We compare the accuracy of our method against state-of-the-art and standardised VAD techniques to demonstrate an absolute improvement of 15% in half-total error rate (HTER) over the best-performing baseline system across the QUT-NOISE-TIMIT database. We then apply our approach to the Audio-Visual Database of American English (AVDBAE) to demonstrate the performance of our algorithm in using visual, audio-visual or a proposed fusion of these features.
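The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the dissimilarity measure, so Euclidean distance between each segment's pair of log-likelihood scores is assumed here, the scores are invented, and the rule for naming the speech cluster (higher mean speech-minus-non-speech margin) is also an assumption.

```python
import math

def complete_linkage_two_clusters(points):
    """Merge clusters agglomeratively until two remain (complete linkage)."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Complete linkage: distance between the two farthest members.
        # Euclidean distance is an assumed stand-in for the paper's measure.
        return max(math.dist(points[a], points[b]) for a in c1 for b in c2)

    while len(clusters) > 2:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def label_segments(scores):
    """Return True for segments assigned to the (assumed) speech cluster.

    scores: list of (speech_log_likelihood, non_speech_log_likelihood).
    """
    clusters = complete_linkage_two_clusters(scores)

    def mean_margin(cluster):
        return sum(scores[i][0] - scores[i][1] for i in cluster) / len(cluster)

    speech_cluster = max(clusters, key=mean_margin)
    labels = [False] * len(scores)
    for i in speech_cluster:
        labels[i] = True
    return labels

# Hypothetical per-segment scores against the two GMMs.
example_scores = [(-10, -20), (-11, -19), (-25, -12), (-24, -13), (-9, -21)]
print(label_segments(example_scores))
```

Because the clustering is done per recording, the decision boundary adapts to that recording's noise conditions rather than relying on a fixed likelihood-ratio threshold, which is consistent with the motivation given in the abstract.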
Abstract:
This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product-code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these product-code vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20 bits, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of low-rate spectral compression methods on the task of automatic speaker identification.
The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.
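The coupling of lossy VQ with lossless coding of the codebook indices, described in the thesis abstract above, can be sketched as follows. This is an illustration under stated assumptions, not the thesis method: the product code is taken to be a simple split VQ (consecutive sub-vectors, one codebook each), the search is exhaustive, the codebooks and vectors are invented, and the empirical zeroth-order entropy is used only as a rough indicator of what an ideal lossless coder of the indices could achieve.

```python
import math
from collections import Counter

def nearest_index(codebook, vector):
    """Full-search VQ: index of the codeword with the smallest squared error."""
    def squared_error(codeword):
        return sum((c - v) ** 2 for c, v in zip(codeword, vector))
    return min(range(len(codebook)), key=lambda k: squared_error(codebook[k]))

def product_code_encode(codebooks, vector):
    """Split the vector into consecutive sub-vectors, one per codebook
    (split VQ, assumed here as the product-code structure)."""
    indices, start = [], 0
    for codebook in codebooks:
        width = len(codebook[0])
        indices.append(nearest_index(codebook, vector[start:start + width]))
        start += width
    return indices

def empirical_entropy_bits(indices):
    """Zeroth-order entropy of an index stream, in bits per index."""
    counts = Counter(indices)
    total = len(indices)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def bits_per_second(bits_per_frame, frame_ms):
    """Convert a per-frame bit budget into a bit rate."""
    return bits_per_frame * 1000 / frame_ms

# Hypothetical two-stage codebooks: a 2-D sub-vector and a 1-D sub-vector.
toy_codebooks = [[(0.0, 0.0), (1.0, 1.0)], [(0.0,), (5.0,)]]
print(product_code_encode(toy_codebooks, (0.9, 1.2, 4.0)))
print(bits_per_second(24, 20), bits_per_second(20, 20))
```

In rate terms, the figures quoted in the abstract correspond to 1200 bits per second at 24 bits per 20 ms frame, and 1000 bits per second at 20 bits per frame. Lossless coding can only help when the index distribution is non-uniform, that is, when the entropy per index falls below the fixed-length index size.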
Abstract:
For several reasons, the Fourier phase domain is less favored than the magnitude domain in signal processing and modeling of speech. To correctly analyze the phase, several factors must be considered and compensated for, including the effect of the step size, windowing function and other processing parameters. Building on a review of these factors, this paper investigates a spectral representation based on the Instantaneous Frequency Deviation, but in which the step size between processing frames is used in calculating phase changes, rather than the traditional single sample interval. Reflecting these longer intervals, the term delta-phase spectrum is used to distinguish this from instantaneous derivatives. Experiments show that mel-frequency cepstral coefficient features derived from the delta-phase spectrum (termed Mel-Frequency delta-phase features) can produce broadly similar performance to equivalent magnitude domain features for both voice activity detection and speaker recognition tasks. Further, it is shown that the fusion of the magnitude and phase representations yields performance benefits over either in isolation.
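One plausible reading of a frame-to-frame phase change can be sketched as below. The abstract does not give the paper's exact definition or normalization, so everything here is an assumption: the phase of each bin is differenced across frames separated by the step size, the phase advance expected for the bin's centre frequency is removed, and the result is wrapped to the principal interval. The Hann window, frame length and step are illustrative choices only.

```python
import cmath
import math

def dft_half(frame):
    """Naive DFT returning bins 0..N/2 of a real frame."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
            for k in range(n // 2 + 1)]

def wrap(phi):
    """Wrap an angle into the interval [-pi, pi)."""
    return (phi + math.pi) % (2 * math.pi) - math.pi

def delta_phase(signal, frame_len, step):
    """Per-bin phase change between frames `step` samples apart,
    with the expected advance of each bin's centre frequency removed."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len)
              for t in range(frame_len)]  # periodic Hann window (assumed)
    starts = range(0, len(signal) - frame_len + 1, step)
    phases = []
    for s in starts:
        frame = [w * x for w, x in zip(window, signal[s:s + frame_len])]
        phases.append([cmath.phase(c) for c in dft_half(frame)])
    deltas = []
    for prev, cur in zip(phases, phases[1:]):
        deltas.append([
            wrap(cur[k] - prev[k] - 2 * math.pi * k * step / frame_len)
            for k in range(len(cur))
        ])
    return deltas

# A sinusoid centred exactly on bin 4 should show near-zero deviation there.
N, STEP, K0 = 32, 10, 4
tone = [math.cos(2 * math.pi * K0 * t / N) for t in range(64)]
deviation = delta_phase(tone, N, STEP)
print([round(abs(frame[K0]), 6) for frame in deviation])
```

Using the frame step rather than a single-sample interval means the difference reuses the same frames already computed for magnitude features, at the cost of more phase wrapping for larger steps.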
Abstract:
Audio-visual speech recognition, or the combination of visual lip-reading with traditional acoustic speech recognition, has previously been shown to provide a considerable improvement over acoustic-only approaches in noisy environments, such as those present in an automotive cabin. The research presented in this paper extends the established audio-visual speech recognition literature to show that further improvements in speech recognition accuracy can be obtained when multiple frontal or near-frontal views of a speaker's face are available. A series of visual speech recognition experiments using a four-stream visual synchronous hidden Markov model (SHMM) are conducted on the four-camera AVICAR automotive audio-visual speech database. We study the relative contribution of the side and centrally oriented cameras in improving visual speech recognition accuracy. Finally, combination of the four visual streams with a single audio stream in a five-stream SHMM demonstrates a relative improvement of over 56% in word recognition accuracy when compared to the acoustic-only approach in the noisiest conditions of the AVICAR database.
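In a synchronous multi-stream HMM, the per-state observation score is commonly formed as a weighted sum of the per-stream log-likelihoods. The sketch below shows that combination and the usual definition of relative improvement. The stream weights and accuracies are hypothetical, and the abstract does not state how the paper set its weights or computed its 56% figure.

```python
def fused_log_likelihood(stream_log_likelihoods, stream_weights):
    """Weighted sum of per-stream log-likelihoods for one HMM state and frame."""
    if len(stream_log_likelihoods) != len(stream_weights):
        raise ValueError("one weight is required per stream")
    return sum(w * ll for w, ll in zip(stream_weights, stream_log_likelihoods))

def relative_improvement(baseline_accuracy, new_accuracy):
    """Improvement expressed as a fraction of the baseline accuracy."""
    return (new_accuracy - baseline_accuracy) / baseline_accuracy

# Hypothetical five-stream example: one audio stream plus four camera views.
audio_and_video_scores = [-42.0, -55.0, -53.0, -58.0, -60.0]
weights = [0.6, 0.1, 0.1, 0.1, 0.1]  # assumed values, not from the paper
print(fused_log_likelihood(audio_and_video_scores, weights))
print(relative_improvement(50.0, 75.0))
```

Lowering the audio weight as noise increases is the usual way such a model shifts reliance toward the visual streams.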
Abstract:
Background: Recent initiatives within an Australian public healthcare service have seen a focus on increasing the research capacity of their workforce. One of the key initiatives involves encouraging clinicians to be research generators rather than solely research consumers. As a result, baseline data of current research capacity are essential to determine whether initiatives encouraging clinicians to undertake research have been effective. Speech pathologists have previously been shown to be interested in conducting research within their clinical role; therefore they are well positioned to benefit from such initiatives. The present study examined the current research interest, confidence and experience of speech language pathologists (SLPs) in a public healthcare workforce, as well as factors that predicted clinician research engagement. Methods: Data were collected via an online survey emailed to an estimated 330 SLPs working within Queensland, Australia. The survey consisted of 30 questions relating to current levels of interest, confidence and experience performing specific research tasks, as well as how frequently SLPs had performed these tasks in the last 5 years. Results: Although 158 SLPs responded to the survey, complete data were available for only 137. Respondents were more confident and experienced with basic research tasks (e.g., finding literature) and less confident and experienced with complex research tasks (e.g., analysing and interpreting results, publishing results). For most tasks, SLPs displayed higher levels of interest in the task than confidence and experience. Research engagement was predicted by highest qualification obtained, current job classification level and overall interest in research. Conclusions: Respondents generally reported levels of interest in research higher than their confidence and experience, with many respondents reporting limited experience in most research tasks.
Therefore SLPs have potential to benefit from research capacity building activities to increase their research skills in order to meet organisational research engagement objectives. However, these findings must be interpreted with the caveats that a relatively low response rate occurred and participants were recruited from a single state-wide health service, and therefore may not be representative of the wider SLP workforce.
Abstract:
Teachers' failure to utilise microcomputer-based laboratory (MBL) activities more widely may be due to not recognising their capacity to transform the nature of laboratory activities to be more consistent with contemporary constructivist theories of learning. This research aimed to increase understanding of how MBL activities specifically designed to be consistent with a constructivist theory of learning support or constrain student construction of understanding. The first author conducted the research with his Year 11 physics class of 29 students. Dyads completed nine tasks relating to kinematics using a Predict-Observe-Explain format. Data sources included video and audio recordings of students and teacher during four 70-minute sessions, students' display graphs and written notes, semi-structured student interviews, and the teacher's journal. The study identifies the actors and describes the patterns of interactions in the MBL. Analysis of students' discourse and actions identified many instances where students' initial understandings of kinematics were mediated in multiple ways. Students invented numerous techniques for manipulating data in the service of their emerging understanding. The findings are presented as eight assertions. Recommendations are made for developing pedagogical strategies incorporating MBL activities which will likely catalyse student construction of understanding.
Abstract:
The School of Electrical and Electronic Systems Engineering of Queensland University of Technology (like many other universities around the world) has recognised the importance of complementing the teaching of signal processing with computer-based experiments. A laboratory has been developed to provide a "hands-on" approach to the teaching of signal processing techniques. The motivation for the development of this laboratory was the cliche "What I hear I remember but what I do I understand." The laboratory is named the "Signal Computing and Real-time DSP Laboratory" and provides practical training to approximately 150 final year undergraduate students each year. The paper describes the novel features of the laboratory, techniques used in the laboratory-based teaching, interesting aspects of the experiments that have been developed and student evaluation of the teaching techniques.