Biblioteca Digital

This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product-code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these product-code vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach is introduced that uses a statistical model of the vector codebook indices for subsequent lossless compression. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20 bits, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of low-rate spectral compression methods on the task of automatic speaker identification.
The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.
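
The coupling of product-code VQ with lossless coding of the codebook indices, described in the abstract above, can be illustrated with a minimal Python sketch. It is not the thesis's implementation. Everything concrete in it is an assumption made for illustration: the 3-3-4 split of a 10-dimensional parameter vector, the 256-entry (8-bit) codebooks chosen so that the fixed rate comes to 24 bits per frame, the random untrained codebooks, and the random toy frames standing in for spectral parameters. The function names are hypothetical.

```python
import math
import random
from collections import Counter


def nearest_index(codebook, vector):
    """Full-search VQ: index of the codeword with the smallest squared error."""
    best, best_dist = 0, float("inf")
    for i, codeword in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(codeword, vector))
        if dist < best_dist:
            best, best_dist = i, dist
    return best


def split_vq_encode(codebooks, splits, vector):
    """Split (product-code) VQ: quantize each sub-vector with its own codebook."""
    indices, start = [], 0
    for codebook, width in zip(codebooks, splits):
        indices.append(nearest_index(codebook, vector[start:start + width]))
        start += width
    return indices


def fixed_rate_bits(codebooks):
    """Bits per frame when every index is sent with a fixed-length code."""
    return sum(math.ceil(math.log2(len(cb))) for cb in codebooks)


def empirical_entropy_bits(symbols):
    """Zeroth-order entropy of an index stream, in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def bits_per_second(bits_per_frame, frame_ms):
    """Convert a per-frame bit budget into a bit rate."""
    return bits_per_frame * 1000.0 / frame_ms


# Toy setup (assumed, not from the thesis): three 8-bit codebooks over a
# 3-3-4 split, filled with random codewords rather than trained ones.
rng = random.Random(0)
splits = (3, 3, 4)
codebooks = [[[rng.random() for _ in range(w)] for _ in range(256)] for w in splits]
frames = [[rng.random() for _ in range(sum(splits))] for _ in range(300)]

encoded = [split_vq_encode(codebooks, splits, f) for f in frames]

# Sum of per-stream entropies: a rough target for an ideal entropy coder
# that models each index stream independently. With so few frames the
# estimate is biased low, so it says nothing about real speech.
entropy_total = sum(
    empirical_entropy_bits([e[k] for e in encoded]) for k in range(len(splits))
)

print("fixed rate:", fixed_rate_bits(codebooks), "bits/frame")
print("index entropy estimate:", round(entropy_total, 2), "bits/frame")
print("24 bits per 20 ms frame:", bits_per_second(24, 20), "bit/s")
```

The point of the sketch is only the structure: the lossy stage emits one index per sub-codebook, and any statistical regularity in those indices is what a subsequent lossless coder could exploit. The statistical model used in the thesis itself is not specified in the abstract.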

New methods in speech enhancement, and modelling, with applications to low bit rate speech coding

Audio-visual speech processing

Preferences for visual attributes in the process of selection and location of street trees in the Brisbane metropolitan area

Conceptions of senior visual art programs in a rural remote high school

The depiction of a character's mental state through visual design in dramatic film (The Machine)

Visual arts, technology and education : how can teaching and learning in high school visual arts classrooms be enriched by the use of computer technology?

Abstract:

Whilst a variety of studies have appeared over the last decade addressing the gap between the potential promised by computers and the reality experienced in the classroom by teachers and students, few have specifically addressed the situation as it pertains to the visual arts classroom. The aim of this study was to explore the reality of the classroom use of computers for three visual arts high school teachers and determine how computer technology might enrich visual arts teaching and learning. An action research approach was employed to enable the researcher to understand the situation from the teachers' points of view while contributing to their professional practice. The wider social context surrounding this study is characterised by an increase in visual communications brought about by rapid advances in computer technology. The powerful combination of visual imagery and computer technology is illustrated by continuing developments in the print, film and television industries. In particular, the recent growth of interactive multimedia epitomises this combination and is significant to this study as it represents a new form of publishing of great interest to educators and artists alike. In this social context, visual arts education has a significant role to play. By cultivating a critical awareness of the implications of technology use and promoting a creative approach to the application of computer technology within the visual arts, visual arts education is in a position to provide an essential service to students who will leave high school to participate in a visual information age as both consumers and producers.
