Biblioteca Digital

This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples, in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these product-code vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20 bits, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of low-rate spectral compression methods on the task of automatic speaker identification.
The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.
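As an illustration only, the idea of coupling a product-code quantizer with lossless coding of its indices can be sketched in a few lines of Python. This is not the thesis's implementation: the split sizes, the toy codebook, the function names and the use of a zeroth-order entropy estimate as the "statistical model" are all assumptions made for the example. The sketch shows a full-search split VQ (one simple form of product-code VQ), the conversion of a per-frame bit budget into a bit rate, and an entropy estimate that indicates how far a lossless coder could shrink a stream of fixed-length indices.

```python
import math
from collections import Counter


def bits_per_second(bits_per_frame, frame_ms=20.0):
    """Convert a per-frame bit budget into a bit rate in bit/s."""
    return bits_per_frame * 1000.0 / frame_ms


def nearest_index(vector, codebook):
    """Full-search VQ: index of the codeword with the least squared error."""
    def sq_err(codeword):
        return sum((v - c) ** 2 for v, c in zip(vector, codeword))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))


def split_vq_encode(vector, codebooks, splits):
    """Split VQ (assumed product-code form): each sub-vector is
    quantized independently with its own codebook."""
    indices = []
    start = 0
    for size, codebook in zip(splits, codebooks):
        indices.append(nearest_index(vector[start:start + size], codebook))
        start += size
    return indices


def empirical_entropy(symbols):
    """Zeroth-order entropy (bits per symbol) of an index stream.
    Assumption: stands in for the unspecified statistical model."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


# Hypothetical toy codebook: 4 codewords of dimension 2 (2 bits per index).
TOY_CODEBOOK = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

if __name__ == "__main__":
    print(bits_per_second(24))  # bit rate for 24 bits per 20 ms frame
    print(bits_per_second(20))  # bit rate for 20 bits per 20 ms frame
    frame = [0.1, 0.0, 0.9, 1.0]
    print(split_vq_encode(frame, [TOY_CODEBOOK, TOY_CODEBOOK], (2, 2)))
    # A skewed index stream needs fewer bits than its fixed 2-bit length.
    print(empirical_entropy([0, 0, 0, 1]))
```

When the index distribution is skewed, the entropy falls below the fixed index length, which is the margin a lossless coder can exploit without changing the spectral distortion introduced by the quantizer.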

Media training in an era of commodified truth

Introducing crop simulation technology using soft systems methodology : some issues in agricultural communication

Moving the actor : towards an holistic approach to training and devising for performance

Languaging the actor : an examination of the terminology used in actor training

Abstract:

Throughout the twentieth century increased interest in the training of actors resulted in the emergence of a plethora of acting theories and innovative theatrical movements in Europe, the UK and the USA. The individuals or groups involved with the formulation of these theories and movements developed specific terminologies, or languages of acting, in an attempt to clearly articulate the nature and the practice of acting according to their particular pedagogy or theatrical aesthetic. Now at the dawning of the twenty-first century, Australia boasts quite a number of schools and university courses professing to train actors. This research aims to discover the language used in actor training on the east coast of Australia today. Using interviews with staff of the National Institute of Dramatic Art, the Victorian College of the Arts, and the Queensland University of Technology as the primary source of data, a constructivist grounded theory has emerged to assess the influence of last century's theatrical theorists and practitioners on Australian training and to ascertain the possibility of a distinctly Australian language of acting.
