Speaker Normalization Methods for Vowel Cognition: Comparative Analysis Using Neural Network and Nearest Neighbor Classifiers


Autoria(s): Carpenter, Gail A.; Govindarajan, Krishna K.
Data(s)

14/11/2011

14/11/2011

01/05/1993

Resumo

Intrinsic and extrinsic speaker normalization methods are systematically compared using a neural network (fuzzy ARTMAP) and L1 and L2 K-Nearest Neighbor (K-NN) categorizers trained and tested on disjoint sets of speakers of the Peterson-Barney vowel database. Intrinsic methods include one nonscaled, four psychophysical scales (bark, bark with endcorrection, mel, ERB), and three log scales, each tested on four combinations of F0 , F1, F2, F3. Extrinsic methods include four speaker adaptation schemes, each combined with the 32 intrinsic methods: centroid subtraction across all frequencies (CS), centroid subtraction for each frequency (CSi), linear scale (LS), and linear transformation (LT). ARTMAP and KNN show similar trends, with K-NN performing better, but requiring about ten times as much memory. The optimal intrinsic normalization method is bark scale, or bark with endcorrection, using the differences between all frequencies (Diff All). The order of performance for the extrinsic methods is LT, CSi, LS, and CS, with fuzzy ARTMAP performing best using bark scale with Diff All; and K-NN choosing psychophysical measures for all except CSi.

British Petroleum (89-A-1204); Defense Advanced Research Projects Agency (AFOSR-90-0083, ONR-N00014-92-J-4015); National Science Foundation (IRI-90-00530); Office of Naval Research (N00014-91-J-4100); Air Force Office of Scientific Research (F49620-92-J-0225)

Identificador

http://hdl.handle.net/2144/2016

Publicador

Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems

Relação

BU CAS/CNS Technical Reports;CAS/CNS-TR-1993-039

Direitos

Copyright 1993 Boston University. Permission to copy without fee all or part of this material is granted provided that: 1. The copies are not made or distributed for direct commercial advantage; 2. the report title, author, document number, and release date appear, and notice is given that copying is by permission of BOSTON UNIVERSITY TRUSTEES. To copy otherwise, or to republish, requires a fee and / or special permission.

Boston University Trustees

Tipo

Technical Report