The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms

Date(s) | 01/09/2010 |
---|---|
Abstract | The QUT-NOISE-TIMIT corpus consists of 600 hours of noisy speech sequences designed to enable a thorough evaluation of voice activity detection (VAD) algorithms across a wide variety of common background noise scenarios. To construct the final mixed-speech database, over 10 hours of background noise were collected across 10 unique locations covering 5 common noise scenarios, creating the QUT-NOISE corpus. This background noise corpus was then mixed with speech events chosen from the TIMIT clean speech corpus over a wide variety of noise lengths, signal-to-noise ratios (SNRs) and active speech proportions to form the mixed-speech QUT-NOISE-TIMIT corpus. Five baseline VAD systems are evaluated on the QUT-NOISE-TIMIT corpus to validate the data and to show that the variety of noise available will allow for better evaluation of VAD systems than existing approaches in the literature. |
Format | application/pdf |
Identifier | |
Relation | http://eprints.qut.edu.au/38144/1/c38144.pdf http://www.interspeech2010.org/ Dean, David B., Sridharan, Sridha, Vogt, Robert J., & Mason, Michael W. (2010) The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms. In Proceedings of Interspeech 2010, Makuhari Messe International Convention Complex, Makuhari, Japan. |
Rights | Copyright 2010 [please consult the authors] |
Source | Faculty of Built Environment and Engineering; Information Security Institute; School of Engineering Systems |
Keywords | #090609 Signal Processing #voice activity detection #speech databases #evaluation protocols |
Type | Conference Paper |
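
The abstract describes mixing background noise with clean speech at a range of signal-to-noise ratios. As a rough illustration of what mixing at a target SNR means, the sketch below scales a noise segment so that the speech-to-noise power ratio matches a requested value in dB, then adds the two signals. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names `power` and `mix_at_snr` are hypothetical, signals are plain lists of float samples of equal length, and the choice to scale the noise rather than the speech is an assumption made here for simplicity.

```python
import math


def power(samples):
    """Return the mean squared amplitude of a non-empty sample sequence."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` after scaling the noise to the target SNR.

    Assumption (not taken from the paper): the noise is rescaled and the
    speech is left unchanged. Both inputs must have the same length.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_power = power(noise)
    if noise_power == 0.0:
        raise ValueError("noise segment has zero power")

    # SNR_dB = 10 * log10(P_speech / P_noise_scaled)
    # so P_noise_scaled = P_speech / 10^(SNR_dB / 10).
    target_noise_power = power(speech) / (10.0 ** (snr_db / 10.0))

    # Scaling amplitudes by g scales power by g^2.
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]
```

Calling `mix_at_snr(speech, noise, 5.0)` on two equal-length sample lists returns a mixture in which the scaled noise sits 5 dB below the speech in average power. Measuring power over the whole segment, rather than over active speech only, is a further simplification.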