78 results for PBL tutorial search term
in Cambridge University Engineering Department Publications Database
Abstract:
Spoken content in languages of emerging importance needs to be searchable to provide access to the underlying information. In this paper, we investigate the problem of extending data fusion methodologies from Information Retrieval to Spoken Term Detection on low-resource languages in the framework of the IARPA Babel program. We describe a number of alternative methods for improving keyword search performance. We apply these methods to Cantonese, a language that presents some new issues in terms of reduced resources and shorter query lengths. First, we present a score normalization methodology that improves keyword search performance by 20% on average. Second, we show that properly combining the outputs of diverse ASR systems performs 14% better than the best normalized ASR system. © 2013 IEEE.
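The abstract does not say which score normalization was used. As an illustration only, the sketch below shows sum-to-one normalization, one scheme commonly applied to keyword search scores: each detection score is divided by the total score of all detections of the same keyword, so that rare and frequent keywords become comparable under a single threshold. The function name, the tuple layout and the sample values are assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict


def sum_to_one_normalize(detections):
    """Normalize scores so that the detections of each keyword sum to 1.

    detections: list of (keyword, location, score) tuples, score >= 0.
    The tuple layout is hypothetical, chosen only for this sketch.
    """
    # First pass: total raw score per keyword.
    totals = defaultdict(float)
    for keyword, _, score in detections:
        totals[keyword] += score

    # Second pass: divide each score by its keyword's total.
    normalized = []
    for keyword, location, score in detections:
        total = totals[keyword]
        new_score = score / total if total > 0 else 0.0
        normalized.append((keyword, location, new_score))
    return normalized


# Made-up detections: two hits for keyword "a", one weak hit for keyword "b".
hits = [("a", "file1", 0.2), ("a", "file2", 0.6), ("b", "file1", 0.05)]
normalized_hits = sum_to_one_normalize(hits)
```

After normalization the single low-scoring hit for "b" receives a score of 1, which is the intended effect: a keyword with little competing evidence is no longer penalized for its low raw score.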
Abstract:
We present a system for keyword search on Cantonese conversational telephony audio, collected for the IARPA Babel program, that achieves good performance by combining postings lists produced by diverse speech recognition systems from three different research groups. We describe the keyword search task, the data on which the work was done, four different speech recognition systems, and our approach to system combination for keyword search. We show that the combination of four systems outperforms the best single system by 7%, achieving an actual term-weighted value of 0.517. © 2013 IEEE.
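For readers unfamiliar with the metric quoted above, the following is a minimal sketch of the term-weighted value as defined for the NIST keyword search evaluations: one minus the average, over keywords that occur in the reference, of the miss probability plus a weighted false-alarm probability. The "actual" term-weighted value is this quantity computed at the decision threshold the system itself chose. The class name, function name and counts are hypothetical; the weight 999.9 follows from the standard evaluation settings (cost/value ratio 0.1, keyword prior 1e-4), and the number of false-alarm trials is approximated by the seconds of speech minus the true occurrences.

```python
from dataclasses import dataclass

# Standard weight: (C/V) * (1/Pr_kw - 1) with C/V = 0.1 and Pr_kw = 1e-4.
BETA = 999.9


@dataclass
class KeywordCounts:
    n_true: int         # occurrences of the keyword in the reference
    n_correct: int      # system detections matching a reference occurrence
    n_false_alarm: int  # system detections with no matching occurrence


def term_weighted_value(counts, speech_seconds, beta=BETA):
    """Return 1 - mean over keywords of (P_miss + beta * P_fa)."""
    losses = []
    for c in counts:
        if c.n_true == 0:
            continue  # keywords absent from the reference are not scored
        p_miss = 1.0 - c.n_correct / c.n_true
        # Non-target trials: one per second of speech, minus true occurrences.
        p_fa = c.n_false_alarm / (speech_seconds - c.n_true)
        losses.append(p_miss + beta * p_fa)
    if not losses:
        raise ValueError("no keyword occurs in the reference")
    return 1.0 - sum(losses) / len(losses)


# Made-up counts: one keyword with 8 of 10 occurrences found, one fully found.
example = [KeywordCounts(10, 8, 0), KeywordCounts(5, 5, 0)]
twv = term_weighted_value(example, speech_seconds=36000.0)
```

Because of the large weight on false alarms, a handful of spurious detections for a rare keyword can cost as much as missing most of its true occurrences, which is why score normalization and thresholding matter so much for this metric.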
Abstract:
The development of high-performance speech processing systems for low-resource languages is a challenging area. One approach to address the lack of resources is to make use of data from multiple languages. A popular direction in recent years is to use bottleneck features, or hybrid systems, trained on multilingual data for speech-to-text (STT) systems. This paper presents an investigation into the application of these multilingual approaches to spoken term detection. Experiments were run using the IARPA Babel limited language pack corpora (∼10 hours/language) with 4 languages for initial multilingual system development and an additional held-out target language. STT gains achieved through using multilingual bottleneck features in a Tandem configuration are shown to also apply to keyword search (KWS). Further improvements in both STT and KWS were observed by incorporating language questions into the Tandem GMM-HMM decision trees for the training set languages. Adapted hybrid systems performed slightly worse on average than the adapted Tandem systems. A language-independent acoustic model test on the target language showed that at least some retraining or adaptation of the acoustic models to the target language is currently needed to achieve reasonable performance. © 2013 IEEE.
Abstract:
Circadian clocks are 24-h timing devices that phase cellular responses; coordinate growth, physiology, and metabolism; and anticipate the day-night cycle. Here we report sensitivity of the Arabidopsis thaliana circadian oscillator to sucrose, providing evidence that plant metabolism can regulate circadian function. We found that the Arabidopsis circadian system is particularly sensitive to sucrose in the dark. These data suggest that there is feedback between the molecular components that comprise the circadian oscillator and plant metabolism, with the circadian clock both regulating and being regulated by metabolism. We also used simulations within a three-loop mathematical model of the Arabidopsis circadian oscillator to identify components of the circadian clock that are sensitive to sucrose. The mathematical studies identified GIGANTEA (GI) as being associated with sucrose sensing. Experimental validation of this prediction demonstrated that GI is required for the full response of the circadian clock to sucrose. We demonstrate that GI acts as part of the sucrose-signaling network and propose that this role permits metabolic input into circadian timing in Arabidopsis.