31 resultados para coverage guarentee


Relevância:

10.00% 10.00%

Publicador:

Resumo:

In natural languages multiple word sequences can represent the same underlying meaning. Only modelling the observed surface word sequence can result in poor context coverage, for example, when using n-gram language models (LM). To handle this issue, this paper presents a novel form of language model, the paraphrastic LM. A phrase level transduction model that is statistically learned from standard text data is used to generate paraphrase variants. LM probabilities are then estimated by maximizing their marginal probability. Significant error rate reductions of 0.5%-0.6% absolute were obtained on a state-ofthe-art conversational telephone speech recognition task using a paraphrastic multi-level LM modelling both word and phrase sequences.