6 results for Japanese language -- Orthography and spelling

in Cambridge University Engineering Department Publications Database


Relevance:

100.00% 100.00%

Publisher:

Abstract:

In natural languages multiple word sequences can represent the same underlying meaning. Only modelling the observed surface word sequence can result in poor context coverage, for example, when using n-gram language models (LM). To handle this issue, paraphrastic LMs were proposed in previous research and successfully applied to a US English conversational telephone speech transcription task. In order to exploit the complementary characteristics of paraphrastic LMs and neural network LMs (NNLM), the combination of the two is investigated in this paper. To investigate paraphrastic LMs' generalization ability to other languages, experiments are conducted on a Mandarin Chinese broadcast speech transcription task. Using a paraphrastic multi-level LM modelling both word and phrase sequences, significant error rate reductions of 0.9% absolute (9% relative) and 0.5% absolute (5% relative) were obtained over the baseline n-gram and NNLM systems respectively, after a combination with word and phrase level NNLMs. © 2013 IEEE.
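The abstract above reports each gain both as an absolute and as a relative error rate reduction. As a rough reading aid, the two figures together imply an approximate baseline error rate, since relative reduction = absolute reduction / baseline. The sketch below only does this arithmetic; the function name is hypothetical, and because the published percentages are rounded, the implied baselines are approximations rather than figures taken from the paper.

```python
def implied_baseline(absolute_reduction, relative_reduction):
    """Baseline error rate (in %) implied by an absolute reduction (in %)
    and a relative reduction (as a fraction).

    Hypothetical helper for illustration; not from the paper.
    """
    return absolute_reduction / relative_reduction


# Figures quoted in the abstract (rounded in the source).
ngram_baseline = implied_baseline(0.9, 0.09)  # vs. baseline n-gram system
nnlm_baseline = implied_baseline(0.5, 0.05)   # vs. baseline NNLM system

print(f"implied n-gram baseline error rate: about {ngram_baseline:.1f}%")
print(f"implied NNLM baseline error rate: about {nnlm_baseline:.1f}%")
```

Both pairs of figures point to a baseline error rate in the neighbourhood of 10%, which is consistent with the two systems being compared on the same task.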

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Existing devices for communicating information to computers are bulky, slow to use, or unreliable. Dasher is a new interface incorporating language modelling and driven by continuous two-dimensional gestures, e.g. a mouse, touchscreen, or eye-tracker. Tests have shown that this device can be used to enter text at a rate of up to 34 words per minute, compared with typical ten-finger keyboard typing of 40-60 words per minute. Although the interface is slower than a conventional keyboard, it is small and simple, and could be used on personal data assistants and by motion-impaired computer users.
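The abstract does not spell out how the language model and the pointing gesture interact. One simplified way to picture it, offered here as an assumption rather than as Dasher's actual implementation, is to give each candidate symbol a slice of the display proportional to its predicted probability, so that likely symbols are larger targets. The function names and the toy probabilities below are hypothetical.

```python
def partition_interval(probs):
    """Split [0, 1) into consecutive sub-intervals, one per symbol,
    each with width proportional to that symbol's probability."""
    total = sum(probs.values())
    intervals = {}
    lower = 0.0
    for symbol, p in probs.items():
        upper = lower + p / total
        intervals[symbol] = (lower, upper)
        lower = upper
    return intervals


def select_symbol(intervals, y):
    """Return the symbol whose sub-interval contains pointer position y,
    or None if y falls outside [0, 1)."""
    for symbol, (lo, hi) in intervals.items():
        if lo <= y < hi:
            return symbol
    return None


# Toy next-character distribution (invented values, not from a real model).
toy_probs = {"e": 0.5, "a": 0.25, "t": 0.125, "o": 0.125}
intervals = partition_interval(toy_probs)

# A pointer at vertical position 0.6 lands in the slice assigned to "a".
print(select_symbol(intervals, 0.6))
```

In this picture a well-predicted symbol occupies more of the screen and needs a less precise gesture, which is one plausible reason a language model would raise the entry rate.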

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present a new online psycholinguistic resource for Greek based on analyses of written corpora combined with text processing technologies developed at the Institute for Language & Speech Processing (ILSP), Greece. The "ILSP PsychoLinguistic Resource" (IPLR) is a freely accessible service via a dedicated web page, at http://speech.ilsp.gr/iplr. IPLR provides analyses of user-submitted letter strings (words and nonwords) as well as frequency tables for important units and conditions such as syllables, bigrams, and neighbors, calculated over two word lists based on printed text corpora and their phonetic transcription. Online tools allow retrieval of words matching user-specified orthographic or phonetic patterns. All results and processing code (in the Python programming language) are freely available for noncommercial educational or research use. © 2010 Springer Science+Business Media B.V.
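The kinds of analyses listed in the abstract (bigram frequency tables, neighbors, and pattern-based retrieval) can be illustrated with a minimal Python sketch. This is not the IPLR code: the function names, the toy English word list, and the definition of a neighbor as a same-length word differing in exactly one letter are all assumptions made for illustration.

```python
import re
from collections import Counter


def bigram_counts(words):
    """Count letter bigrams (adjacent letter pairs) over a word list."""
    counts = Counter()
    for word in words:
        counts.update(word[i:i + 2] for i in range(len(word) - 1))
    return counts


def neighbors(target, words):
    """Words of the same length as target that differ in exactly one letter
    (an assumed, substitution-only definition of orthographic neighbor)."""
    return sorted(
        w for w in set(words)
        if len(w) == len(target)
        and sum(a != b for a, b in zip(w, target)) == 1
    )


def match_pattern(pattern, words):
    """Words that match a regular-expression pattern in full."""
    compiled = re.compile(pattern)
    return [w for w in words if compiled.fullmatch(w)]


# Toy word list (hypothetical; the real resource uses Greek corpora).
toy_words = ["cat", "cot", "cut", "coat", "dog"]

print(bigram_counts(toy_words).most_common(2))
print(neighbors("cat", toy_words))
print(match_pattern("c.t", toy_words))
```

The same functions work unchanged on Greek strings, since Python strings are Unicode; a phonetic version would operate on transcriptions instead of spellings.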