A Boundary-Oriented Chinese Segmentation Method Using NGram Mutual Information
Contributor(s) |
Sun, L; Chen, K.J. |
---|---|
Date(s) |
2010 |
Abstract |
This paper describes our participation in the Chinese word segmentation task of CIPS-SIGHAN 2010. We implemented an n-gram mutual information (NGMI) based segmentation algorithm that combines features from unsupervised, supervised and dictionary-based segmentation methods. The algorithm is also combined with a simple strategy for out-of-vocabulary (OOV) word recognition. The evaluation in both open and closed training shows encouraging results for our system. The results for OOV word recognition in the closed training evaluation were, however, unsatisfactory. |
Format |
application/pdf |
Identifier | |
Publisher |
Chinese Information Processing Society of China |
Relation |
http://eprints.qut.edu.au/80001/1/80001_tang_2011006236.pdf Tang, Ling-Xiang, Geva, Shlomo, Trotman, Andrew, & Xu, Yue (2010) A Boundary-Oriented Chinese Segmentation Method Using NGram Mutual Information. In Sun, L & Chen, K.J. (Eds.) Proceedings of the CIPS-SIGHAN Joint Conference on Chinese Language Processing, Chinese Information Processing Society of China, China. |
Source |
School of Electrical Engineering & Computer Science; Science & Engineering Faculty |
Keywords | #080100 ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING #Translation; Chinese Segmentation #Boundary-Oriented Segmentation #Chinese language #N-Gram Mutual Information |
Type |
Conference Paper |