Digital Library

An investigation into feature construction to assist word sense disambiguation

**Author(s):** SPECIA, Lucia; SRINIVASAN, Ashwin; JOSHI, Sachindra; RAMAKRISHNAN, Ganesh; NUNES, Maria das Gracas Volpe
Contributor(s)	UNIVERSIDADE DE SÃO PAULO
Date(s)	20/10/2012; 20/10/2012; 2009
Abstract	Identifying the correct sense of a word in context is crucial for many tasks in natural language processing (machine translation is an example). State-of-the-art methods for Word Sense Disambiguation (WSD) build models using hand-crafted features that usually capture shallow linguistic information. Complex background knowledge, such as semantic relationships, is typically either not used, or used in a specialised manner, due to the limitations of the feature-based modelling techniques used. On the other hand, empirical results from the use of Inductive Logic Programming (ILP) systems have repeatedly shown that they can use diverse sources of background knowledge when constructing models. In this paper, we investigate whether this ability of ILP systems could be used to improve the predictive accuracy of models for WSD. Specifically, we examine the use of a general-purpose ILP system as a method to construct a set of features using semantic, syntactic and lexical information. This feature-set is then used by a common modelling technique in the field (a support vector machine) to construct a classifier for predicting the sense of a word. In our investigation we examine one-shot and incremental approaches to feature-set construction applied to monolingual and bilingual WSD tasks. The monolingual tasks use 32 verbs and 85 verbs and nouns (in English) from the SENSEVAL-3 and SemEval-2007 benchmarks, while the bilingual WSD task consists of 7 highly ambiguous verbs in translating from English to Portuguese. The results are encouraging: the ILP-assisted models show substantial improvements over those that simply use shallow features. In addition, incremental feature-set construction appears to identify smaller and better sets of features. Taken together, the results suggest that the use of ILP with diverse sources of background knowledge provides a way of making substantial progress in the field of WSD.
Identifier	MACHINE LEARNING, v.76, n.1, p.109-136, 2009; 0885-6125; http://producao.usp.br/handle/BDPI/28774; 10.1007/s10994-009-5114-x; http://dx.doi.org/10.1007/s10994-009-5114-x
Language(s)	eng
Publisher	SPRINGER
Relation	Machine Learning
Rights	restrictedAccess; Copyright SPRINGER
Keywords	#ILP #Word sense disambiguation #Feature construction #Randomised search #PROPOSITIONALIZATION #Computer Science, Artificial Intelligence
Type	article; proceedings paper; publishedVersion
