An evaluation framework for cross-lingual link discovery


Author(s): Tang, Ling-Xiang; Geva, Shlomo; Trotman, Andrew; Xu, Yue; Itakura, Kelly
Date(s)

2013

Abstract

Cross-Lingual Link Discovery (CLLD) is a new problem in Information Retrieval. The aim is to automatically identify meaningful and relevant hypertext links between documents in different languages. This is particularly helpful in knowledge discovery if a multi-lingual knowledge base is sparse in one language or another, or the topical coverage in each language is different; such is the case with Wikipedia. Techniques for identifying new and topically relevant cross-lingual links are a current topic of interest at NTCIR where the CrossLink task has been running since the 2011 NTCIR-9. This paper presents the evaluation framework for benchmarking algorithms for cross-lingual link discovery evaluated in the context of NTCIR-9. This framework includes topics, document collections, assessments, metrics, and a toolkit for pooling, assessment, and evaluation. The assessments are further divided into two separate sets: manual assessments performed by human assessors; and automatic assessments based on links extracted from Wikipedia itself. Using this framework we show that manual assessment is more robust than automatic assessment in the context of cross-lingual link discovery.

Format

application/pdf

Identifier

http://eprints.qut.edu.au/61914/

Publisher

Elsevier Ltd

Relation

http://eprints.qut.edu.au/61914/1/2013-1.pdf

DOI:10.1016/j.ipm.2013.07.003

Tang, Ling-Xiang, Geva, Shlomo, Trotman, Andrew, Xu, Yue, & Itakura, Kelly (2013) An evaluation framework for cross-lingual link discovery. Information Processing & Management, 50(1), pp. 1-23.

Rights

Copyright 2013 Elsevier Ltd.

This is the author’s version of a work that was accepted for publication in Information Processing & Management. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Information Processing & Management, [VOL 50, ISSUE 1, (2013)] DOI: 10.1016/j.ipm.2013.07.003

Source

School of Electrical Engineering & Computer Science; Science & Engineering Faculty

Keywords

#Assessment #Cross-lingual link discovery #Evaluation framework #Evaluation metrics #Validation #Wikipedia

Type

Journal Article