Factors influencing robustness and effectiveness of conditional random fields in active learning frameworks


Author(s): Kholghi, Mahnoosh; Sitbon, Laurianne; Zuccon, Guido; Nguyen, Anthony
Contributor(s)

Nayak, Richi

Li, Xue

Liu, Lin

Ong, Kok-Leong

Zhao, Yanchang

Kennedy, Paul

Date(s)

18/08/2014

Abstract

Active learning approaches reduce the annotation cost that traditional supervised approaches require to reach the same effectiveness by actively selecting informative instances during the learning phase. However, the effectiveness and robustness of the learnt models are influenced by a number of factors. In this paper we investigate the factors that affect the effectiveness, more specifically the stability and robustness, of active learning models built using conditional random fields (CRFs) for information extraction applications. Stability, defined as a small variation in performance when a small variation in the training data or in the parameters occurs, is a major issue for machine learning models, and even more so in the active learning framework, which aims to minimise the amount of training data required. The factors we investigate are a) the choice of incremental vs. standard active learning, b) the feature set used to represent the text (i.e., morphological, syntactic, or semantic features), and c) the Gaussian prior variance, one of the important CRF parameters. Our empirical findings show that incremental learning and the Gaussian prior variance lead to more stable and robust models across iterations. Our study also demonstrates that orthographical, morphological and contextual features, as a group of basic features, play an important role in learning effective models across all iterations.
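
As an illustration of what "actively selecting informative instances" can mean in practice, the following is a minimal sketch of one selection step in pool-based active learning. The abstract does not state which query strategy the authors used; least-confidence sampling is assumed here purely for illustration, and the function names, instance ids and probabilities are hypothetical.

```python
from typing import Dict, List, Sequence


def least_confidence(probabilities: Sequence[float]) -> float:
    """Uncertainty score: 1 minus the probability of the most likely label."""
    return 1.0 - max(probabilities)


def select_batch(pool: Dict[str, Sequence[float]], batch_size: int) -> List[str]:
    """Return the ids of the `batch_size` most uncertain unlabelled instances.

    `pool` maps an instance id to the label probabilities predicted by the
    current model (hypothetical values; a real system would obtain them from
    the trained model, e.g. a CRF).
    """
    ranked = sorted(pool, key=lambda key: least_confidence(pool[key]), reverse=True)
    return ranked[:batch_size]


# Hypothetical predictions for three unlabelled sentences.
pool = {
    "sent-1": [0.95, 0.03, 0.02],  # confident prediction, low annotation value
    "sent-2": [0.40, 0.35, 0.25],  # very uncertain, high annotation value
    "sent-3": [0.70, 0.20, 0.10],
}

# The selected instances would be sent to an annotator, added to the
# training set, and the model retrained before the next iteration.
selected = select_batch(pool, batch_size=2)
print(selected)
```

In a full loop this step alternates with retraining; whether the model is retrained from scratch or updated from its previous state corresponds to the standard vs. incremental distinction the abstract mentions.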

Format

application/pdf

Identifier

http://eprints.qut.edu.au/79398/

Relation

http://eprints.qut.edu.au/79398/2/Factors_Influencing_Robustness_and_Effectiveness_of_Conditional_Random_Fields_in_Active_Learning_Frameworks.pdf

Kholghi, Mahnoosh, Sitbon, Laurianne, Zuccon, Guido, & Nguyen, Anthony (2014) Factors influencing robustness and effectiveness of conditional random fields in active learning frameworks. In Nayak, Richi, Li, Xue, Liu, Lin, Ong, Kok-Leong, Zhao, Yanchang, & Kennedy, Paul (Eds.) AusDM 2014 : The Twelfth Australasian Data Mining Conference, 27-28 November 2014, Queensland University of Technology, Gardens Point Campus, Brisbane, Australia.

Rights

Copyright 2014 [please consult the authors]

Source

School of Electrical Engineering & Computer Science; School of Information Systems; Science & Engineering Faculty

Keywords #080000 INFORMATION AND COMPUTING SCIENCES #active learning #robustness #effectiveness #conditional random fields #Gaussian prior variance #concept extraction
Type

Conference Paper