Classification ensemble to improve medical named entity recognition


Author(s): Keretna, S; Lim, CP; Creighton, D; Shaban, KB
Contributor(s)

[Unknown]

Date(s)

01/01/2014

Abstract

Accurate Named Entity Recognition (NER) is important for knowledge discovery in text mining. This paper proposes an ensemble machine learning approach to recognise Named Entities (NEs) in unstructured and informal medical text. Specifically, Conditional Random Field (CRF) and Maximum Entropy (ME) classifiers are applied individually to the test data set from the i2b2 2010 medication challenge. Each classifier is trained using a different set of features: the first set focuses on the contextual features of the data, while the second concentrates on the linguistic features of each word. The results of the two classifiers are then combined. The proposed approach achieves an F-score of 81.8%, a considerable improvement over the CRF and ME classifiers individually, which achieve F-scores of 76% and 66.3%, respectively, on the same data set.
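The abstract does not state how the two classifiers' outputs are combined. As a rough illustration only, the sketch below merges per-token labels from two taggers under an assumed rule: keep the label when both agree, prefer an entity label over the outside label "O", and otherwise take the more confident prediction. The function names, the label names, and the confidence values are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# (label, confidence in [0, 1]) for one token; this representation is an assumption.
Prediction = Tuple[str, float]


def combine(crf: List[Prediction], me: List[Prediction]) -> List[str]:
    """Merge two per-token label sequences with an assumed combination rule."""
    if len(crf) != len(me):
        raise ValueError("both classifiers must label the same tokens")
    merged = []
    for (crf_label, crf_conf), (me_label, me_conf) in zip(crf, me):
        if crf_label == me_label:
            merged.append(crf_label)   # both classifiers agree
        elif crf_label == "O":
            merged.append(me_label)    # only ME found an entity
        elif me_label == "O":
            merged.append(crf_label)   # only CRF found an entity
        else:
            # Conflicting entity labels: keep the more confident one.
            merged.append(crf_label if crf_conf >= me_conf else me_label)
    return merged


def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical predictions for the tokens: Take / aspirin / 81 / mg / daily
crf_out = [("O", 0.9), ("MEDICATION", 0.8), ("O", 0.6), ("O", 0.6), ("FREQUENCY", 0.7)]
me_out = [("O", 0.8), ("O", 0.5), ("DOSAGE", 0.7), ("DOSAGE", 0.7), ("DURATION", 0.4)]
merged_labels = combine(crf_out, me_out)
```

Preferring an entity label over "O" is one way an ensemble might raise recall when each classifier captures entities the other misses; other schemes, such as weighted voting, would fit the same interface.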

Identifier

http://hdl.handle.net/10536/DRO/DU:30070382

Language(s)

eng

Publisher

IEEE

Relation

http://dro.deakin.edu.au/eserv/DU:30070382/keretna-classificationensemble-2.pdf

http://dro.deakin.edu.au/eserv/DU:30070382/t035026-evid-confsmc-2014.pdf

http://dro.deakin.edu.au/eserv/DU:30070382/t035125-keretna-classificationensemble-e.pdf

http://www.dx.doi.org/10.1109/SMC.2014.6974324

http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6974324

Rights

2014, IEEE

Keywords

#Machine learning #biomedical named entity recognition #conditional random field #information extraction #maximum entropy #medical text mining

Type

Conference Paper