A novel concept-level approach for ultra-concise opinion summarization


Author(s): Lloret, Elena; Boldrini, Ester; Vodolazova, Tatiana; Martínez-Barco, Patricio; Muñoz, Rafael; Palomar, Manuel
Contributor(s)

Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos

Procesamiento del Lenguaje y Sistemas de Información (GPLSI)

Date(s)

16/09/2016

16/09/2016

15/11/2015

Abstract

Web 2.0 has changed how users consume and interact with information, and has introduced a wide range of new textual genres, such as reviews and microblogs, through which users communicate, exchange, and share opinions. Exploiting this user-generated content is of great value to both users and companies, as it can assist them in their decision-making processes. In this context, automatic methods that help manage online information more quickly are needed. This article therefore proposes and evaluates a novel concept-level approach for ultra-concise abstractive opinion summarization. Our approach integrates syntactic sentence simplification, sentence regeneration, and internal concept representation into the summarization process, and is thus able to generate abstractive summaries, which is one of the most challenging issues for this task. To analyze different settings of our approach, the sentence regeneration module was made optional, leading to two versions of the system (one with sentence regeneration and one without). To test them, a corpus of 400 English texts, gathered from reviews and tweets belonging to two different domains, was used. Although both versions were shown to be reliable methods for generating this type of summary, the results indicate that the version without sentence regeneration yielded better results, improving on the results of a number of state-of-the-art systems by 9%, whereas the version with sentence regeneration proved to be more robust to noisy data.

This research work has been partially funded by the University of Alicante, Generalitat Valenciana, the Spanish Government, and the European Commission through the projects “Tratamiento inteligente de la información para la ayuda a la toma de decisiones” (GRE12-44), “Explotación y tratamiento de la información disponible en Internet para la anotación y generación de textos adaptados al usuario” (GRE13-15), DIIM2.0 (PROMETEOII/2014/001), ATTOS (TIN2012-38536-C03-03), LEGOLANG-UAGE (TIN2012-31224), SAM (FP7-611312), and FIRST (FP7-287607).

Identifier

Expert Systems with Applications. 2015, 42(20): 7148-7156. doi:10.1016/j.eswa.2015.05.026

0957-4174 (Print)

1873-6793 (Online)

http://hdl.handle.net/10045/57961

10.1016/j.eswa.2015.05.026

Language(s)

eng

Publisher

Elsevier

Relation

http://dx.doi.org/10.1016/j.eswa.2015.05.026

info:eu-repo/grantAgreement/EC/FP7/611312

Rights

© 2015 Elsevier Ltd.

info:eu-repo/semantics/openAccess

Keywords #Text summarization #Ultra-concise opinion summarization #Electronic Word of Mouth #Natural language generation #Lenguajes y Sistemas Informáticos
Type

info:eu-repo/semantics/article