Digital Library

MixKMeans: Clustering Question-Answer Archives

**Author(s):** Padmanabhan, Deepak
Date(s)	29/07/2016
Abstract	Community-driven Question Answering (CQA) systems crowdsource experiential information in the form of questions and answers, and have accumulated valuable reusable knowledge. Clustering QA datasets from CQA systems provides a means of organizing the content to ease tasks such as manual curation and tagging. In this paper, we present a clustering method that exploits the two-part question-answer structure of QA datasets to improve clustering quality. Our method, MixKMeans, composes question-space and answer-space similarities so that the space in which the match is higher is allowed to dominate. This construction is motivated by our observation that semantic similarity between question-answer pairs (QAs) can be localized in either space. We empirically evaluate our method on a variety of real-world labeled datasets. Our results indicate that our method significantly outperforms state-of-the-art clustering methods on the task of clustering question-answer archives.
Identifier	http://pure.qub.ac.uk/portal/en/publications/mixkmeans-clustering-questionanswer-archives(8a7bdc12-8be3-4905-b238-c155fad7da29).html http://www.emnlp2016.net/accepted-papers.html
Language(s)	eng
Rights	info:eu-repo/semantics/embargoedAccess
Source	Padmanabhan, D 2016, MixKMeans: Clustering Question-Answer Archives. In Proceedings of the Conference on Empirical Methods in Natural Language Processing 2016. Conference on Empirical Methods in Natural Language Processing, Austin, United States, 2-6 November.
Type	contributionToPeriodical
