2 results for document
at Universidad de Alicante
Abstract:
This paper describes the first participation of the IR-n system in Spoken Document Retrieval, focusing on the experiments carried out before participation and presenting the results obtained. IR-n is a passage-based Information Retrieval system that uses sentence recognition to define passages. The main goal of the experiments is therefore to adapt IR-n to the structure of spoken documents by means of an utterance splitter and the overlapping passage technique, which allow utterances to be matched with sentences.
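The overlapping passage idea can be illustrated with a minimal sketch. It assumes that a passage is a fixed-size window of consecutive units (sentences or utterances) and that the window slides one unit at a time. The abstract does not give IR-n's actual passage definition, so the function name `overlapping_passages` and the `size` parameter are hypothetical.

```python
from typing import List


def overlapping_passages(units: List[str], size: int = 3) -> List[List[str]]:
    """Group consecutive units into windows of `size` units that overlap.

    Hypothetical helper: the window slides by one unit, so neighbouring
    passages share `size - 1` units. This is an assumption, not IR-n's
    documented behaviour.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not units:
        return []
    if len(units) <= size:
        # A short document yields a single passage containing every unit.
        return [list(units)]
    return [units[i:i + size] for i in range(len(units) - size + 1)]


utterances = ["u1", "u2", "u3", "u4"]
passages = overlapping_passages(utterances, size=2)
```

Because each unit appears in several passages, a query match that straddles a unit boundary still falls wholly inside at least one passage, whether the units are written sentences or recognised utterances.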
Abstract:
This paper reports further results of ongoing research analyzing the impact of a range of commonly used statistical and semantic features in the context of extractive text summarization. The features experimented with include word frequency, inverse sentence and term frequencies, stopword filtering, word senses, resolved anaphora and textual entailment. The results demonstrate the relative importance of each feature and the limitations of the available tools. Inverse sentence frequency combined with term frequency yields almost the same results as term frequency combined with stopword filtering, which in turn proved to be a highly competitive baseline. To improve the suboptimal results of anaphora resolution, the system was extended with a second anaphora resolution module. The paper also describes the first attempts at an internal document data representation.
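As a rough illustration of combining term frequency with inverse sentence frequency, the sketch below scores each sentence by the average of tf(w) * log(N / sf(w)) over its words. Here tf is the word's count in the whole document, sf is the number of sentences containing the word, and N is the number of sentences. The abstract does not state the exact formula, tokenisation or normalisation, so these choices and the name `tf_isf_scores` are assumptions.

```python
import math
from collections import Counter
from typing import List


def tf_isf_scores(sentences: List[str]) -> List[float]:
    """Score sentences by averaged term frequency x inverse sentence frequency.

    Assumed, simplified formulation: whitespace tokenisation, lowercasing,
    and length normalisation are illustrative choices only.
    """
    tokenised = [s.lower().split() for s in sentences]
    n = len(tokenised)
    # Document-level term frequency.
    tf = Counter(w for toks in tokenised for w in toks)
    # Number of sentences in which each word occurs.
    sf = Counter(w for toks in tokenised for w in set(toks))
    scores = []
    for toks in tokenised:
        if not toks:
            scores.append(0.0)
            continue
        total = sum(tf[w] * math.log(n / sf[w]) for w in toks)
        scores.append(total / len(toks))
    return scores


doc = ["the cat sat", "the dog sat", "the bird flew away"]
scores = tf_isf_scores(doc)
```

A word that occurs in every sentence, such as "the" here, has an inverse sentence frequency of zero and contributes nothing to any score. This may help explain why the combination behaves much like term frequency with stopword filtering.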