319 results for INFORMATION RETRIEVAL


Relevance:

30.00%

Publisher:

Abstract:

In this paper, we consider the problem of document ranking in a non-traditional retrieval task called subtopic retrieval. This task involves promoting relevant documents that cover many subtopics of a query to early ranks, thus providing diversity within the ranking. In recent years, several approaches have been proposed to diversify retrieval results. These approaches can be classified into two main paradigms, depending on how the ranks of documents are revised to promote diversity. In the first paradigm, subtopic diversification is achieved implicitly, by choosing documents that differ from each other; in the second, it is achieved explicitly, by estimating the subtopics covered by documents. Within this context, we compare methods belonging to the two paradigms. Furthermore, we investigate possible strategies for integrating the two paradigms, with the aim of formulating a new ranking method for subtopic retrieval. We conduct a number of experiments to empirically validate and contrast the state-of-the-art approaches as well as instantiations of our integration approach. The results show that the integration approach outperforms state-of-the-art strategies with respect to a number of measures.

Relevance:

30.00%

Publisher:

Abstract:

In this paper we describe the approaches adopted to generate the runs submitted to ImageCLEFPhoto 2009, with the aim of promoting document diversity in the rankings. Four of our runs are text-based approaches that employ textual statistics extracted from the image captions, namely MMR [1] as a state-of-the-art method for result diversification, two approaches that combine relevance information and clustering techniques, and an instantiation of the Quantum Probability Ranking Principle. The fifth run exploits visual features of the provided images to re-rank the initial results by means of Factor Analysis. The results reveal that our methods based only on text captions consistently improve the performance of the respective baselines, while the approach that combines visual features with textual statistics shows smaller improvements.
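
For readers unfamiliar with MMR (Maximal Marginal Relevance), the following is a minimal sketch of its usual greedy formulation, not the implementation used for the submitted runs. The function name `mmr_rerank`, the trade-off parameter `lam`, and the toy scores are assumptions made for illustration; in practice the relevance and similarity values would come from a retrieval model and a text similarity measure.

```python
from typing import List, Sequence


def mmr_rerank(
    relevance: Sequence[float],
    similarity: Sequence[Sequence[float]],
    k: int,
    lam: float = 0.5,
) -> List[int]:
    """Greedy MMR: return the indices of up to k documents.

    relevance[i]     -- query-document score of document i (assumed given)
    similarity[i][j] -- document-document similarity (assumed given)
    lam              -- hypothetical trade-off: 1.0 = relevance only,
                        lower values penalise redundancy more strongly
    """
    candidates = list(range(len(relevance)))
    selected: List[int] = []
    while candidates and len(selected) < k:
        def mmr_score(i: int) -> float:
            # Redundancy is the similarity to the closest already-selected document.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1.0 - lam) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: documents 0 and 1 are near-duplicates, document 2 is different.
relevance = [0.9, 0.85, 0.5]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
diverse_order = mmr_rerank(relevance, similarity, k=3, lam=0.5)
relevance_order = mmr_rerank(relevance, similarity, k=3, lam=1.0)
```

With `lam=0.5` the near-duplicate document 1 is pushed below the less relevant but novel document 2; with `lam=1.0` the ranking falls back to pure relevance order.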

Relevance:

30.00%

Publisher:

Abstract:

This project is a step forward in the study of text mining, where enhanced text representation with semantic information plays a significant role. It develops effective methods for entity-oriented retrieval, semantic relation identification and text clustering that utilize semantically annotated data. These methods are based on an enriched text representation generated by introducing semantic information extracted from Wikipedia into the input text data. The proposed methods are evaluated against several state-of-the-art benchmark methods on real-life datasets. In particular, this thesis improves the performance of entity-oriented retrieval, identifies different lexical forms of an entity relation and handles the clustering of documents with multiple feature spaces.

Relevance:

30.00%

Publisher:

Abstract:

Automatic Vehicle Identification (AVI) systems are increasingly being used as a new source of travel information. Because these systems relied on expensive new technologies in recent decades, only a few of them were scattered along a network, which made travel-time and average-speed estimation their main objectives. However, as their price dropped, the opportunity to build dense AVI networks arose, as in Brisbane, where more than 250 Bluetooth detectors are now installed. As a consequence, this technology represents an effective means of acquiring accurate time-dependent Origin-Destination information. In order to obtain reliable estimates, however, a number of issues need to be addressed. Some of these problems stem from the structure of a network made up of isolated detectors, while others are inherent to Bluetooth technology (overlapping detection areas, missing detections, ...). The aim of this paper is threefold. First, after presenting the level of detail that can be reached with a network of isolated detectors, we present how we modelled Brisbane's network, keeping only the information valuable for the retrieval of trip information. Second, we give an overview of the issues inherent to Bluetooth technology and propose a method for retrieving the itineraries of individual Bluetooth vehicles. Last, through a comparison with Brisbane Transport Strategic Model results, we highlight the opportunities and the limits of Bluetooth detector networks. The aim of this paper is twofold. We first give a comprehensive overview of the aforementioned issues. Further, we propose a methodology that can be followed in order to cleanse, correct and aggregate Bluetooth data. We postulate that the methods introduced in this paper are the first crucial steps that need to be followed in order to compute accurate Origin-Destination matrices in urban road networks.
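
As a rough illustration of what cleansing and aggregating Bluetooth detections into Origin-Destination counts can look like, the sketch below collapses repeated detections at the same detector, splits each device's records into trips at long time gaps, and counts first-to-last detector pairs. This is not the methodology of the paper: the record layout `(device_id, detector_id, timestamp_s)`, the function name `od_matrix`, and the 30-minute gap threshold are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Detection = Tuple[str, str, int]  # (device_id, detector_id, timestamp in seconds); assumed layout


def _count_trip(trip: List[str], od: Counter) -> None:
    # A trip needs at least two distinct detectors to define an OD pair.
    if len(trip) >= 2:
        od[(trip[0], trip[-1])] += 1


def od_matrix(detections: Iterable[Detection], max_gap_s: int = 1800) -> Counter:
    """Aggregate raw detections into OD counts (illustrative sketch only).

    max_gap_s is a hypothetical threshold: a silence longer than this
    between two detections of the same device starts a new trip.
    """
    by_device: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for device, detector, ts in detections:
        by_device[device].append((ts, detector))

    od: Counter = Counter()
    for records in by_device.values():
        records.sort()  # chronological order per device
        trip: List[str] = []
        last_ts = None
        for ts, detector in records:
            if last_ts is not None and ts - last_ts > max_gap_s:
                _count_trip(trip, od)
                trip = []
            # Collapse consecutive repeated detections at the same detector.
            if not trip or trip[-1] != detector:
                trip.append(detector)
            last_ts = ts
        _count_trip(trip, od)
    return od


# Toy data with invented device and detector identifiers.
sample = [
    ("a", "D1", 0), ("a", "D1", 5), ("a", "D2", 300), ("a", "D3", 600),
    ("a", "D3", 5000), ("a", "D1", 5400),
    ("b", "D1", 10), ("b", "D3", 700),
]
counts = od_matrix(sample)
```

On the toy data, device `a` contributes one D1-to-D3 trip and one D3-to-D1 trip, and device `b` a second D1-to-D3 trip. A real pipeline would also have to handle the overlapping detection areas and missing detections mentioned above, which this sketch ignores.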