Biblioteca Digital

Information Retrieval is an important albeit imperfect component of information technologies. A problem of insufficient diversity of retrieved documents is one of the primary issues studied in this research. This study shows that this problem leads to a decrease of precision and recall, traditional measures of information retrieval effectiveness. This thesis presents an adaptive IR system based on the theory of adaptive dual control. The aim of the approach is the optimization of retrieval precision after all feedback has been issued. This is done by increasing the diversity of retrieved documents. This study shows that the value of recall reflects this diversity. The Probability Ranking Principle is viewed in the literature as the “bedrock” of current probabilistic Information Retrieval theory. Neither the proposed approach nor other methods of diversification of retrieved documents from the literature conform to this principle. This study shows by counterexample that the Probability Ranking Principle does not in general lead to optimal precision in a search session with feedback (for which it may not have been designed but is actively used). Retrieval precision of the search session should be optimized with a multistage stochastic programming model to accomplish the aim. However, such models are computationally intractable. Therefore, approximate linear multistage stochastic programming models are derived in this study, where the multistage improvement of the probability distribution is modelled using the proposed feedback correctness method. The proposed optimization models are based on several assumptions, starting with the assumption that Information Retrieval is conducted in units of topics. The use of clusters is the primary reasons why a new method of probability estimation is proposed. The adaptive dual control of topic-based IR system was evaluated in a series of experiments conducted on the Reuters, Wikipedia and TREC collections of documents. The Wikipedia experiment revealed that the dual control feedback mechanism improves precision and S-recall when all the underlying assumptions are satisfied. In the TREC experiment, this feedback mechanism was compared to a state-of-the-art adaptive IR system based on BM-25 term weighting and the Rocchio relevance feedback algorithm. The baseline system exhibited better effectiveness than the cluster-based optimization model of ADTIR. The main reason for this was insufficient quality of the generated clusters in the TREC collection that violated the underlying assumption.

Veja mais

Efficient and accurate retrieval of business process models through indexing

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Recent years have seen an increased uptake of business process management technology in industries. This has resulted in organizations trying to manage large collections of business process models. One of the challenges facing these organizations concerns the retrieval of models from large business process model repositories. For example, in some cases new process models may be derived from existing models, thus finding these models and adapting them may be more effective than developing them from scratch. As process model repositories may be large, query evaluation may be time consuming. Hence, we investigate the use of indexes to speed up this evaluation process. Experiments are conducted to demonstrate that our proposal achieves a significant reduction in query evaluation time.

Veja mais

Data collection, storage and retrieval with an underwater sensor network

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present a novel platform for underwater sensor networks to be used for long-term monitoring of coral reefs and �sheries. The sensor network consists of static and mobile underwater sensor nodes. The nodes communicate point-to-point using a novel high-speed optical communication system integrated into the TinyOS stack, and they broadcast using an acoustic protocol integrated in the TinyOS stack. The nodes have a variety of sensing capabilities, including cameras, water temperature, and pressure. The mobile nodes can locate and hover above the static nodes for data muling, and they can perform network maintenance functions such as deployment, relocation, and recovery. In this paper we describe the hardware and software architecture of this underwater sensor network. We then describe the optical and acoustic networking protocols and present experimental networking and data collected in a pool, in rivers, and in the ocean. Finally, we describe our experiments with mobility for data muling in this network.

Veja mais

Representation, construction, evolution and retrieval of fuzzy shapes to support conceptual design

Relevância:

20.00% 20.00%

Publicador:

Veja mais

From Greek Temple to Bird's Nest: Towards A Theory of Coherence and Mutual Accountability for National Integrity Systems

Relevância:

20.00% 20.00%

Publicador:

Veja mais

Analysis of the effect of negation on information retrieval of medical data

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Most information retrieval (IR) models treat the presence of a term within a document as an indication that the document is somehow "about" that term, they do not take into account when a term might be explicitly negated. Medical data, by its nature, contains a high frequency of negated terms - e.g. "review of systems showed no chest pain or shortness of breath". This papers presents a study of the effects of negation on information retrieval. We present a number of experiments to determine whether negation has a significant negative affect on IR performance and whether language models that take negation into account might improve performance. We use a collection of real medical records as our test corpus. Our findings are that negation has some affect on system performance, but this will likely be confined to domains such as medical data where negation is prevalent.

Veja mais

A hybrid Chinese information retrieval model

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A distinctive feature of Chinese test is that a Chinese document is a sequence of Chinese with no space or boundary between Chinese words. This feature makes Chinese information retrieval more difficult since a retrieved document which contains the query term as a sequence of Chinese characters may not be really relevant to the query since the query term (as a sequence Chinese characters) may not be a valid Chinese word in that documents. On the other hand, a document that is actually relevant may not be retrieved because it does not contain the query sequence but contains other relevant words. In this research, we propose a hybrid Chinese information retrieval model by incorporating word-based techniques with the traditional character-based techniques. The aim of this approach is to investigate the influence of Chinese segmentation on the performance of Chinese information retrieval. Two ranking methods are proposed to rank retrieved documents based on the relevancy to the query calculated by combining character-based ranking and word-based ranking. Our experimental results show that Chinese segmentation can improve the performance of Chinese information retrieval, but the improvement is not significant if it incorporates only Chinese segmentation with the traditional character-based approach.

Veja mais

Current research in focused retrieval and result aggregation

Relevância:

20.00% 20.00%

Publicador:

Veja mais

From Greek temple to bird's nest : mapping, assessing and understanding national integrity systems

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, I would like to outline the approach we have taken to mapping and assessing integrity systems and how this has led us to see integrity systems in a new light. Indeed, it has led us to a new visual metaphor for integrity systems – a bird’s nest rather than a Greek temple. This was the result of a pair of major research projects completed in partnership with Transparency International (TI). One worked on refining and extending the measurement of corruption. This, the second, looked at what was then the emerging institutional means for reducing corruption – ‘national integrity systems’

Veja mais

Evaluating medical information retrieval

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a framework for evaluating information retrieval of medical records. We use the BLULab corpus, a large collection of real-world de-identified medical records. The collection has been hand coded by clinical terminol- ogists using the ICD-9 medical classification system. The ICD codes are used to devise queries and relevance judge- ments for this collection. Results of initial test runs using a baseline IR system are provided. Queries and relevance judgements are online to aid further research in medical IR. Please visit: http://koopman.id.au/med_eval.

Veja mais

Cervical spine disc prosthesis: Radiographic, biomechanical and morphological post mortal findings 12 weeks after implantation. A retrieval example

Relevância:

20.00% 20.00%

Publicador:

Veja mais

A landscape genetics approach for quantifying the relative influence of historic and contemporary habitat heterogeneity on the genetic connectivity of a rainforest bird

Relevância:

20.00% 20.00%

Publicador:

Veja mais

998 resultados para bird vocalisation retrieval

Filtro por publicador