879 resultados para Focused retrieval


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Searching for health advice on the web is becoming increasingly common. Because of the great importance of this activity for patients and clinicians and the effect that incorrect information may have on health outcomes, it is critical to present relevant and valuable information to a searcher. Previous evaluation campaigns on health information retrieval (IR) have provided benchmarks that have been widely used to improve health IR and record these improvements. However, in general these benchmarks have targeted the specialised information needs of physicians and other healthcare workers. In this paper, we describe the development of a new collection for evaluation of effectiveness in IR seeking to satisfy the health information needs of patients. Our methodology features a novel way to create statements of patients’ information needs using realistic short queries associated with patient discharge summaries, which provide details of patient disorders. We adopt a scenario where the patient then creates a query to seek information relating to these disorders. Thus, discharge summaries provide us with a means to create contextually driven search statements, since they may include details on the stage of the disease, family history etc. The collection will be used for the first time as part of the ShARe/-CLEF 2013 eHealth Evaluation Lab, which focuses on natural language processing and IR for clinical care.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Complex numbers are a fundamental aspect of the mathematical formalism of quantum physics. Quantum-like models developed outside physics often overlooked the role of complex numbers. Specifically, previous models in Information Retrieval (IR) ignored complex numbers. We argue that to advance the use of quantum models of IR, one has to lift the constraint of real-valued representations of the information space, and package more information within the representation by means of complex numbers. As a first attempt, we propose a complex-valued representation for IR, which explicitly uses complex valued Hilbert spaces, and thus where terms, documents and queries are represented as complex-valued vectors. The proposal consists of integrating distributional semantics evidence within the real component of a term vector; whereas, ontological information is encoded in the imaginary component. Our proposal has the merit of lifting the role of complex numbers from a computational byproduct of the model to the very mathematical texture that unifies different levels of semantic information. An empirical instantiation of our proposal is tested in the TREC Medical Record task of retrieving cohorts for clinical studies.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents the results of task 3 of the ShARe/CLEF eHealth Evaluation Lab 2013. This evaluation lab focuses on improving access to medical information on the web. The task objective was to investigate the effect of using additional information such as the discharge summaries and external resources such as medical ontologies on the IR effectiveness. The participants were allowed to submit up to seven runs, one mandatory run using no additional information or external resources, and three each using or not using discharge summaries.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Within Human-Computer Interaction (HCI) and Computer Supported Cooperative Work (CSCW) research, the notion of technologically-mediated awareness is often used for allowing relevant people to maintain a mental model of activities, behaviors and status information about each other so that they can organize and coordinate work or other joint activities. The initial conceptions of awareness focused largely on improving productivity and efficiency within work environments. With new social, cultural and commercial needs and the emergence of novel computing technologies, the focus of technologically-mediated awareness has extended from work environments to people’s everyday interactions. Hence, the scope of awareness has extended from conveying work related activities to people’s emotions, love, social status and other broad range of aspects. This trend of conceptualizing HCI design is termed as experience-focused HCI. In my PhD dissertation, designing for awareness, I have reported on how we, as HCI researchers, can design awareness systems from experience-focused HCI perspective that follow the trend of conveying awareness beyond the task-based, instrumental and productive needs. Within the overall aim to design for awareness, my research advocates ethnomethodologically-informed approaches for conceptualizing and designing for awareness. In this sense, awareness is not a predefined phenomenon but something that is situated and particular to a given environment. I have used this approach in two design cases of developing interactive systems that support awareness beyond task-based aspects in work environments. In both the cases, I have followed a complete design cycle: collecting an in-situ understanding of an environment, developing implications for a new technology, implementing a prototype technology to studying the use of the technology in its natural settings.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the field of information retrieval (IR), researchers and practitioners are often faced with a demand for valid approaches to evaluate the performance of retrieval systems. The Cranfield experiment paradigm has been dominant for the in-vitro evaluation of IR systems. Alternative to this paradigm, laboratory-based user studies have been widely used to evaluate interactive information retrieval (IIR) systems, and at the same time investigate users’ information searching behaviours. Major drawbacks of laboratory-based user studies for evaluating IIR systems include the high monetary and temporal costs involved in setting up and running those experiments, the lack of heterogeneity amongst the user population and the limited scale of the experiments, which usually involve a relatively restricted set of users. In this paper, we propose an alternative experimental methodology to laboratory-based user studies. Our novel experimental methodology uses a crowdsourcing platform as a means of engaging study participants. Through crowdsourcing, our experimental methodology can capture user interactions and searching behaviours at a lower cost, with more data, and within a shorter period than traditional laboratory-based user studies, and therefore can be used to assess the performances of IIR systems. In this article, we show the characteristic differences of our approach with respect to traditional IIR experimental and evaluation procedures. We also perform a use case study comparing crowdsourcing-based evaluation with laboratory-based evaluation of IIR systems, which can serve as a tutorial for setting up crowdsourcing-based IIR evaluations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider the following problem: members in a dynamic group retrieve their encrypted data from an untrusted server based on keywords and without any loss of data confidentiality and member’s privacy. In this paper, we investigate common secure indices for conjunctive keyword-based retrieval over encrypted data, and construct an efficient scheme from Wang et al. dynamic accumulator, Nyberg combinatorial accumulator and Kiayias et al. public-key encryption system. The proposed scheme is trapdoorless and keyword-field free. The security is proved under the random oracle, decisional composite residuosity and extended strong RSA assumptions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a study to understand the effect that negated terms (e.g., "no fever") and family history (e.g., "family history of diabetes") have on searching clinical records. Our analysis is aimed at devising the most effective means of handling negation and family history. In doing so, we explicitly represent a clinical record according to its different content types: negated, family history and normal content; the retrieval model weights each of these separately. Empirical evaluation shows that overall the presence of negation harms retrieval effectiveness while family history has little effect. We show negation is best handled by weighting negated content (rather than the common practise of removing or replacing it). However, we also show that many queries benefit from the inclusion of negated content and that negation is optimally handled on a per-query basis. Additional evaluation shows that adaptive handing of negated and family history content can have significant benefits.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Relevation! is a system for performing relevance judgements for information retrieval evaluation. Relevation! is web-based, fully configurable and expandable; it allows researchers to effectively collect assessments and additional qualitative data. The system is easily deployed allowing assessors to smoothly perform their relevance judging tasks, even remotely. Relevation! is available as an open source project at: http://ielab.github.io/relevation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The top-k retrieval problem aims to find the optimal set of k documents from a number of relevant documents given the user’s query. The key issue is to balance the relevance and diversity of the top-k search results. In this paper, we address this problem using Facility Location Analysis taken from Operations Research, where the locations of facilities are optimally chosen according to some criteria. We show how this analysis technique is a generalization of state-of-the-art retrieval models for diversification (such as the Modern Portfolio Theory for Information Retrieval), which treat the top-k search results like “obnoxious facilities” that should be dispersed as far as possible from each other. However, Facility Location Analysis suggests that the top-k search results could be treated like “desirable facilities” to be placed as close as possible to their customers. This leads to a new top-k retrieval model where the best representatives of the relevant documents are selected. In a series of experiments conducted on two TREC diversity collections, we show that significant improvements can be made over the current state-of-the-art through this alternative treatment of the top-k retrieval problem.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The assumptions underlying the Probability Ranking Principle (PRP) have led to a number of alternative approaches that cater or compensate for the PRP’s limitations. All alternatives deviate from the PRP by incorporating dependencies. This results in a re-ranking that promotes or demotes documents depending upon their relationship with the documents that have been already ranked. In this paper, we compare and contrast the behaviour of state-of-the-art ranking strategies and principles. To do so, we tease out analytical relationships between the ranking approaches and we investigate the document kinematics to visualise the effects of the different approaches on document ranking.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we consider the problem of document ranking in a non-traditional retrieval task, called subtopic retrieval. This task involves promoting relevant documents that cover many subtopics of a query at early ranks, providing thus diversity within the ranking. In the past years, several approaches have been proposed to diversify retrieval results. These approaches can be classified into two main paradigms, depending upon how the ranks of documents are revised for promoting diversity. In the first approach subtopic diversification is achieved implicitly, by choosing documents that are different from each other, while in the second approach this is done explicitly, by estimating the subtopics covered by documents. Within this context, we compare methods belonging to the two paradigms. Furthermore, we investigate possible strategies for integrating the two paradigms with the aim of formulating a new ranking method for subtopic retrieval. We conduct a number of experiments to empirically validate and contrast the state-of-the-art approaches as well as instantiations of our integration approach. The results show that the integration approach outperforms state-of-the-art strategies with respect to a number of measures.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we describe the approaches adopted to generate the runs submitted to ImageCLEFPhoto 2009 with an aim to promote document diversity in the rankings. Four of our runs are text based approaches that employ textual statistics extracted from the captions of images, i.e. MMR [1] as a state of the art method for result diversification, two approaches that combine relevance information and clustering techniques, and an instantiation of Quantum Probability Ranking Principle. The fifth run exploits visual features of the provided images to re-rank the initial results by means of Factor Analysis. The results reveal that our methods based on only text captions consistently improve the performance of the respective baselines, while the approach that combines visual features with textual statistics shows lower levels of improvements.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

While the Probability Ranking Principle for Information Retrieval provides the basis for formal models, it makes a very strong assumption regarding the dependence between documents. However, it has been observed that in real situations this assumption does not always hold. In this paper we propose a reformulation of the Probability Ranking Principle based on quantum theory. Quantum probability theory naturally includes interference effects between events. We posit that this interference captures the dependency between the judgement of document relevance. The outcome is a more sophisticated principle, the Quantum Probability Ranking Principle, that provides a more sensitive ranking which caters for interference/dependence between documents’ relevance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider the following problem: users in a dynamic group store their encrypted documents on an untrusted server, and wish to retrieve documents containing some keywords without any loss of data confidentiality. In this paper, we investigate common secure indices which can make multi-users in a dynamic group to obtain securely the encrypted documents shared among the group members without re-encrypting them. We give a formal definition of common secure index for conjunctive keyword-based retrieval over encrypted data (CSI-CKR), define the security requirement for CSI-CKR, and construct a CSI-CKR based on dynamic accumulators, Paillier’s cryptosystem and blind signatures. The security of proposed scheme is proved under strong RSA and co-DDH assumptions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A dynamic accumulator is an algorithm, which merges a large set of elements into a constant-size value such that for an element accumulated, there is a witness confirming that the element was included into the value, with a property that accumulated elements can be dynamically added and deleted into/from the original set. Recently Wang et al. presented a dynamic accumulator for batch updates at ICICS 2007. However, their construction suffers from two serious problems. We analyze them and propose a way to repair their scheme. We use the accumulator to construct a new scheme for common secure indices with conjunctive keyword-based retrieval.