515 resultados para NUDIST (Information retrieval system)
Resumo:
The Australian e-Health Research Centre and Queensland University of Technology recently participated in the TREC 2011 Medical Records Track. This paper reports on our methods, results and experience using a concept-based information retrieval approach. Our concept-based approach is intended to overcome specific challenges we identify in searching medical records. Queries and documents are transformed from their term-based originals into medical concepts as de ned by the SNOMED-CT ontology. Results show our concept-based approach performed above the median in all three performance metrics: bref (+12%), R-prec (+18%) and Prec@10 (+6%).
Resumo:
Search technologies are critical to enable clinical sta to rapidly and e ectively access patient information contained in free-text medical records. Medical search is challenging as terms in the query are often general but those in rel- evant documents are very speci c, leading to granularity mismatch. In this paper we propose to tackle granularity mismatch by exploiting subsumption relationships de ned in formal medical domain knowledge resources. In symbolic reasoning, a subsumption (or `is-a') relationship is a parent-child rela- tionship where one concept is a subset of another concept. Subsumed concepts are included in the retrieval function. In addition, we investigate a number of initial methods for combining weights of query concepts and those of subsumed concepts. Subsumption relationships were found to provide strong indication of relevant information; their inclusion in retrieval functions yields performance improvements. This result motivates the development of formal models of rela- tionships between medical concepts for retrieval purposes.
Resumo:
The Australian e-Health Research Centre and Queensland University of Technology recently participated in the TREC 2012 Medical Records Track. This paper reports on our methods, results and experience using an approach that exploits the concept and inter-concept relationships defined in the SNOMED CT medical ontology. Our concept-based approach is intended to overcome specific challenges in searching medical records, namely vocabulary mismatch and granularity mismatch. Queries and documents are transformed from their term-based originals into medical concepts as defined by the SNOMED CT ontology, this is done to tackle vocabulary mismatch. In addition, we make use of the SNOMED CT parent-child `is-a' relationships between concepts to weight documents that contained concept subsumed by the query concepts; this is done to tackle the problem of granularity mismatch. Finally, we experiment with other SNOMED CT relationships besides the is-a relationship to weight concepts related to query concepts. Results show our concept-based approach performed significantly above the median in all four performance metrics. Further improvements are achieved by the incorporation of weighting subsumed concepts, overall leading to improvement above the median of 28% infAP, 10% infNDCG, 12% R-prec and 7% Prec@10. The incorporation of other relations besides is-a demonstrated mixed results, more research is required to determined which SNOMED CT relationships are best employed when weighting related concepts.
Resumo:
IT-supported field data management benefits on-site construction management by improving accessibility to the information and promoting efficient communication between project team members. However, most of on-site safety inspections still heavily rely on subjective judgment and manual reporting processes and thus observers’ experiences often determine the quality of risk identification and control. This study aims to develop a methodology to efficiently retrieve safety-related information so that the safety inspectors can easily access to the relevant site safety information for safer decision making. The proposed methodology consists of three stages: (1) development of a comprehensive safety database which contains information of risk factors, accident types, impact of accidents and safety regulations; (2) identification of relationships among different risk factors based on statistical analysis methods; and (3) user-specified information retrieval using data mining techniques for safety management. This paper presents an overall methodology and preliminary results of the first stage research conducted with 101 accident investigation reports.
Resumo:
This paper presents a graph-based method to weight medical concepts in documents for the purposes of information retrieval. Medical concepts are extracted from free-text documents using a state-of-the-art technique that maps n-grams to concepts from the SNOMED CT medical ontology. In our graph-based concept representation, concepts are vertices in a graph built from a document, edges represent associations between concepts. This representation naturally captures dependencies between concepts, an important requirement for interpreting medical text, and a feature lacking in bag-of-words representations. We apply existing graph-based term weighting methods to weight medical concepts. Using concepts rather than terms addresses vocabulary mismatch as well as encapsulates terms belonging to a single medical entity into a single concept. In addition, we further extend previous graph-based approaches by injecting domain knowledge that estimates the importance of a concept within the global medical domain. Retrieval experiments on the TREC Medical Records collection show our method outperforms both term and concept baselines. More generally, this work provides a means of integrating background knowledge contained in medical ontologies into data-driven information retrieval approaches.
Resumo:
On August 16, 2012 the SIGIR 2012 Workshop on Open Source Information Retrieval was held as part of the SIGIR 2012 conference in Portland, Oregon, USA. There were 2 invited talks, one from industry and one from academia. There were 6 full papers and 6 short papers presented as well as demonstrations of 4 open source tools. Finally there was a lively discussion on future directions for the open source Information Retrieval community. This contribution discusses the events of the workshop and outlines future directions for the community.
Resumo:
Measures of semantic similarity between medical concepts are central to a number of techniques in medical informatics, including query expansion in medical information retrieval. Previous work has mainly considered thesaurus-based path measures of semantic similarity and has not compared different corpus-driven approaches in depth. We evaluate the effectiveness of eight common corpus-driven measures in capturing semantic relatedness and compare these against human judged concept pairs assessed by medical professionals. Our results show that certain corpus-driven measures correlate strongly (approx 0.8) with human judgements. An important finding is that performance was significantly affected by the choice of corpus used in priming the measure, i.e., used as evidence from which corpus-driven similarities are drawn. This paper provides guidelines for the implementation of semantic similarity measures for medical informatics and concludes with implications for medical information retrieval.
Resumo:
This project was a step forward in developing and evaluating a novel, mathematical model that can deduce the meaning of words based on their use in language. This model can be applied to a wide range of natural language applications, including the information seeking process most of us undertake on a daily basis.
Resumo:
Information skills instruction for research candidates bas recently been formalised as coursework at the Queensland University of Technology. Feedback solicited from participants suggests that students benefit from such coursework in a number of ways. Their perception of the value of specific content areas to their literature review and thesis presentation is favourable. A small group of students who participated in Interviews identified five ways in which the coursework assisted the research process. As Instructors continue to work with the post·graduate community it would be useful to deepen our understanding of how such instruction is perceived and the benefits which can be derived from it.
Resumo:
Retrieval with Logical Imaging is derived from belief revision and provides a novel mechanism for estimating the relevance of a document through logical implication (i.e. P(q -> d)). In this poster, we perform the first comprehensive evaluation of Logical Imaging (LI) in Information Retrieval (IR) across several TREC test Collections. When compared against standard baseline models, we show that LI fails to improve performance. This failure can be attributed to a nuance within the model that means non-relevant documents are promoted in the ranking, while relevant documents are demoted. This is an important contribution because it not only contextualizes the effectiveness of LI, but crucially ex- plains why it fails. By addressing this nuance, future LI models could be significantly improved.
Resumo:
Quantum-inspired models have recently attracted increasing attention in Information Retrieval. An intriguing characteristic of the mathematical framework of quantum theory is the presence of complex numbers. However, it is unclear what such numbers could or would actually represent or mean in Information Retrieval. The goal of this paper is to discuss the role of complex numbers within the context of Information Retrieval. First, we introduce how complex numbers are used in quantum probability theory. Then, we examine van Rijsbergen’s proposal of evoking complex valued representations of informations objects. We empirically show that such a representation is unlikely to be effective in practice (confuting its usefulness in Information Retrieval). We then explore alternative proposals which may be more successful at realising the power of complex numbers.
Creation of a new evaluation benchmark for information retrieval targeting patient information needs
Resumo:
Searching for health advice on the web is becoming increasingly common. Because of the great importance of this activity for patients and clinicians and the effect that incorrect information may have on health outcomes, it is critical to present relevant and valuable information to a searcher. Previous evaluation campaigns on health information retrieval (IR) have provided benchmarks that have been widely used to improve health IR and record these improvements. However, in general these benchmarks have targeted the specialised information needs of physicians and other healthcare workers. In this paper, we describe the development of a new collection for evaluation of effectiveness in IR seeking to satisfy the health information needs of patients. Our methodology features a novel way to create statements of patients’ information needs using realistic short queries associated with patient discharge summaries, which provide details of patient disorders. We adopt a scenario where the patient then creates a query to seek information relating to these disorders. Thus, discharge summaries provide us with a means to create contextually driven search statements, since they may include details on the stage of the disease, family history etc. The collection will be used for the first time as part of the ShARe/-CLEF 2013 eHealth Evaluation Lab, which focuses on natural language processing and IR for clinical care.
Resumo:
Complex numbers are a fundamental aspect of the mathematical formalism of quantum physics. Quantum-like models developed outside physics often overlooked the role of complex numbers. Specifically, previous models in Information Retrieval (IR) ignored complex numbers. We argue that to advance the use of quantum models of IR, one has to lift the constraint of real-valued representations of the information space, and package more information within the representation by means of complex numbers. As a first attempt, we propose a complex-valued representation for IR, which explicitly uses complex valued Hilbert spaces, and thus where terms, documents and queries are represented as complex-valued vectors. The proposal consists of integrating distributional semantics evidence within the real component of a term vector; whereas, ontological information is encoded in the imaginary component. Our proposal has the merit of lifting the role of complex numbers from a computational byproduct of the model to the very mathematical texture that unifies different levels of semantic information. An empirical instantiation of our proposal is tested in the TREC Medical Record task of retrieving cohorts for clinical studies.
Resumo:
This paper presents the results of task 3 of the ShARe/CLEF eHealth Evaluation Lab 2013. This evaluation lab focuses on improving access to medical information on the web. The task objective was to investigate the effect of using additional information such as the discharge summaries and external resources such as medical ontologies on the IR effectiveness. The participants were allowed to submit up to seven runs, one mandatory run using no additional information or external resources, and three each using or not using discharge summaries.
Resumo:
Early works on Private Information Retrieval (PIR) focused on minimizing the necessary communication overhead. They seemed to achieve this goal but at the expense of query response time. To mitigate this weakness, protocols with secure coprocessors were introduced. They achieve optimal communication complexity and better online processing complexity. Unfortunately, all secure coprocessor-based PIR protocols require heavy periodical preprocessing. In this paper, we propose a new protocol, which is free from the periodical preprocessing while offering the optimal communication complexity and almost optimal online processing complexity. The proposed protocol is proven to be secure.