28 resultados para Query expansion, Text mining, Information retrieval, Chinese IR
Resumo:
The main aim of the proposed approach presented in this paper is to improve Web information retrieval effectiveness by overcoming the problems associated with a typical keyword matching retrieval system, through the use of concepts and an intelligent fusion of confidence values. By exploiting the conceptual hierarchy of the WordNet (G. Miller, 1995) knowledge base, we show how to effectively encode the conceptual information in a document using the semantic information implied by the words that appear within it. Rather than treating a word as a string made up of a sequence of characters, we consider a word to represent a concept.
Resumo:
Formal Concept Analysis is an unsupervised machine learning technique that has successfully been applied to document organisation by considering documents as objects and keywords as attributes. The basic algorithms of Formal Concept Analysis then allow an intelligent information retrieval system to cluster documents according to keyword views. This paper investigates the scalability of this idea. In particular we present the results of applying spatial data structures to large datasets in formal concept analysis. Our experiments are motivated by the application of the Formal Concept Analysis idea of a virtual filesystem [11,17,15]. In particular the libferris [1] Semantic File System. This paper presents customizations to an RD-Tree Generalized Index Search Tree based index structure to better support the application of Formal Concept Analysis to large data sources.
Resumo:
This paper reports the introduction of an evidence-based medicine fellowship in a children’s teaching hospital. The results are presented of a self-reported ‘evidence-based medicine’ questionnaire, the clinical questions requested through the information retrieval service are outlined and the results of an information retrieval service user questionnaire are reported. It was confirmed that clinicians have frequent clinical questions that mostly remain unanswered. The responses to four questions with ‘good quality’ evidence-based answers were reviewed and suggest that at least one-quarter of doctors were not aware of the current best available evidence. There was a high level of satisfaction with the information retrieval service; 19% of users indicated that the information changed their clinical practice and 73% indicated that the information confirmed their clinical practice. The introduction of an evidence-based medicine fellowship is one method of disseminating the practice of evidence-based medicine in a tertiary children’s hospital.
Resumo:
This paper examines the effects of information request ambiguity and construct incongruence on end user's ability to develop SQL queries with an interactive relational database query language. In this experiment, ambiguity in information requests adversely affected accuracy and efficiency. Incongruities among the information request, the query syntax, and the data representation adversely affected accuracy, efficiency, and confidence. The results for ambiguity suggest that organizations might elicit better query development if end users were sensitized to the nature of ambiguities that could arise in their business contexts. End users could translate natural language queries into pseudo-SQL that could be examined for precision before the queries were developed. The results for incongruence suggest that better query development might ensue if semantic distances could be reduced by giving users data representations and database views that maximize construct congruence for the kinds of queries in typical domains. (C) 2001 Elsevier Science B.V. All rights reserved.
Resumo:
One of the goals of the ARC funded Eresearch project called Sharing access and analytical tools for ethnographic digital media using high speed networks, or simply EthnoER is to take outputs of normal linguistic analytical processes and present them online in a system we have called the EthnoER online presentation and annotation system, or EOPAS.
Resumo:
Data mining is the process to identify valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases, (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, seme-structured, or unstructured. Data can be in text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal data with large volume, distributed, time variant, noisy, and high dimensionality. A large number of data mining algorithms have been developed for different applications. For example, association rules mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in the data mining, particularly for data mining applications into engineering fields. Together with regression, classification is mainly for predictive modelling. So far, there have been a number of classification algorithms in practice. According to (Sebastiani, 2002), the main classification algorithms can be categorized as: decision tree and rule based approach such as C4.5 (Quinlan, 1996); probability methods such as Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten 2001), neural networks methods (Rumelhart, Hinton & Wiliams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973), and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al, 1998) and Ensemble Classification (Tumer, 1996).
Resumo:
Most Internet search engines are keyword-based. They are not efficient for the queries where geographical location is important, such as finding hotels within an area or close to a place of interest. A natural interface for spatial searching is a map, which can be used not only to display locations of search results but also to assist forming search conditions. A map-based search engine requires a well-designed visual interface that is intuitive to use yet flexible and expressive enough to support various types of spatial queries as well as aspatial queries. Similar to hyperlinks for text and images in an HTML page, spatial objects in a map should support hyperlinks. Such an interface needs to be scalable with the size of the geographical regions and the number of websites it covers. In spite of handling typically a very large amount of spatial data, a map-based search interface should meet the expectation of fast response time for interactive applications. In this paper we discuss general requirements and the design for a new map-based web search interface, focusing on integration with the WWW and visual spatial query interface. A number of current and future research issues are discussed, and a prototype for the University of Queensland is presented. (C) 2001 Published by Elsevier Science Ltd.