937 results for Information Retrieval


Relevance:

70.00%

Publisher:

Abstract:

Clearly identifying the boundary between positive and negative streams is a major challenge for information filtering systems. Several attempts have used negative feedback to address this challenge; however, two issues arise when using negative relevance feedback to improve the effectiveness of information filtering. The first is how to select constructive negative samples in order to reduce the space of negative documents. The second is how to decide which noisy extracted features should be updated based on the selected negative samples. This paper proposes a pattern-mining-based approach that selects some offenders from the negative documents, where an offender can be used to reduce the side effects of noisy features. It also classifies extracted features (i.e., terms) into three categories: positive specific terms, general terms, and negative specific terms. In this way, multiple revising strategies can be used to update extracted features. An iterative learning algorithm is also proposed to implement this approach on the RCV1 data collection, and substantial experiments show that the proposed approach achieves encouraging performance that is also consistent for adaptive filtering.
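The three term categories named in the abstract can be illustrated with a toy rule. This is only a sketch under a strong simplifying assumption: a term is labelled by plain set membership in the positive and negative documents, which is not the pattern-mining procedure the paper proposes. The function name `classify_terms` and the sample documents are hypothetical.

```python
from typing import Dict, List


def classify_terms(positive_docs: List[List[str]],
                   negative_docs: List[List[str]]) -> Dict[str, str]:
    """Label each term by where it occurs (toy rule, not the paper's method)."""
    pos_terms = {term for doc in positive_docs for term in doc}
    neg_terms = {term for doc in negative_docs for term in doc}
    labels: Dict[str, str] = {}
    for term in pos_terms | neg_terms:
        if term in pos_terms and term in neg_terms:
            labels[term] = "general"            # seen on both sides
        elif term in pos_terms:
            labels[term] = "positive_specific"  # seen only in relevant docs
        else:
            labels[term] = "negative_specific"  # seen only in non-relevant docs
    return labels


# Hypothetical tokenized documents.
labels = classify_terms(
    positive_docs=[["oil", "price"], ["oil", "export"]],
    negative_docs=[["price", "football"]],
)
```

With labels like these, a filtering system could apply a different weight-revision rule to each category, which is the role the abstract assigns to its "multiple revising strategies".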

Relevance:

70.00%

Publisher:

Abstract:

As a model for knowledge description and formalization, ontologies are widely used to represent user profiles in personalized web information gathering. However, when representing user profiles, many models have utilized knowledge from only a global knowledge base or only a user’s local information. In this paper, a personalized ontology model is proposed for knowledge representation and reasoning over user profiles. This model learns ontological user profiles from both a world knowledge base and user local instance repositories. The ontology model is evaluated by comparing it against benchmark models in web information gathering. The results show that this ontology model is successful.

Relevance:

70.00%

Publisher:

Abstract:

Mobile phones are now powerful and pervasive, making them ideal information browsers. The Internet has revolutionized our lives and is a major knowledge-sharing medium. However, many mobile phone users cannot access the Internet (for financial or technical reasons), and so the mobile Internet has not been fully realized. We propose a novel content delivery network based on both a factual and a speculative analysis of today’s technology, and we analyze its feasibility. If adopted, it would allow people living in remote regions without Internet access to obtain essential (static) information with periodic updates.

Relevance:

70.00%

Publisher:

Abstract:

Collaborative question answering (cQA) portals such as Yahoo! Answers allow users, as askers or answer authors, to communicate and exchange information by asking and answering questions in the network. In their current set-up, answers to a question are arranged in chronological order. For effective information retrieval, it would be advantageous to have the users’ answers ranked according to their quality. This paper proposes a novel approach for evaluating and ranking the users’ answers and recommending the top-n quality answers to information seekers. The proposed approach is based on a user-reputation method that assigns each answer a score reflecting its author’s reputation level in the network. The proposed approach is evaluated on a dataset collected from a live cQA portal, namely Yahoo! Answers. To compare against the results obtained by the non-content-based user-reputation method, experiments were also conducted with several content-based methods that assign each answer a score reflecting its content quality. Various combinations of non-content-based and content-based scores were also compared. Empirical analysis shows that the proposed method is able to rank the users’ answers and recommend the top-n answers with good accuracy. The proposed method outperforms the content-based methods, the various combinations, and the popular link analysis method HITS.
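For readers unfamiliar with the HITS baseline named in the abstract, the following is a minimal sketch of the standard hub/authority iteration. It does not show the paper's user-reputation method. It assumes a graph in which each asker points to the users who answered them; the user names, answer ids, and the function name `hits` are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Tuple


def hits(edges: List[Tuple[str, str]],
         iterations: int = 50) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (hub, authority) scores for a directed graph of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    authority = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of nodes pointing at this node.
        authority = {node: 0.0 for node in nodes}
        for source, target in edges:
            authority[target] += hub[source]
        norm = sqrt(sum(v * v for v in authority.values())) or 1.0
        authority = {node: v / norm for node, v in authority.items()}
        # Hub update: sum the authority scores of nodes this node points at.
        hub = {node: 0.0 for node in nodes}
        for source, target in edges:
            hub[source] += authority[target]
        norm = sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {node: v / norm for node, v in hub.items()}
    return hub, authority


# Hypothetical asker -> answerer edges.
edges = [("ann", "bob"), ("carl", "bob"), ("ann", "dee")]
hub, authority = hits(edges)

# Rank hypothetical (answer id, author) pairs by the author's authority score.
answers = [("a2", "dee"), ("a1", "bob")]
ranked = sorted(answers, key=lambda answer: authority[answer[1]], reverse=True)
```

Here `bob` is answered-to by two askers and so receives the highest authority score, which places his answer first in `ranked`.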

Relevance:

70.00%

Publisher:

Abstract:

Purpose: The purpose of this paper is to examine post-graduate health promotion students’ self-perceptions of information literacy skills before and after completing PILOT, an online information literacy tutorial. Design/methodology/approach: Post-graduate students at Queensland University of Technology enrolled in PUP038 New Developments in Health Promotion completed a pre- and post-tutorial self-assessment questionnaire. From 2008 to 2011, students were required to rate their academic writing and research skills before and after completing the PILOT online information literacy tutorial. Quantitative trends and qualitative themes were analysed to establish students’ self-assessment and the effectiveness of the PILOT tutorial. Findings: The results from four years of post-graduate students’ self-assessment questionnaires provide evidence of perceived improvements in information literacy skills after completing PILOT. Some students continued to have trouble with locating quality information and with analysis, as well as with issues surrounding referencing and plagiarism. Feedback was generally positive, and students’ responses indicated they found the tutorial highly beneficial in improving their research skills. Originality/value: This paper is original because it describes post-graduate health promotion students’ self-assessment of information literacy skills over a period of four years. The literature on the health promotion domain and on self-assessment of post-graduate students’ information literacy skills is limited. Keywords: Self-assessment, Post-graduate, Information literacy, Library instruction, Higher education, Health promotion, Evidence-based practice. Paper type: Research paper.

Relevance:

70.00%

Publisher:

Abstract:

This paper addresses the issue of analogical inference, and its potential role as the mediator of new therapeutic discoveries, by using disjunction operators based on quantum connectives to combine many potential reasoning pathways into a single search expression. In it, we extend our previous work in which we developed an approach to analogical retrieval using the Predication-based Semantic Indexing (PSI) model, which encodes both concepts and the relationships between them in high-dimensional vector space. As in our previous work, we leverage the ability of PSI to infer predicate pathways connecting two example concepts, in this case comprising known therapeutic relationships. For example, given that drug x TREATS disease z, we might infer the predicate pathway drug x INTERACTS WITH gene y ASSOCIATED WITH disease z, and use this pathway to search for drugs related to another disease in similar ways. As biological systems tend to be characterized by networks of relationships, we evaluate the ability of quantum-inspired operators to mediate inference and retrieval across multiple relations, by testing the ability of different approaches to recover known therapeutic relationships. In addition, we introduce a novel complex-vector-based implementation of PSI, based on Plate’s Circular Holographic Reduced Representations, which we utilize for all experiments in addition to the binary-vector-based approach we have applied in our previous research.
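The complex-vector representation mentioned in the abstract can be sketched in a few lines. This is an illustrative sketch and not the authors' implementation. It assumes the usual circular holographic reduced representation, in which each vector component is a unit-length complex number, binding is component-wise multiplication, and releasing is multiplication by the complex conjugate. The encoding of a drug as a superposition of bound predicate-object pairs, the dimension, and all names are assumptions made for illustration.

```python
import cmath
import math
import random
from typing import List

Vector = List[complex]


def random_vector(dim: int, rng: random.Random) -> Vector:
    """Elemental vector: each component is a unit phasor with a random phase."""
    return [cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(dim)]


def bind(a: Vector, b: Vector) -> Vector:
    """Bind two vectors by component-wise multiplication (phases add)."""
    return [x * y for x, y in zip(a, b)]


def release(bound: Vector, key: Vector) -> Vector:
    """Undo a binding by multiplying with the conjugate of the key."""
    return [x * y.conjugate() for x, y in zip(bound, key)]


def superpose(vectors: List[Vector]) -> Vector:
    """Add vectors component-wise, then renormalize each component to unit length."""
    sums = [sum(components) for components in zip(*vectors)]
    return [s / abs(s) if abs(s) > 1e-12 else 1 + 0j for s in sums]


def similarity(a: Vector, b: Vector) -> float:
    """Mean cosine of the phase differences: 1.0 if identical, near 0 if unrelated."""
    return sum((x * y.conjugate()).real for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
dim = 1024  # assumed dimension, chosen only for this example
interacts_with, treats = random_vector(dim, rng), random_vector(dim, rng)
gene_y, disease_z = random_vector(dim, rng), random_vector(dim, rng)

# Hypothetical encoding: drug x INTERACTS_WITH gene y, and drug x TREATS disease z.
drug_x = superpose([bind(interacts_with, gene_y), bind(treats, disease_z)])

# Releasing the INTERACTS_WITH predicate should recover something close to gene y.
probe = release(drug_x, interacts_with)
sim_gene = similarity(probe, gene_y)
sim_disease = similarity(probe, disease_z)
```

In this sketch `sim_gene` is well above `sim_disease`, which is the property that lets a bound representation be queried by predicate.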

Relevance:

70.00%

Publisher:

Abstract:

This paper outlines a novel approach for modelling semantic relationships within medical documents. Medical terminologies contain a rich source of semantic information critical to a number of techniques in medical informatics, including medical information retrieval. Recent research suggests that corpus-driven approaches are effective at automatically capturing semantic similarities between medical concepts, thus making them an attractive option for accessing semantic information. Most previous corpus-driven methods only considered syntagmatic associations. In this paper, we adapt a recent approach that explicitly models both syntagmatic and paradigmatic associations. We show that the implicit similarity between certain medical concepts can only be modelled using paradigmatic associations. In addition, the inclusion of both types of associations overcomes the sensitivity to the training corpus experienced by previous approaches, making our method both more effective and more robust. This finding may have implications for researchers in the area of medical information retrieval.
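The distinction between the two association types can be shown with a toy example. This is a sketch under simple assumptions, not the model adapted in the paper: syntagmatic association is approximated by a same-sentence co-occurrence count, and paradigmatic association by the cosine between two words' co-occurrence profiles. The sentences and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt
from typing import Dict, List


def cooccurrence(sentences: List[str]) -> Dict[str, Counter]:
    """Count, for every word, how often each other word shares a sentence with it."""
    counts: Dict[str, Counter] = {}
    for sentence in sentences:
        words = set(sentence.split())
        for word in words:
            counts.setdefault(word, Counter())
        for a, b in combinations(sorted(words), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def syntagmatic(counts: Dict[str, Counter], a: str, b: str) -> int:
    """Direct association: how often a and b occur together."""
    return counts[a][b]


def paradigmatic(counts: Dict[str, Counter], a: str, b: str) -> float:
    """Indirect association: cosine similarity of the contexts of a and b."""
    dot = sum(value * counts[b][key] for key, value in counts[a].items())
    norm_a = sqrt(sum(v * v for v in counts[a].values()))
    norm_b = sqrt(sum(v * v for v in counts[b].values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical two-sentence corpus.
counts = cooccurrence(["aspirin treats headache", "ibuprofen treats headache"])
direct = syntagmatic(counts, "aspirin", "ibuprofen")
indirect = paradigmatic(counts, "aspirin", "ibuprofen")
```

The two drug names never co-occur, so `direct` is zero, yet they share the same contexts and `indirect` is high. A similarity of this kind is invisible to a purely syntagmatic measure.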

Relevance:

70.00%

Publisher:

Abstract:

The intersection of the Social Web and the Semantic Web has put folksonomy in the spotlight for its potential to overcome the knowledge acquisition bottleneck and to provide insight into the "wisdom of the crowds". Folksonomy, which results from collaborative tagging activities, has provided insight into users’ understanding of Web resources, which may be useful for searching and organizing purposes. However, collaborative tagging vocabulary poses some challenges, since tags are freely chosen by users and may exhibit synonymy and polysemy problems. In order to overcome these challenges and boost the potential of folksonomy as emergent semantics, we propose to consolidate the diverse vocabulary into consolidated entities and concepts. We propose to extract a tag ontology through an ontology learning process to represent the semantics of a tagging community. This paper presents a novel approach to learning the ontology based on the widely used lexical database WordNet. We present personalization strategies to disambiguate the semantics of tags by combining the opinion of WordNet lexicographers with users’ tagging behavior. We provide empirical evaluations by using the semantic information contained in the ontology in a tag recommendation experiment. The results show that using the semantic relationships in the ontology improves the accuracy of the tag recommender.

Relevance:

70.00%

Publisher:

Abstract:

Retrieving information from Twitter is always challenging due to its large volume, inconsistent writing and noise. Most existing information retrieval (IR) and text mining methods focus on term-based approaches, but these suffer from problems of term variation such as polysemy and synonymy. These problems worsen when such methods are applied to Twitter, due to the length limit. Over the years, people have held the hypothesis that pattern-based methods should perform better than term-based methods because they provide more context, but few studies have been conducted to support this hypothesis, especially on Twitter. This paper presents an innovative framework to address the issue of performing IR in microblogs. The proposed framework discovers patterns in tweets as higher-level features and uses them to assign weights to low-level features (i.e. terms) based on their distributions in the higher-level features. We present experimental results based on the TREC11 microblog dataset and show that our proposed approach significantly outperforms the term-based methods Okapi BM25 and TF-IDF, as well as pattern-based methods, as measured by precision, recall and F-measure.
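As a point of reference for the term-based baseline named in the abstract, here is a minimal sketch of Okapi BM25 scoring. It is not the proposed pattern-based framework. It assumes the common smoothed IDF variant and the widely used defaults `k1 = 1.2` and `b = 0.75`; the tokenized tweets and the query are hypothetical.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: List[str], docs: List[List[str]],
                k1: float = 1.2, b: float = 0.75) -> List[float]:
    """Score every tokenized document against the query with Okapi BM25."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in query:
            if term not in term_freq:
                continue
            # Smoothed IDF, which stays non-negative even for frequent terms.
            idf = math.log((n_docs - doc_freq[term] + 0.5)
                           / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / (term_freq[term] + length_norm)
        scores.append(score)
    return scores


# Hypothetical tokenized tweets and query.
tweets = [
    ["flood", "warning", "brisbane"],
    ["football", "final", "tonight"],
    ["brisbane", "flood", "flood", "update"],
]
scores = bm25_scores(["flood", "brisbane"], tweets)
```

Because every score is built from individual term matches, a tweet that expresses the same topic with different words scores zero, which is the term-variation weakness the abstract attributes to term-based methods.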