996 results for Relevance feedback


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Clearly identifying the boundary between positive and negative streams is a big challenge. Several attempts have used negative feedback to address it; however, using negative relevance feedback to improve the effectiveness of information filtering raises two issues. The first is how to select constructive negative samples in order to reduce the space of negative documents. The second is how to decide which noisy extracted features should be updated based on the selected negative samples. This paper proposes a pattern-mining-based approach that selects some offenders from the negative documents, where an offender can be used to reduce the side effects of noisy features. It also classifies extracted features (i.e., terms) into three categories: positive specific terms, general terms, and negative specific terms. In this way, multiple revising strategies can be used to update extracted features. An iterative learning algorithm is also proposed to implement this approach on RCV1, and substantial experiments show that the proposed approach achieves encouraging performance.
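The three term categories named in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual classification rule, so the rule below is an assumption: a term is "positive specific" if it occurs only in positive documents, "negative specific" if it occurs only in offender documents, and "general" if it occurs in both. The function name `classify_terms` and the toy documents are hypothetical.

```python
def classify_terms(positive_docs, offender_docs):
    """Group terms by where they occur (assumed rule, not the paper's).

    Each document is an iterable of tokens. A term seen only in positive
    documents is 'positive_specific', a term seen only in offenders is
    'negative_specific', and a term seen in both is 'general'.
    """
    pos_terms = set().union(*positive_docs)
    neg_terms = set().union(*offender_docs)
    return {
        "positive_specific": pos_terms - neg_terms,
        "general": pos_terms & neg_terms,
        "negative_specific": neg_terms - pos_terms,
    }


# Toy data, invented for illustration only.
positive_docs = [["oil", "price", "market"], ["oil", "export"]]
offender_docs = [["market", "stock"], ["price", "stock", "bank"]]

groups = classify_terms(positive_docs, offender_docs)
for name, terms in groups.items():
    print(name, sorted(terms))
```

Under this assumed rule, each group could then receive its own revising strategy, for example raising the weights of positive specific terms and lowering those of negative specific terms.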

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Over the years, people have often held the hypothesis that negative feedback should be very useful for improving the performance of information filtering systems; however, effective models to support this hypothesis have not yet been obtained. This paper proposes an effective model that uses negative relevance feedback, based on a pattern mining approach, to improve extracted features. This study focuses on two main issues of using negative relevance feedback: the selection of constructive negative examples to reduce the space of negative examples, and the revision of existing features based on the selected negative examples. The former selects some offender documents, where offender documents are negative documents that are most likely to be classified in the positive group. The latter groups the extracted features into three categories, the positive specific category, the general category and the negative specific category, so that their weights can be updated easily. An iterative algorithm is also proposed to implement this approach on RCV1 data collections, and substantial experiments show that the proposed approach achieves encouraging performance.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The INEX 2010 Focused Relevance Feedback track offered a refined approach to the evaluation of Focused Relevance Feedback algorithms through simulated exhaustive user feedback. As in traditional approaches, we simulated a user-in-the-loop by re-using the assessments of ad-hoc retrieval obtained from real users who assess focused ad-hoc retrieval submissions. The evaluation was extended in several ways: the use of exhaustive relevance feedback over entire runs; the evaluation of focused retrieval where both the retrieval results and the feedback are focused; the evaluation was performed over a closed set of documents and complete focused assessments; the evaluation was performed over executable implementations of relevance feedback algorithms; and finally, the entire evaluation platform is reusable. We present the evaluation methodology, its implementation, and experimental results obtained for nine submissions from three participating organisations.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Retrieving information from Twitter is always challenging due to its large volume, inconsistent writing and noise. Most existing information retrieval (IR) and text mining methods take a term-based approach, but they suffer from problems of term variation such as polysemy and synonymy. These problems worsen when such methods are applied to Twitter because of the length limit. Over the years, people have held the hypothesis that pattern-based methods should perform better than term-based methods because they provide more context, but few studies have been conducted to support this hypothesis, especially on Twitter. This paper presents an innovative framework to address the issue of performing IR in microblogs. The proposed framework discovers patterns in tweets as higher-level features and assigns weights to low-level features (i.e., terms) based on their distributions in the higher-level features. We present experimental results based on the TREC11 microblog dataset and show that our proposed approach significantly outperforms the term-based methods Okapi BM25 and TF-IDF, as well as pattern-based methods, on precision, recall and F measures.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The INEX 2011 Relevance Feedback track offered a refined approach to the evaluation of Focused Relevance Feedback algorithms through simulated exhaustive user feedback. Run in largely identical fashion to the Relevance Feedback track in INEX 2010 [2], we simulated a user-in-the-loop by re-using the assessments of ad-hoc retrieval obtained from real users who assess focused ad-hoc retrieval submissions. We present the evaluation methodology, its implementation, and experimental results obtained for four submissions from two participating organisations. As the task and evaluation methods did not change between INEX 2010 and now, explanations of these details from the INEX 2010 version of the track have been repeated verbatim where appropriate.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In people-to-people matching systems, filtering is widely applied to find the most suitable matches. The results returned are either too many when the search is generic or too few when it is specific, so a more sophisticated recommendation approach becomes necessary. Traditionally, the object of recommendation is an inanimate item. In online dating systems, reciprocal recommendation is required: a partner should be suggested only when both the user and the recommended candidate are satisfied. In this paper, an innovative reciprocal collaborative method is developed based on the idea of similarity and common neighbors, utilizing the information of relevance feedback and feature importance. Extensive experiments are carried out using data gathered from a real online dating service. Compared to benchmarking methods, our results show that the proposed method can achieve noticeably better performance.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In a pilot application based on a web search engine, called Web-based Relation Completion (WebRC), we propose to join two columns of entities linked by a predefined relation by mining knowledge from the web through a web search engine. To achieve this, a novel retrieval task, Relation Query Expansion (RelQE), is modelled: given an entity (query), the task is to retrieve documents containing entities in a predefined relation to the given one. Solving this problem entails expanding the query before submitting it to a web search engine, to ensure that mostly documents containing the linked entity are returned in the top K search results. In this paper, we propose a novel Learning-based Relevance Feedback (LRF) approach to solve this retrieval task. Expansion terms are learned from training pairs of entities linked by the predefined relation and applied to new entity-queries to find entities linked by the same relation. After describing the approach, we present experimental results on real-world web data collections, which show that the LRF approach always improves the precision of top-ranked search results, up to 8.6 times the baseline. Using LRF, WebRC also performs well above the baseline.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Multimedia mining primarily involves information analysis and retrieval based on implicit knowledge. The ever-increasing number of digital image databases on the Internet has created a need for using multimedia mining on these databases for effective and efficient retrieval of images. The contents of an image can be expressed in different features such as Shape, Texture and Intensity-distribution (STI). Content Based Image Retrieval (CBIR) is the efficient retrieval of relevant images from large databases based on features extracted from the image. Most existing systems concentrate either on a single representation of all features or on a linear combination of these features. The paper proposes a CBIR system named STIRF (Shape, Texture, Intensity-distribution with Relevance Feedback) that uses a neural network for nonlinear combination of the heterogeneous STI features. Further, the system is self-adaptable to different applications and users based upon relevance feedback. Prior to retrieval of relevant images, each feature is first clustered independently of the others in its own space, which helps in matching similar images. Testing the system on a database of images with varied contents and intensive backgrounds showed good results, with the most relevant images being retrieved for an image query. The system showed better and more robust performance compared to existing CBIR systems.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Knowledge Systems Institute Graduate School

Relevance:

100.00% 100.00%

Publisher:

Abstract:

ImageRover is a search-by-image-content navigation tool for the World Wide Web. The staggering size of the WWW dictates certain strategies and algorithms for image collection, digestion, indexing, and user interface. This paper describes two key components of the ImageRover strategy: image digestion and relevance feedback. Image digestion occurs during image collection; robots digest the images they find, computing image decompositions and indices, and storing this extracted information in vector form for searches based on image content. Relevance feedback occurs during index search; users can iteratively guide the search through the selection of relevant examples. ImageRover employs a novel relevance feedback algorithm to determine the weighted combination of image similarity metrics appropriate for a particular query. ImageRover is available and running on the web site.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Various relevance feedback techniques have been applied in Content-Based Image Retrieval (CBIR). By using relevance feedback, CBIR allows the user to progressively refine the system's response to a query. In this paper, after analyzing the feature distributions of positive and negative feedbacks, a new parameter adjustment method for iteratively improving the query vector and adjusting the weights is proposed. Experimental results demonstrate the effectiveness of this method.
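Iterative refinement of a query vector from positive and negative feedback can be illustrated with the classic Rocchio update. This is a standard textbook technique, swapped in here for illustration, and not the parameter adjustment method this abstract proposes. The function name `rocchio_update`, the default weights, and the toy vectors are assumptions.

```python
def rocchio_update(query, positives, negatives, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a refined query vector using the classic Rocchio formula.

    new_q = alpha * q + beta * mean(positives) - gamma * mean(negatives)

    All vectors are equal-length lists of floats. Negative components are
    clipped to zero, a common convention (an assumption, not a requirement).
    """
    dim = len(query)

    def centroid(vectors):
        # Mean vector of a feedback set; zero vector if the set is empty.
        if not vectors:
            return [0.0] * dim
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

    pos_c = centroid(positives)
    neg_c = centroid(negatives)
    return [
        max(0.0, alpha * q + beta * p - gamma * n)
        for q, p, n in zip(query, pos_c, neg_c)
    ]


# Toy two-dimensional feature vectors, invented for illustration only.
query = [1.0, 0.0]
positives = [[0.0, 1.0], [0.0, 1.0]]
negatives = [[1.0, 0.0]]

for round_number in range(3):
    query = rocchio_update(query, positives, negatives)
    print(round_number, query)
```

Each round moves the query toward the centroid of the images marked relevant and away from the centroid of those marked non-relevant; in a real system the feedback sets would change between rounds as the user judges new results.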

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We propose a novel re-ranking method for content-based medical image retrieval based on the idea of pseudo-relevance feedback (PRF). Since the highest ranked images in original retrieval results are not always relevant, a naive PRF based re-ranking approach is not capable of producing a satisfactory result. We employ a two-step approach to address this issue. In step 1, a Pearson's correlation coefficient based similarity update method is used to re-rank the high ranked images. In step 2, after estimating a relevance probability for each of the highest ranked images, a fuzzy SVM ensemble based approach is adopted to re-rank the images. The experiments demonstrate that the proposed method outperforms two other re-ranking methods.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We review pseudo-relevance feedback as a mechanism for expanding short texts. Where short texts exhibit evolving concepts, topics and other characteristics, Web-based feedback systems were touted as the ideal way of enriching the feature space of short texts. However, we note from a recent implementation of Web-based pseudo-relevance feedback that it would only perform well under clinical situations. Further work to address fundamental noise in Web documents did not yield significant improvements, leading us to conclude that relevance feedback using Web documents directly is unsuitable for real-world conditions. In this paper, we present Eddi, a recent system that provides an exemplar of a typical pseudo-relevance feedback system. We first show the conditions in which Eddi will work and then discuss the situations where it would fail. We then present the variations to Eddi from our attempt to improve the robustness of Eddi's algorithm when dealing with complex Web documents. Finally, we present the results from all variations to show the lack of robustness of pseudo-relevance feedback with Web documents.