190 resultados para collaborative filtering

em Deakin Research Online - Australia


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Collaborative filtering is an effective recommendation technique wherein the preference of an individual can potentially be predicted based on preferences of other members. Early algorithms often relied on the strong locality in the preference data, that is, it is enough to predict preference of a user on a particular item based on a small subset of other users with similar tastes or of other items with similar properties. More recently, dimensionality reduction techniques have proved to be equally competitive, and these are based on the co-occurrence patterns rather than locality. This paper explores and extends a probabilistic model known as Boltzmann Machine for collaborative filtering tasks. It seamlessly integrates both the similarity and cooccurrence in a principled manner. In particular, we study parameterisation options to deal with the ordinal nature of the preferences, and propose a joint modelling of both the user-based and item-based processes. Experiments on moderate and large-scale movie recommendation show that our framework rivals existing well-known methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Ranking is an important task for handling a large amount of content. Ideally, training data for supervised ranking would include a complete rank of documents (or other objects such as images or videos) for a particular query. However, this is only possible for small sets of documents. In practice, one often resorts to document rating, in that a subset of documents is assigned with a small number indicating the degree of relevance. This poses a general problem of modelling and learning rank data with ties. In this paper, we propose a probabilistic generative model, that models the process as permutations over partitions. This results in super-exponential combinatorial state space with unknown numbers of partitions and unknown ordering among them. We approach the problem from the discrete choice theory, where subsets are chosen in a stagewise manner, reducing the state space per each stage significantly. Further, we show that with suitable parameterisation, we can still learn the models in linear time. We evaluate the proposed models on two application areas: (i) document ranking with the data from the recently held Yahoo! challenge, and (ii) collaborative filtering with movie data. The results demonstrate that the models are competitive against well-known rivals.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As each user tends to rate a small proportion of available items, the resulted Data Sparsity issue brings significant challenges to the research of recommender systems. This issue becomes even more severe for neighborhood-based collaborative filtering methods, as there are even lower numbers of ratings available in the neighborhood of the query item. In this paper, we aim to address the Data Sparsity issue in the context of the neighborhood-based collaborative filtering. Given the (user, item) query, a set of key ratings are identified, and an auto-adaptive imputation method is proposed to fill the missing values in the set of key ratings. The proposed method can be used with any similarity metrics, such as the Pearson Correlation Coefficient and Cosine-based similarity, and it is theoretically guaranteed to outperform the neighborhood-based collaborative filtering approaches. Results from experiments prove that the proposed method could significantly improve the accuracy of recommendations for neighborhood-based Collaborative Filtering algorithms. © 2012 ACM.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the context of collaborative filtering, the well known data sparsity issue makes two like-minded users have little similarity, and consequently renders the k nearest neighbour rule inapplicable. In this paper, we address the data sparsity problem in the neighbourhood-based CF methods by proposing an Adaptive-Maximum imputation method (AdaM). The basic idea is to identify an imputation area that can maximize the imputation benefit for recommendation purposes, while minimizing the imputation error brought in. To achieve the maximum imputation benefit, the imputation area is determined from both the user and the item perspectives; to minimize the imputation error, there is at least one real rating preserved for each item in the identified imputation area. A theoretical analysis is provided to prove that the proposed imputation method outperforms the conventional neighbourhood-based CF methods through more accurate neighbour identification. Experiment results on benchmark datasets show that the proposed method significantly outperforms the other related state-of-the-art imputation-based methods in terms of accuracy.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As a popular technique in recommender systems, Collaborative Filtering (CF) has received extensive attention in recent years. However, its privacy-related issues, especially for neighborhood-based CF methods, can not be overlooked. The aim of this study is to address the privacy issues in the context of neighborhood-based CF methods by proposing a Private Neighbor Collaborative Filtering (PNCF) algorithm. The algorithm includes two privacy-preserving operations: Private Neighbor Selection and Recommendation-Aware Sensitivity. Private Neighbor Selection is constructed on the basis of the notion of differential privacy to privately choose neighbors. Recommendation-Aware Sensitivity is introduced to enhance the performance of recommendations. Theoretical and experimental analysis are provided to show the proposed algorithm can preserve differential privacy while retaining the accuracy of recommendations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As a popular technique in recommender systems, Collaborative Filtering (CF) has been the focus of significant attention in recent years, however, its privacy-related issues, especially for the neighborhood-based CF methods, cannot be overlooked. The aim of this study is to address these privacy issues in the context of neighborhood-based CF methods by proposing a Private Neighbor Collaborative Filtering (PNCF) algorithm. This algorithm includes two privacy preserving operations: Private Neighbor Selection and Perturbation. Using the item-based method as an example, Private Neighbor Selection is constructed on the basis of the notion of differential privacy, meaning that neighbors are privately selected for the target item according to its similarities with others. Recommendation-Aware Sensitivity and a re-designed differential privacy mechanism are introduced in this operation to enhance the performance of recommendations. A Perturbation operation then hides the true ratings of selected neighbors by adding Laplace noise. The PNCF algorithm reduces the magnitude of the noise introduced from the traditional differential privacy mechanism. Moreover, a theoretical analysis is provided to show that the proposed algorithm can resist a KNN attack while retaining the accuracy of recommendations. The results from experiments on two real datasets show that the proposed PNCF algorithm can obtain a rigid privacy guarantee without high accuracy loss. © 2013 Published by Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Learning preference models from human generated data is an important task in modern information processing systems. Its popular setting consists of simple input ratings, assigned with numerical values to indicate their relevancy with respect to a specific query. Since ratings are often specified within a small range, several objects may have the same ratings, thus creating ties among objects for a given query. Dealing with this phenomena presents a general problem of modelling preferences in the presence of ties and being query-specific. To this end, we present in this paper a novel approach by constructing probabilistic models directly on the collection of objects exploiting the combinatorial structure induced by the ties among them. The proposed probabilistic setting allows exploration of a super-exponential combinatorial state-space with unknown numbers of partitions and unknown order among them. Learning and inference in such a large state-space are challenging, and yet we present in this paper efficient algorithms to perform these tasks. Our approach exploits discrete choice theory, imposing generative process such that the finite set of objects is partitioned into subsets in a stagewise procedure, and thus reducing the state-space at each stage significantly. Efficient Markov chain Monte Carlo algorithms are then presented for the proposed models. We demonstrate that the model can potentially be trained in a large-scale setting of hundreds of thousands objects using an ordinary computer. In fact, in some special cases with appropriate model specification, our models can be learned in linear time. We evaluate the models on two application areas: (i) document ranking with the data from the Yahoo! challenge and (ii) collaborative filtering with movie data. We demonstrate that the models are competitive against state-of-the-arts.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Privacy preserving is an essential aspect of modern recommender systems. However, the traditional approaches can hardly provide a rigid and provable privacy guarantee for recommender systems, especially for those systems based on collaborative filtering (CF) methods. Recent research revealed that by observing the public output of the CF, the adversary could infer the historical ratings of the particular user, which is known as the KNN attack and is considered a serious privacy violation for recommender systems. This paper addresses the privacy issue in CF by proposing a Private Neighbor Collaborative Filtering (PriCF) algorithm, which is constructed on the basis of the notion of differential privacy. PriCF contains an essential privacy operation, Private Neighbor Selection, in which the Laplace noise is added to hide the identity of neighbors and the ratings of each neighbor. To retain the utility, the Recommendation-Aware Sensitivity and a re-designed truncated similarity are introduced to enhance the performance of recommendations. A theoretical analysis shows that the proposed algorithm can resist the KNN attack while retaining the accuracy of recommendations. The experimental results on two real datasets show that the proposed PriCF algorithm retains most of the utility with a fixed privacy budget.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recommender systems play a central role in providing individualized access to information and services. This paper focuses on collaborative filtering, an approach that exploits the shared structure among mind-liked users and similar items. In particular, we focus on a formal probabilistic framework known as Markov random fields (MRF). We address the open problem of structure learning and introduce a sparsity-inducing algorithm to automatically estimate the interaction structures between users and between items. Item-item and user-user correlation networks are obtained as a by-product. Large-scale experiments on movie recommendation and date matching datasets demonstrate the power of the proposed method.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Ranking over sets arise when users choose between groups of items. For example, a group may be of those movies deemed 5 stars to them, or a customized tour package. It turns out, to model this data type properly, we need to investigate the general combinatorics problem of partitioning a set and ordering the subsets. Here we construct a probabilistic log-linear model over a set of ordered subsets. Inference in this combinatorial space is highly challenging: The space size approaches (N!/2)6.93145N+1 as N approaches infinity. We propose a split-and-merge Metropolis-Hastings procedure that can explore the state-space efficiently. For discovering hidden aspects in the data, we enrich the model with latent binary variables so that the posteriors can be efficiently evaluated. Finally, we evaluate the proposed model on large-scale collaborative filtering tasks and demonstrate that it is competitive against state-of-the-art methods.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Recommender systems are important to help users select relevant and personalised information over massive amounts of data available. We propose an unified framework called Preference Network (PN) that jointly models various types of domain knowledge for the task of recommendation. The PN is a probabilistic model that systematically combines both content-based filtering and collaborative filtering into a single conditional Markov random field. Once estimated, it serves as a probabilistic database that supports various useful queries such as rating prediction and top-N recommendation. To handle the challenging problem of learning large networks of users and items, we employ a simple but effective pseudo-likelihood with regularisation. Experiments on the movie rating data demonstrate the merits of the PN.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Ordinal data is omnipresent in almost all multiuser-generated feedback - questionnaires, preferences etc. This paper investigates modelling of ordinal data with Gaussian restricted Boltzmann machines (RBMs). In particular, we present the model architecture, learning and inference procedures for both vector-variate and matrix-variate ordinal data. We show that our model is able to capture latent opinion profile of citizens around the world, and is competitive against state-of-art collaborative filtering techniques on large-scale public datasets. The model thus has the potential to extend application of RBMs to diverse domains such as recommendation systems, product reviews and expert assessments.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This thesis focuses on the data sparsity issue and the temporal dynamic issue in the context of collaborative filtering, and addresses them with imputation techniques, low-rank subspace techniques and optimizations techniques from the machine learning perspective. A comprehensive survey on the development of collaborative filtering techniques is also included.