995 results for collaborative filtering


Relevance: 60.00%

Abstract:

Topic recommendation can help users deal with information overload in micro-blogging communities. This paper proposes to use the implicit information network formed by the multiple relationships among users, topics and micro-blogs, together with the temporal information of micro-blogs, to find topics that are semantically and temporally relevant to each topic and to profile users' time-drifting topic interests. Content-based, nearest-neighborhood-based and matrix factorization models are used to make personalized recommendations. The effectiveness of the proposed approaches is demonstrated in experiments conducted on a real-world dataset collected from Twitter.com.

Relevance: 60.00%

Abstract:

Tag recommendation is the task of recommending metadata (tags) for a web resource (item) during the user annotation process. In this context, the sparsity problem refers to the situation where tags need to be produced for items with few annotations or for users who have tagged few items. Most state-of-the-art approaches to tag recommendation are rarely evaluated in this situation or perform poorly in it. This paper presents a combined method for mitigating the sparsity problem in tag recommendation, mainly by expanding and ranking candidate tags based on the tags of similar items and an existing tag ontology. We evaluated the approach on two public social bookmarking datasets. The experimental results show better recommendation accuracy under sparsity than several state-of-the-art methods.

Relevance: 60.00%

Abstract:

Online dating websites enable a specific form of social networking, and their efficiency can be increased by supporting proactive recommendations based on participants' preferences with the use of data mining. This research develops two-way methods for people-to-people recommendation in large online social networks such as online dating networks. It identifies the characteristics of online dating networks and uses them to develop efficient people-to-people recommendation methods. The methods developed improve recommendation accuracy, handle the data sparsity that often comes with large data sets, and scale to online networks with a large number of users.

Relevance: 60.00%

Abstract:

Most recommender systems attempt to use collaborative filtering, content-based filtering or a hybrid approach to recommend items to new users. Collaborative filtering recommends items to new users based on their similar neighbours, while content-based filtering tries to recommend items that are similar to new users' profiles. The fundamental issues include how to profile new users and how to deal with over-specialization in content-based recommender systems. The terms used to describe items can be organized into a concept hierarchy, so we aim to describe user profiles or information needs by using concept vectors. This paper presents a new method to acquire user information needs, which allows new users to describe their preferences on a concept hierarchy rather than by rating items. It also develops a new ranking function to recommend items to new users based on their information needs. The proposed approach is evaluated on Amazon book datasets. The experimental results demonstrate that the proposed approach can largely improve the effectiveness of recommender systems.

Relevance: 60.00%

Abstract:

Twitter is a very popular social network website that allows users to publish short posts called tweets. Users on Twitter can follow other users, called followees. A user can see the posts of his followees on his Twitter home page. As the number of followees grows, so does the number of tweets on the user's page, which creates an information overload problem. Twitter, like other social network websites, attempts to elevate the tweets the user is expected to be interested in, to increase overall user engagement. However, Twitter still ranks tweets in chronological order. The tweet ranking problem has been addressed in many recent studies. A sub-problem is to rank the tweets of a single followee. In this paper we represent the tweets using several features and then propose to use a weighted version of the well-known Borda-Count (BC) voting system to combine several ranked lists into one. A gradient descent method and a collaborative filtering method are employed to learn the optimal weights. We also employ the Baldwin voting system for blending features (or predictors). Finally, we use a greedy feature selection algorithm to select the combination of features that gives the best results.
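The weighted Borda-Count step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_borda`, the feature lists and the weights are all hypothetical, and the weights are simply given here rather than learned by gradient descent or collaborative filtering.

```python
from collections import defaultdict


def weighted_borda(ranked_lists, weights):
    """Merge several best-first rankings into one using weighted Borda points.

    Hypothetical helper: each ranking comes from one feature (predictor),
    and each weight says how much that feature's vote counts.
    """
    if len(ranked_lists) != len(weights):
        raise ValueError("one weight per ranked list is required")
    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        n = len(ranking)
        for position, item in enumerate(ranking):
            # Classic Borda points: n-1 for first place down to 0 for last.
            scores[item] += weight * (n - 1 - position)
    # Highest score first; ties are broken by item name for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Three illustrative rankings of the same tweets, one per feature.
by_recency = ["t1", "t2", "t3"]
by_retweets = ["t3", "t1", "t2"]
by_length = ["t2", "t3", "t1"]
merged = weighted_borda([by_recency, by_retweets, by_length], [0.2, 0.7, 0.1])
print(merged)
```

With equal weights the three example rankings cancel out into a three-way tie, so the final order is driven entirely by the weights; this is why learning them matters.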

Relevance: 60.00%

Abstract:

User profiling is the process of constructing user models that represent the personal characteristics and preferences of customers. User profiles play a central role in many recommender systems. Recommender systems recommend items to users based on user profiles, where the items can be any objects the users are interested in, such as documents, web pages, books or movies. In recent years, multidimensional data have received growing attention from both academia and industry as a basis for better recommender systems. Additional metadata provides algorithms with more detail for understanding the interactions between users and items. However, most existing user/item profiling techniques for multidimensional data analyze the data by splitting the multidimensional relations, which loses information about the multidimensionality. In this paper, we propose a user profiling approach using a tensor reduction algorithm, which we show is based on a Tucker2 model. The proposed profiling approach incorporates latent interactions between all dimensions into user profiles, which significantly benefits the quality of neighborhood formation. We further propose to integrate the profiling approach into neighborhood-based collaborative filtering recommender algorithms. Experimental results show significant improvements in recommendation accuracy.

Relevance: 60.00%

Abstract:

In recommender systems based on multidimensional data, additional metadata provides algorithms with more information for understanding the interaction between users and items. However, most profiling approaches in neighbourhood-based recommendation for multidimensional data merely split or project the dimensional data and do not consider the latent interaction between the dimensions of the data. In this paper, we propose a novel user/item profiling approach for Collaborative Filtering (CF) item recommendation on multidimensional data. We further present an incremental profiling method for updating the profiles. For item recommendation, we examine different types of relations in the data to understand the interaction between users and items more fully, and propose three multidimensional CF approaches for top-N item recommendation based on the proposed user/item profiles. The proposed multidimensional CF approaches are capable of incorporating not only localized relations of user-user and/or item-item neighbourhoods but also latent interaction between all dimensions of the data. Experimental results show significant improvements in recommendation accuracy.

Relevance: 60.00%

Abstract:

We investigate methods for recommending multimedia items suitable for an online multimedia sharing community and introduce a novel algorithm called UserRank for ranking multimedia items based on link analysis. We also take the initiative of applying EigenRumor from the domain of blogosphere to multimedia. Furthermore, we present a strategy for making personalized recommendation that combines UserRank with collaborative filtering. We evaluate our method with an informal user study and show that results obtained are promising.

Relevance: 60.00%

Abstract:

Our study concerns an important current problem, that of diffusion of information in social networks. This problem has received significant attention from the Internet research community in recent times, driven by many potential applications such as viral marketing and sales promotions. In this paper, we focus on the target set selection problem, which involves discovering a small subset of influential players in a given social network to perform a certain task of information diffusion. The target set selection problem manifests in two forms: 1) the top-k nodes problem and 2) the lambda-coverage problem. In the top-k nodes problem, we are required to find a set of k key nodes that would maximize the number of nodes being influenced in the network. The lambda-coverage problem is concerned with finding a set of key nodes of minimal size that can influence a given percentage lambda of the nodes in the entire network. We propose a new way of solving these problems using the Shapley value, a well-known solution concept in cooperative game theory. Our approach leads to algorithms which we call the ShaPley value-based Influential Nodes (SPIN) algorithms for solving the top-k nodes problem and the lambda-coverage problem. We compare the performance of the proposed SPIN algorithms with well-known algorithms in the literature. Through extensive experimentation on four synthetically generated random graphs and six real-world data sets (Celegans, Jazz, NIPS coauthorship data set, Netscience data set, High-Energy Physics data set, and Political Books data set), we show that the proposed SPIN approach is more powerful and computationally efficient.

Note to Practitioners: In recent times, social networks have received a high level of attention due to their proven ability in improving the performance of web search, recommendations in collaborative filtering systems, spreading a technology in the market using viral marketing techniques, etc.
It is well known that the interpersonal relationships (or ties or links) between individuals cause change or improvement in the social system because the decisions made by individuals are influenced heavily by the behavior of their neighbors. An interesting and key problem in social networks is to discover the most influential nodes in the social network which can influence other nodes in the social network in a strong and deep way. This problem is called the target set selection problem and has two variants: 1) the top-k nodes problem, where we are required to identify a set of k influential nodes that maximize the number of nodes being influenced in the network and 2) the lambda-coverage problem which involves finding a set of influential nodes having minimum size that can influence a given percentage lambda of the nodes in the entire network. There are many existing algorithms in the literature for solving these problems. In this paper, we propose a new algorithm which is based on a novel interpretation of information diffusion in a social network as a cooperative game. Using this analogy, we develop an algorithm based on the Shapley value of the underlying cooperative game. The proposed algorithm outperforms the existing algorithms in terms of generality or computational complexity or both. Our results are validated through extensive experimentation on both synthetically generated and real-world data sets.
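To make the Shapley-value idea concrete, here is a minimal sketch that computes exact Shapley values for a toy graph. It is not the SPIN algorithm itself: the characteristic function `coverage` (a coalition's worth is the number of nodes it contains or touches in one hop) is an assumption chosen for illustration, and enumerating all permutations is only feasible for very small graphs.

```python
from itertools import permutations
from math import factorial


def coverage(coalition, graph):
    """Assumed worth of a coalition: its members plus their direct neighbours."""
    covered = set(coalition)
    for node in coalition:
        covered.update(graph[node])
    return len(covered)


def shapley_values(graph):
    """Exact Shapley value of every node, averaged over all join orders."""
    nodes = sorted(graph)
    totals = {node: 0.0 for node in nodes}
    for order in permutations(nodes):
        coalition = []
        previous = 0
        for node in order:
            coalition.append(node)
            current = coverage(coalition, graph)
            # Marginal contribution of `node` when joining in this order.
            totals[node] += current - previous
            previous = current
    count = factorial(len(nodes))
    return {node: total / count for node, total in totals.items()}


# Toy star graph: "a" is connected to every other node.
star = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}
values = shapley_values(star)
top_k = sorted(values, key=lambda node: (-values[node], node))[:1]
print(values, top_k)
```

Ranking nodes by Shapley value and taking the first k gives a simple candidate answer to the top-k nodes problem under the assumed game; a practical method would need sampling rather than full enumeration.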

Relevance: 60.00%

Abstract:

The use of L1 regularisation for sparse learning has generated immense research interest, with successful application in such diverse areas as signal acquisition, image coding, genomics and collaborative filtering. While existing work highlights the many advantages of L1 methods, in this paper we find that L1 regularisation often dramatically underperforms in terms of predictive performance when compared with other methods for inferring sparsity. We focus on unsupervised latent variable models, and develop L1 minimising factor models, Bayesian variants of "L1", and Bayesian models with a stronger L0-like sparsity induced through spike-and-slab distributions. These spike-and-slab Bayesian factor models encourage sparsity while accounting for uncertainty in a principled manner and avoiding unnecessary shrinkage of non-zero values. We demonstrate on a number of data sets that in practice spike-and-slab Bayesian methods outperform L1 minimisation, even on a computational budget. We thus highlight the need to re-assess the wide use of L1 methods in sparsity-reliant applications, particularly when we care about generalising to previously unseen data, and provide an alternative that, over many varying conditions, provides improved generalisation performance.

Relevance: 60.00%

Abstract:

This paper proposes a hierarchical probabilistic model for ordinal matrix factorization. Unlike previous approaches, we model the ordinal nature of the data and take a principled approach to incorporating priors for the hidden variables. Two algorithms are presented for inference, one based on Gibbs sampling and one based on variational Bayes. Importantly, these algorithms may be implemented in the factorization of very large matrices with missing entries. The model is evaluated on a collaborative filtering task, where users have rated a collection of movies and the system is asked to predict their ratings for other movies. The Netflix data set is used for evaluation, which consists of around 100 million ratings. Using root mean-squared error (RMSE) as an evaluation metric, results show that the suggested model outperforms alternative factorization techniques. Results also show how Gibbs sampling outperforms variational Bayes on this task, despite the large number of ratings and model parameters. Matlab implementations of the proposed algorithms are available from cogsys.imm.dtu.dk/ordinalmatrixfactorization.
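For readers unfamiliar with the evaluation metric named in this abstract, RMSE over a set of held-out ratings can be computed as in the short sketch below. The function name `rmse` and the sample ratings are illustrative only and are not taken from the paper or its Matlab code.

```python
import math


def rmse(predicted, actual):
    """Root mean-squared error between predicted and observed ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(predicted))


# Illustrative predictions for four held-out movie ratings on a 1-5 scale.
print(rmse([3.5, 4.0, 2.0, 5.0], [4, 4, 1, 5]))
```

Lower values are better, and an RMSE of 0 means every held-out rating was predicted exactly.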

Relevance: 60.00%

Abstract:

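Traditional collaborative filtering recommendation methods rely mainly on the interest features of individuals. In collaboration scenarios inside an organization, where the goal is knowledge sharing and reuse, a recommender system must consider not only user interests but also the tasks of users and user groups, and traditional collaborative filtering methods can no longer meet this requirement. The CoP is the main form of personnel management within an organization, and its features reflect the tasks of its members. Building on existing collaborative filtering research and D-S theory, this paper proposes a CoP feature construction algorithm and, on that basis, studies CoP-oriented collaborative filtering recommendation.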

Relevance: 60.00%

Abstract:

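Internet personal recommender systems recommend the most relevant Internet information to users according to their interests. As online information overload grows more severe and users' demand for personalized information retrieval keeps increasing, recommender systems have come to play a key role in core Internet applications such as search engines, e-commerce and online communities, and they are receiving more and more attention.

However, deploying a mature recommender system on a large website is still costly and requires substantial computing and storage resources, and recommendation accuracy still has considerable room and need for improvement. This poses many challenges for recommender system research. Among these challenges, the accuracy and scalability of recommendation algorithms have always been the two issues of greatest concern in the field. Accuracy refers to the proportion of recommended information that the user is actually interested in, while scalability refers to whether the system can process massive data within tolerable time and space complexity. Improving accuracy while also enhancing scalability is the main research goal in improving recommender systems. Current academic research, however, focuses more on accuracy. Many highly accurate algorithms require fairly complex computation, so their ability to handle large-scale dynamic data is often limited, and their evaluation experiments do not include scalability. As a result, these algorithms are still hard to apply at large scale in industry.

This thesis attempts to solve this problem. It borrows the idea of incremental learning for recommendation algorithms, that is, updating an existing machine learning model with the newest training data while consulting none, or only part, of the old training data. Compared with batch processing, which uses all the data, the incremental approach can greatly reduce the complexity of model updates. It therefore substantially improves the efficiency of updating the recommendation model when new training data arrive, lowers the computational cost, makes model updates more timely, and in turn improves the accuracy of the recommendations. Specifically, we propose two new incremental collaborative filtering algorithms and also use incremental learning to accelerate several of the most accurate current recommendation algorithms, in particular the speed and efficiency with which they update their models on new training data, which makes their large-scale application possible. In addition, new training data contain the users' latest interests, so the algorithm should give them higher weight than old training data when updating. Only then can the recommendations reflect users' long-term interests while giving special consideration to their recent interests, which makes the results more accurate. Taken together, we aim to use incremental learning to make recommendation algorithms more efficient and more accurate when they are updated, so that they are truly suitable for recommendation over massive Internet data; the work also serves as a reference for other research on incremental recommender systems. Our contributions are mainly the following.

An incremental recommendation algorithm based on topic models. Topic models, in particular the probabilistic latent topic model (PLSA), are a mainstream method widely used in recommender systems, with good results in text recommendation, image recommendation and collaborative filtering. The main factor that currently limits PLSA is the high complexity of its updates: the learned model can only be updated in batch, so recommendations are not timely and cannot reflect users' latest interests or the latest overall trends. We propose an incremental learning method that can be applied to both text analysis and collaborative filtering. When new training data arrive, for text-based recommendation the incremental update only locates the most relevant users and documents, and the words they involve, to update the topic distributions, and it gives new documents a higher weight. For collaborative filtering, our method only updates the items rated by the current user and the users associated with the current item. This greatly reduces the computational complexity of updates and increases the weight of new data in the recommendation algorithm, making recommendations more accurate and timely. Our algorithm achieves good results on the Tianya Q&A text data set and on collaborative filtering data sets including the MovieLens movie recommendation data set, the Last.FM song recommendation data set and the Douban book recommendation data set.

A collaborative filtering method based on the ant colony algorithm. Inspired by swarm intelligence algorithms, we propose a collaborative filtering method similar to the ant colony algorithm, called Ant Collaborative Filtering. In the initialization stage, the method gives each user, or each group of users, a unit amount of a globally unique pheromone. When a user rates an item or expresses interest in it, the pheromone carried by the user is propagated to the item, and the pheromone already on the item (initialized to 0) is likewise propagated to the user. In addition, the pheromone carried by users and items evaporates at a certain rate over time; through this evaporation mechanism, recommendations give more weight to users' recent interests. In the recommendation stage, the types and amounts of pheromone carried by users and items yield corresponding similarities, and recommendations are then made with classic similarity-comparison methods. The advantages of the ant-based method are that it effectively reduces the sparsity of the training data, that the algorithm can be updated and can recommend in real time, and that it accounts for changes in user interest over time. We validated our method on the MovieLens movie rating, Douban book recommendation and Last.FM music recommendation data sets. Finally, we built an Internet news recommender system, implemented as a Firefox plug-in, that automatically collects users' browsing interests and preferences; its back end uses different recommendation algorithms to recommend news of interest to the user.

A two-stage collaborative filtering method based on co-clustering. Clustering is an effective way to reduce data size and data sparsity. For large and sparse collaborative filtering training data, clustering is a natural and, in practice, effective preprocessing method. We therefore propose a two-stage collaborative filtering framework. First, a co-clustering method that we propose decomposes the original rating matrix into many blocks of small dimension, each containing the ratings of similar users on similar items. Then a matrix fitting method (we used non-negative matrix factorization, NMF, and the topic model PLSA) predicts the unknown ratings in these small blocks. When a user adds a new rating for an item, we only need to update the data block containing that user or that item and re-estimate its ratings, which greatly speeds up rating prediction. We validated the algorithm on the MovieLens movie rating data set.

The results of this thesis can be applied directly in large recommender systems and also offer guidance for subsequent research on incremental recommender systems. First, the PLSA-based incremental recommendation algorithm is a useful reference for other recommender systems based on graphical models. Second, the ant colony recommendation algorithm is a valuable exploration of a new class of collaborative filtering algorithms based on swarm intelligence. Finally, the proposed two-stage collaborative filtering framework offers a general and effective solution for improving the scalability and update efficiency of recommendation algorithms.

Building a recommender system is a never-ending optimization process. Besides the continued improvement of recommendation accuracy, the performance of recommendation algorithms must also keep improving as the volume of data on the Internet grows. Incremental learning is undoubtedly the most important way to speed up the updating of recommendation algorithms, and this thesis provides a reference for that direction.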
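The pheromone mechanism described for Ant Collaborative Filtering in this abstract can be sketched roughly as follows. This is a loose illustration and not the thesis's algorithm: the class name `AntCF`, the `transfer` and `evaporation` rates, the rule that a fixed fraction of pheromone is copied on each interaction, and the use of cosine similarity are all assumptions made for the example.

```python
import math
from collections import defaultdict


class AntCF:
    """Toy pheromone-based collaborative filtering (hypothetical sketch)."""

    def __init__(self, transfer=0.5, evaporation=0.1):
        self.transfer = transfer        # assumed fraction copied per interaction
        self.evaporation = evaporation  # assumed fraction lost per time step
        self.users = {}
        self.items = defaultdict(dict)

    def _profile(self, user):
        # Each user starts with one unit of its own unique pheromone type.
        return self.users.setdefault(user, {user: 1.0})

    def interact(self, user, item):
        """Exchange pheromone between a user and an item they rated or liked."""
        u, i = self._profile(user), self.items[item]
        u_before, i_before = dict(u), dict(i)
        for kind, amount in u_before.items():
            i[kind] = i.get(kind, 0.0) + self.transfer * amount
        for kind, amount in i_before.items():
            u[kind] = u.get(kind, 0.0) + self.transfer * amount

    def evaporate(self):
        """Decay all pheromone so that older interactions count for less."""
        for profile in list(self.users.values()) + list(self.items.values()):
            for kind in profile:
                profile[kind] *= 1.0 - self.evaporation

    def similarity(self, user_a, user_b):
        """Cosine similarity between the pheromone vectors of two users."""
        a, b = self._profile(user_a), self._profile(user_b)
        dot = sum(a[k] * b[k] for k in a if k in b)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


model = AntCF()
model.interact("alice", "m1")
model.interact("bob", "m1")    # bob picks up alice's pheromone from m1
model.interact("carol", "m2")  # carol shares no item with alice
print(model.similarity("alice", "bob"), model.similarity("alice", "carol"))
```

Because each interaction touches only one user profile and one item profile, an update of this kind is cheap, which matches the abstract's claim that the method can be updated in real time.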