989 resultados para information filtering


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Recommender systems are one of the recent inventions to deal with ever growing information overload. Collaborative filtering seems to be the most popular technique in recommender systems. With sufficient background information of item ratings, its performance is promising enough. But research shows that it performs very poor in a cold start situation where previous rating data is sparse. As an alternative, trust can be used for neighbor formation to generate automated recommendation. User assigned explicit trust rating such as how much they trust each other is used for this purpose. However, reliable explicit trust data is not always available. In this paper we propose a new method of developing trust networks based on user’s interest similarity in the absence of explicit trust data. To identify the interest similarity, we have used user’s personalized tagging information. This trust network can be used to find the neighbors to make automated recommendations. Our experiment result shows that the proposed trust based method outperforms the traditional collaborative filtering approach which uses users rating data. Its performance improves even further when we utilize trust propagation techniques to broaden the range of neighborhood.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a modified approach to evaluate access control policy similarity and dissimilarity based on the proposal by Lin et al. (2007). Lin et al.'s policy similarity approach is intended as a filter stage which identifies similar XACML policies that can be analysed further using more computationally demanding techniques based on model checking or logical reasoning. This paper improves the approach of computing similarity of Lin et al. and also proposes a mechanism to calculate a dissimilarity score by identifying related policies that are likely to produce different access decisions. Departing from the original algorithm, the modifications take into account the policy obligation, rule or policy combining algorithm and the operators between attribute name and value. The algorithms are useful in activities involving parties from multiple security domains such as secured collaboration or secured task distribution. The algorithms allow various comparison options for evaluating policies while retaining control over the restriction level via a number of thresholds and weight factors.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Information overload has become a serious issue for web users. Personalisation can provide effective solutions to overcome this problem. Recommender systems are one popular personalisation tool to help users deal with this issue. As the base of personalisation, the accuracy and efficiency of web user profiling affects the performances of recommender systems and other personalisation systems greatly. In Web 2.0, the emerging user information provides new possible solutions to profile users. Folksonomy or tag information is a kind of typical Web 2.0 information. Folksonomy implies the users‘ topic interests and opinion information. It becomes another source of important user information to profile users and to make recommendations. However, since tags are arbitrary words given by users, folksonomy contains a lot of noise such as tag synonyms, semantic ambiguities and personal tags. Such noise makes it difficult to profile users accurately or to make quality recommendations. This thesis investigates the distinctive features and multiple relationships of folksonomy and explores novel approaches to solve the tag quality problem and profile users accurately. Harvesting the wisdom of crowds and experts, three new user profiling approaches are proposed: folksonomy based user profiling approach, taxonomy based user profiling approach, hybrid user profiling approach based on folksonomy and taxonomy. The proposed user profiling approaches are applied to recommender systems to improve their performances. Based on the generated user profiles, the user and item based collaborative filtering approaches, combined with the content filtering methods, are proposed to make recommendations. The proposed new user profiling and recommendation approaches have been evaluated through extensive experiments. The effectiveness evaluation experiments were conducted on two real world datasets collected from Amazon.com and CiteULike websites. The experimental results demonstrate that the proposed user profiling and recommendation approaches outperform those related state-of-the-art approaches. In addition, this thesis proposes a parallel, scalable user profiling implementation approach based on advanced cloud computing techniques such as Hadoop, MapReduce and Cascading. The scalability evaluation experiments were conducted on a large scaled dataset collected from Del.icio.us website. This thesis contributes to effectively use the wisdom of crowds and expert to help users solve information overload issues through providing more accurate, effective and efficient user profiling and recommendation approaches. It also contributes to better usages of taxonomy information given by experts and folksonomy information contributed by users in Web 2.0.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A new relationship type of social networks - online dating - are gaining popularity. With a large member base, users of a dating network are overloaded with choices about their ideal partners. Recommendation methods can be utilized to overcome this problem. However, traditional recommendation methods do not work effectively for online dating networks where the dataset is sparse and large, and a two-way matching is required. This paper applies social networking concepts to solve the problem of developing a recommendation method for online dating networks. We propose a method by using clustering, SimRank and adapted SimRank algorithms to recommend matching candidates. Empirical results show that the proposed method can achieve nearly double the performance of the traditional collaborative filtering and common neighbor methods of recommendation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Nowadays, everyone can effortlessly access a range of information on the World Wide Web (WWW). As information resources on the web continue to grow tremendously, it becomes progressively more difficult to meet high expectations of users and find relevant information. Although existing search engine technologies can find valuable information, however, they suffer from the problems of information overload and information mismatch. This paper presents a hybrid Web Information Retrieval approach allowing personalised search using ontology, user profile and collaborative filtering. This approach finds the context of user query with least user’s involvement, using ontology. Simultaneously, this approach uses time-based automatic user profile updating with user’s changing behaviour. Subsequently, this approach uses recommendations from similar users using collaborative filtering technique. The proposed method is evaluated with the FIRE 2010 dataset and manually generated dataset. Empirical analysis reveals that Precision, Recall and F-Score of most of the queries for many users are improved with proposed method.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper proposes a novel approach to video deblocking which performs perceptually adaptive bilateral filtering by considering color, intensity, and motion features in a holistic manner. The method is based on bilateral filter which is an effective smoothing filter that preserves edges. The bilateral filter parameters are adaptive and avoid over-blurring of texture regions and at the same time eliminate blocking artefacts in the smooth region and areas of slow motion content. This is achieved by using a saliency map to control the strength of the filter for each individual point in the image based on its perceptual importance. The experimental results demonstrate that the proposed algorithm is effective in deblocking highly compressed video sequences and to avoid over-blurring of edges and textures in salient regions of image.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper introduces PartSS, a new partition-based fil- tering for tasks performing string comparisons under edit distance constraints. PartSS offers improvements over the state-of-the-art method NGPP with the implementation of a new partitioning scheme and also improves filtering abil- ities by exploiting theoretical results on shifting and scaling ranges, thus accelerating the rate of calculating edit distance between strings. PartSS filtering has been implemented within two major tasks of data integration: similarity join and approximate membership extraction under edit distance constraints. The evaluation on an extensive range of real-world datasets demonstrates major gain in efficiency over NGPP and QGrams approaches.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Currently, recommender systems (RS) have been widely applied in many commercial e-commerce sites to help users deal with the information overload problem. Recommender systems provide personalized recommendations to users and thus help them in making good decisions about which product to buy from the vast number of product choices available to them. Many of the current recommender systems are developed for simple and frequently purchased products like books and videos, by using collaborative-filtering and content-based recommender system approaches. These approaches are not suitable for recommending luxurious and infrequently purchased products as they rely on a large amount of ratings data that is not usually available for such products. This research aims to explore novel approaches for recommending infrequently purchased products by exploiting user generated content such as user reviews and product click streams data. From reviews on products given by the previous users, association rules between product attributes are extracted using an association rule mining technique. Furthermore, from product click streams data, user profiles are generated using the proposed user profiling approach. Two recommendation approaches are proposed based on the knowledge extracted from these resources. The first approach is developed by formulating a new query from the initial query given by the target user, by expanding the query with the suitable association rules. In the second approach, a collaborative-filtering recommender system and search-based approaches are integrated within a hybrid system. In this hybrid system, user profiles are used to find the target user’s neighbour and the subsequent products viewed by them are then used to search for other relevant products. Experiments have been conducted on a real world dataset collected from one of the online car sale companies in Australia to evaluate the effectiveness of the proposed recommendation approaches. The experiment results show that user profiles generated from user click stream data and association rules generated from user reviews can improve recommendation accuracy. In addition, the experiment results also prove that the proposed query expansion and the hybrid collaborative filtering and search-based approaches perform better than the baseline approaches. Integrating the collaborative-filtering and search-based approaches has been challenging as this strategy has not been widely explored so far especially for recommending infrequently purchased products. Therefore, this research will provide a theoretical contribution to the recommender system field as a new technique of combining collaborative-filtering and search-based approaches will be developed. This research also contributes to a development of a new query expansion technique for infrequently purchased products recommendation. This research will also provide a practical contribution to the development of a prototype system for recommending cars.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Currently, recommender systems (RS) have been widely applied in many commercial e-commerce sites to help users deal with the information overload problem. Recommender systems provide personalized recommendations to users and, thus, help in making good decisions about which product to buy from the vast amount of product choices. Many of the current recommender systems are developed for simple and frequently purchased products like books and videos, by using collaborative-filtering and content-based approaches. These approaches are not directly applicable for recommending infrequently purchased products such as cars and houses as it is difficult to collect a large number of ratings data from users for such products. Many of the ecommerce sites for infrequently purchased products are still using basic search-based techniques whereby the products that match with the attributes given in the target user’s query are retrieved and recommended. However, search-based recommenders cannot provide personalized recommendations. For different users, the recommendations will be the same if they provide the same query regardless of any difference in their interest. In this article, a simple user profiling approach is proposed to generate user’s preferences to product attributes (i.e., user profiles) based on user product click stream data. The user profiles can be used to find similarminded users (i.e., neighbours) accurately. Two recommendation approaches are proposed, namely Round- Robin fusion algorithm (CFRRobin) and Collaborative Filtering-based Aggregated Query algorithm (CFAgQuery), to generate personalized recommendations based on the user profiles. Instead of using the target user’s query to search for products as normal search based systems do, the CFRRobin technique uses the attributes of the products in which the target user’s neighbours have shown interest as queries to retrieve relevant products, and then recommends to the target user a list of products by merging and ranking the returned products using the Round Robin method. The CFAgQuery technique uses the attributes of the products that the user’s neighbours have shown interest in to derive an aggregated query, which is then used to retrieve products to recommend to the target user. Experiments conducted on a real e-commerce dataset show that both the proposed techniques CFRRobin and CFAgQuery perform better than the standard Collaborative Filtering and the Basic Search approaches, which are widely applied by the current e-commerce applications.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Online dating websites enable a specific form of social networking and their efficiency can be increased by supporting proactive recommendations based on participants' preferences with the use of data mining. This research develops two-way recommendation methods for people-to-people recommendation for large online social networks such as online dating networks. This research discovers the characteristics of the online dating networks and utilises these characteristics in developing efficient people-to-people recommendation methods. Methods developed support improved recommendation accuracy, can handle data sparsity that often comes with large data sets and are scalable for handling online networks with a large number of users.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Most recommender systems attempt to use collaborative filtering, content-based filtering or hybrid approach to recommend items to new users. Collaborative filtering recommends items to new users based on their similar neighbours, and content-based filtering approach tries to recommend items that are similar to new users' profiles. The fundamental issues include how to profile new users, and how to deal with the over-specialization in content-based recommender systems. Indeed, the terms used to describe items can be formed as a concept hierarchy. Therefore, we aim to describe user profiles or information needs by using concepts vectors. This paper presents a new method to acquire user information needs, which allows new users to describe their preferences on a concept hierarchy rather than rating items. It also develops a new ranking function to recommend items to new users based on their information needs. The proposed approach is evaluated on Amazon book datasets. The experimental results demonstrate that the proposed approach can largely improve the effectiveness of recommender systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Big Data presents many challenges related to volume, whether one is interested in studying past datasets or, even more problematically, attempting to work with live streams of data. The most obvious challenge, in a ‘noisy’ environment such as contemporary social media, is to collect the pertinent information; be that information for a specific study, tweets which can inform emergency services or other responders to an ongoing crisis, or give an advantage to those involved in prediction markets. Often, such a process is iterative, with keywords and hashtags changing with the passage of time, and both collection and analytic methodologies need to be continually adapted to respond to this changing information. While many of the data sets collected and analyzed are preformed, that is they are built around a particular keyword, hashtag, or set of authors, they still contain a large volume of information, much of which is unnecessary for the current purpose and/or potentially useful for future projects. Accordingly, this panel considers methods for separating and combining data to optimize big data research and report findings to stakeholders. The first paper considers possible coding mechanisms for incoming tweets during a crisis, taking a large stream of incoming tweets and selecting which of those need to be immediately placed in front of responders, for manual filtering and possible action. The paper suggests two solutions for this, content analysis and user profiling. In the former case, aspects of the tweet are assigned a score to assess its likely relationship to the topic at hand, and the urgency of the information, whilst the latter attempts to identify those users who are either serving as amplifiers of information or are known as an authoritative source. Through these techniques, the information contained in a large dataset could be filtered down to match the expected capacity of emergency responders, and knowledge as to the core keywords or hashtags relating to the current event is constantly refined for future data collection. The second paper is also concerned with identifying significant tweets, but in this case tweets relevant to particular prediction market; tennis betting. As increasing numbers of professional sports men and women create Twitter accounts to communicate with their fans, information is being shared regarding injuries, form and emotions which have the potential to impact on future results. As has already been demonstrated with leading US sports, such information is extremely valuable. Tennis, as with American Football (NFL) and Baseball (MLB) has paid subscription services which manually filter incoming news sources, including tweets, for information valuable to gamblers, gambling operators, and fantasy sports players. However, whilst such services are still niche operations, much of the value of information is lost by the time it reaches one of these services. The paper thus considers how information could be filtered from twitter user lists and hash tag or keyword monitoring, assessing the value of the source, information, and the prediction markets to which it may relate. The third paper examines methods for collecting Twitter data and following changes in an ongoing, dynamic social movement, such as the Occupy Wall Street movement. It involves the development of technical infrastructure to collect and make the tweets available for exploration and analysis. A strategy to respond to changes in the social movement is also required or the resulting tweets will only reflect the discussions and strategies the movement used at the time the keyword list is created — in a way, keyword creation is part strategy and part art. In this paper we describe strategies for the creation of a social media archive, specifically tweets related to the Occupy Wall Street movement, and methods for continuing to adapt data collection strategies as the movement’s presence in Twitter changes over time. We also discuss the opportunities and methods to extract data smaller slices of data from an archive of social media data to support a multitude of research projects in multiple fields of study. The common theme amongst these papers is that of constructing a data set, filtering it for a specific purpose, and then using the resulting information to aid in future data collection. The intention is that through the papers presented, and subsequent discussion, the panel will inform the wider research community not only on the objectives and limitations of data collection, live analytics, and filtering, but also on current and in-development methodologies that could be adopted by those working with such datasets, and how such approaches could be customized depending on the project stakeholders.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper conditional hidden Markov model (HMM) filters and conditional Kalman filters (KF) are coupled together to improve demodulation of differential encoded signals in noisy fading channels. We present an indicator matrix representation for differential encoded signals and the optimal HMM filter for demodulation. The filter requires O(N3) calculations per time iteration, where N is the number of message symbols. Decision feedback equalisation is investigated via coupling the optimal HMM filter for estimating the message, conditioned on estimates of the channel parameters, and a KF for estimating the channel states, conditioned on soft information message estimates. The particular differential encoding scheme examined in this paper is differential phase shift keying. However, the techniques developed can be extended to other forms of differential modulation. The channel model we use allows for multiplicative channel distortions and additive white Gaussian noise. Simulation studies are also presented.