59 results for Xu, Zechun, 1787-1858.
Abstract:
Social tags in Web 2.0 are becoming another important source of information for profiling users' interests and preferences in order to make personalized recommendations. The free-style vocabulary of tags, together with the long tails in the distributions of tags and items, leads to low information sharing. To address this problem, this paper proposes an approach that integrates the social tags given by users with an item taxonomy, provided by experts, that has a standard vocabulary and a hierarchical structure. The experimental results show that the proposed approach can effectively improve both information sharing and recommendation accuracy.
Abstract:
Recommender systems are one of the effective tools for dealing with information overload. Like explicit ratings and implicit rating behaviours such as purchases, click streams, and browsing history, tagging information reveals important information about a user's personal interests and preferences, which can be used to recommend personalized items. This paper explores how to utilize tagging information to make personalized recommendations. Based on the distinctive three-dimensional relationships among users, tags, and items, a new user profiling and similarity measure method is proposed. The experiments suggest that the proposed approach performs better than traditional collaborative filtering recommender systems that use only rating data.
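The abstract does not spell out the profiling or similarity method, so the following is only a minimal sketch of one common way to compare users through their tags: each user is represented by a tag-frequency profile, and two users are compared with cosine similarity. The function names `tag_profile` and `cosine_similarity` and the sample data are assumptions made for illustration, not the paper's method.

```python
from collections import Counter
from math import sqrt


def tag_profile(taggings):
    """Build a tag-frequency profile from one user's (tag, item) pairs.

    Hypothetical helper: the paper's actual profile also uses the item
    dimension, which this sketch ignores.
    """
    return Counter(tag for tag, _item in taggings)


def cosine_similarity(p, q):
    """Cosine similarity between two tag-frequency profiles."""
    dot = sum(p[t] * q[t] for t in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


# Illustrative data only.
alice = tag_profile([("python", "i1"), ("ml", "i2")])
bob = tag_profile([("python", "i3"), ("ml", "i2")])
carol = tag_profile([("cooking", "i9")])

print(cosine_similarity(alice, bob))    # users with the same tags score close to 1
print(cosine_similarity(alice, carol))  # users with no shared tags score 0
```

Users with high similarity could then serve as neighbours in a collaborative filtering step, in place of neighbours found from rating data alone.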
Abstract:
Given the size and state of the Internet today, a good approach to organizing this mass of information is of great importance. Clustering web pages into groups of similar documents is one such approach, but it relies heavily on good feature extraction and document representation as well as on a good clustering algorithm. Because the Internet is constantly changing, the dataset is dynamic, so an incremental approach is preferred. In this work we propose an enhanced incremental clustering algorithm that can help to better organize the information available on the Internet. Experiments show that the enhanced algorithm outperforms the original histogram-based algorithm by up to 7.5%.
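To illustrate what "incremental" means here, the sketch below assigns each arriving document to the most similar existing cluster, or opens a new cluster when nothing is similar enough, without re-clustering earlier documents. It is a generic leader-style scheme using Jaccard similarity over term sets, not the histogram-based algorithm the abstract refers to; the names `jaccard` and `add_document` and the threshold value are assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def add_document(clusters, doc_terms, threshold=0.3):
    """Place one new document into `clusters` and return its cluster index.

    Each cluster is a dict holding the union of its members' terms and a
    member count. The threshold is an illustrative assumption.
    """
    best_idx, best_sim = None, threshold
    for idx, cluster in enumerate(clusters):
        sim = jaccard(cluster["terms"], doc_terms)
        if sim >= best_sim:
            best_idx, best_sim = idx, sim
    if best_idx is None:
        # Nothing is similar enough: start a new cluster.
        clusters.append({"terms": set(doc_terms), "size": 1})
        return len(clusters) - 1
    clusters[best_idx]["terms"] |= doc_terms
    clusters[best_idx]["size"] += 1
    return best_idx


# Illustrative stream of documents, each reduced to a set of terms.
clusters = []
first = add_document(clusters, {"python", "code", "web"})
second = add_document(clusters, {"python", "code", "api"})
third = add_document(clusters, {"recipe", "soup"})
print(first, second, third)
```

Because each document is processed once on arrival, the cost of an update depends on the number of clusters rather than on the number of documents already seen.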
Abstract:
Association rule mining is a technique widely used when querying databases, especially transactional ones, to obtain useful associations or correlations among sets of items. Much work has focused on efficiency, effectiveness, and redundancy. There has also been a focus on the quality of rules from single-level datasets, with many interestingness measures proposed. However, although multi-level datasets are now common, few interestingness measures have been developed for multi-level and cross-level rules. Single-level measures do not take into account the hierarchy found in a multi-level dataset. This leaves the Support-Confidence approach, which also ignores the hierarchy and has other drawbacks, as one of the few measures available. In this paper we propose two measures, diversity and peculiarity, for evaluating the interestingness of multi-level association rules. These measures can be used to help identify the rules from multi-level datasets that are potentially useful.
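For reference, the Support-Confidence approach mentioned above can be sketched as follows, using the standard definitions of support and confidence over a list of transactions. The function names and the sample transactions are illustrative assumptions. Note that neither measure looks at where an item sits in a taxonomy, which is the limitation the abstract points out.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    supp_antecedent = support(transactions, antecedent)
    if supp_antecedent == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / supp_antecedent


# Illustrative transactions only.
transactions = [
    {"milk", "bread"},
    {"milk"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]

print(support(transactions, {"milk"}))               # 3 of 4 transactions
print(confidence(transactions, {"milk"}, {"bread"}))  # 2 of the 3 milk transactions
```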
Abstract:
Association rule mining has made many advances in the area of knowledge discovery. However, the quality of the discovered association rules is a major concern and has drawn increasing attention recently. One problem is the huge size of the extracted rule set: a huge number of rules can often be extracted from a dataset, but many of them are redundant with respect to other rules and thus useless in practice. Mining non-redundant rules is a promising approach to solving this problem. In this paper, we first propose a definition of redundancy; we then propose a concise representation, called the Reliable basis, for representing non-redundant association rules, covering both exact and approximate rules. An important contribution of this paper is that we propose to use the certainty factor as the criterion for measuring the strength of the discovered association rules. With this criterion, we can determine the boundary between redundancy and non-redundancy, so that as many redundant rules as possible are eliminated without reducing the inference capacity of, or the belief in, the remaining non-redundant rules. We prove that redundancy elimination based on the proposed Reliable basis does not reduce the belief in the extracted rules. We also prove that all association rules can be deduced from the Reliable basis. The Reliable basis is therefore a lossless representation of association rules. Experimental results show that the proposed Reliable basis can significantly reduce the number of extracted rules.
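The abstract does not give a formula for the certainty factor. The sketch below uses the definition commonly applied to association rules, which compares the confidence of a rule A -> B with the support of its consequent B; whether the paper uses exactly this form is an assumption. The function name `certainty_factor` is illustrative.

```python
def certainty_factor(conf, supp_consequent):
    """Certainty factor of a rule A -> B (commonly used definition).

    conf            -- confidence of A -> B
    supp_consequent -- support of B
    The result lies in [-1, 1]: positive when A raises the belief in B,
    negative when A lowers it, and 0 when A and B are independent.
    """
    if conf > supp_consequent:
        return (conf - supp_consequent) / (1 - supp_consequent)
    if conf < supp_consequent:
        return (conf - supp_consequent) / supp_consequent
    return 0.0


print(certainty_factor(0.8, 0.5))  # positive: the antecedent raises belief in B
print(certainty_factor(0.2, 0.5))  # negative: the antecedent lowers belief in B
print(certainty_factor(0.5, 0.5))  # zero: confidence equals the support of B
```

Unlike confidence alone, this measure discounts rules whose consequent is frequent regardless of the antecedent.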
Abstract:
Recommender systems are widely used online to help users find other products or items that they may be interested in, based on what is known about each user from their profile. User profiles are often short on information, however, and when too little is known about a user it is difficult for a recommender system to make quality recommendations. This is often referred to as the cold-start problem. Here we investigate whether association rules can be used as a source of information to expand a user profile and thus avoid this problem, leading to improved recommendations. Our pilot study shows that it is indeed possible to use association rules to improve the performance of a recommender system. We believe this can lead to further work on using appropriate association rules to lessen the impact of the cold-start problem.
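One simple way to picture profile expansion is shown below: whenever the antecedent of a sufficiently confident rule is already contained in a sparse profile, the rule's consequent is added to it. This is a minimal sketch under that assumption; the abstract does not describe how rules are selected or applied, and the name `expand_profile`, the confidence threshold, and the sample rules are all hypothetical.

```python
def expand_profile(profile, rules, min_confidence=0.5):
    """Return `profile` plus the consequents of every rule that fires.

    rules -- iterable of (antecedent, consequent, confidence) triples,
             where antecedent and consequent are sets of items.
    A rule fires when its antecedent is a subset of the original profile
    and its confidence reaches `min_confidence` (an assumed cut-off).
    """
    original = set(profile)
    expanded = set(original)
    for antecedent, consequent, conf in rules:
        if conf >= min_confidence and set(antecedent) <= original:
            expanded |= set(consequent)
    return expanded


# Illustrative rules only.
rules = [
    ({"sci-fi"}, {"fantasy"}, 0.7),
    ({"sci-fi", "horror"}, {"thriller"}, 0.9),
    ({"sci-fi"}, {"romance"}, 0.2),
]

print(sorted(expand_profile({"sci-fi"}, rules)))
```

Matching against the original profile only (one pass) keeps the expansion from chaining through newly added items, which limits drift away from what the user actually expressed.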
Abstract:
Written Thai is one of the languages that does not mark word boundaries. To discover the meaning of a document, the text must first be separated into syllables, words, sentences, and paragraphs. This paper develops a novel method for segmenting Thai text that combines a non-dictionary-based technique with a dictionary-based technique. The method first applies Thai grammar rules to the text to identify syllables. A hidden Markov model is then used to merge candidate syllables into words. The identified words are verified against a lexical dictionary, and a decision tree is employed to discover words that the dictionary does not contain. Documents used in the litigation process of Thai court proceedings were used in the experiments. The word segmentation results obtained by the proposed method outperform those obtained by other existing methods.