Discovery & effective use of quality association rules in multi-level datasets


Author(s): Shaw, Gavin
Date(s)

2010

Abstract

In today’s electronic world, vast amounts of knowledge are stored within many datasets and databases. Often the default format of this data means that the knowledge within is not immediately accessible, but rather has to be mined and extracted. This requires automated tools, and they need to be effective and efficient. Association rule mining is one approach to obtaining the knowledge stored within datasets / databases, which includes frequent patterns and association rules between the items / attributes of a dataset with varying levels of strength. However, this is also association rule mining’s downside: the number of rules that can be found is usually very large. In order to use the association rules (and the knowledge within) effectively, the number of rules needs to be kept manageable, so a method to reduce the number of association rules is necessary. However, we do not want to lose knowledge through this process. Thus the idea of non-redundant association rule mining was born. A second issue with association rule mining is determining which rules are interesting. The standard approach has been to use support and confidence, but these measures have their limitations. Approaches which use information about the dataset’s structure to measure association rules are limited, but could yield useful association rules if tapped. Finally, while it is important to be able to obtain a manageable number of interesting association rules from a dataset, it is equally important to be able to apply them in a practical way, where the knowledge they contain can be taken advantage of. Association rules show items / attributes that appear together frequently. Recommendation systems also look at patterns and items / attributes that occur together frequently in order to make a recommendation to a person. It should therefore be possible to bring the two together. In this thesis we look at these three issues and propose approaches to help.
For discovering non-redundant rules, we propose enhanced approaches to rule mining in multi-level datasets that allow hierarchically redundant association rules to be identified and removed without information loss. For discovering interesting association rules based on the dataset’s structure, we propose three measures for use in multi-level datasets. Lastly, we propose and demonstrate an approach that allows association rules to be used practically and effectively in a recommender system, while at the same time improving the recommender system’s performance. This becomes especially evident when looking at the user cold-start problem for a recommender system. In fact, our proposal helps to solve this serious problem facing recommender systems.
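
For readers unfamiliar with the two standard measures named in the abstract, the following is a minimal sketch of how support and confidence are conventionally computed for a rule A -> B. It is not taken from the thesis: the transactions, item names, and function names are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy transactions (illustrative only, not data from the thesis).
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    matching = sum(1 for t in transactions if items <= t)
    return matching / len(transactions)


def confidence(
    antecedent: Iterable[str],
    consequent: Iterable[str],
    transactions: List[FrozenSet[str]],
) -> float:
    """support(A and B together) / support(A) for the rule A -> B."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    support_a = support(a, transactions)
    if support_a == 0:
        return 0.0  # rule is undefined when the antecedent never occurs
    return support(both, transactions) / support_a


if __name__ == "__main__":
    print(support({"bread", "milk"}, TRANSACTIONS))
    print(confidence({"bread"}, {"milk"}, TRANSACTIONS))
```

In this toy data, {bread, milk} appears in two of four transactions, and bread appears in three, so the rule bread -> milk has support 0.5 and confidence 2/3. Neither measure uses the position of items in a concept hierarchy, which is the kind of structural information the abstract says is rarely exploited.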

Format

application/pdf

Identifier

http://eprints.qut.edu.au/41731/

Publisher

Queensland University of Technology

Relation

http://eprints.qut.edu.au/41731/1/Gavin_Shaw_Thesis.pdf

Shaw, Gavin (2010) Discovery & effective use of quality association rules in multi-level datasets. PhD thesis, Queensland University of Technology.

Source

Faculty of Science and Technology; Information Systems

Keywords #association rules, multi-level association rules, multi-level datasets, redundancy, non-redundant association rules, interestingness measures, diversity measure, distance measure #recommender systems, cold-start problem, user profile expansion

Type

Thesis