Digital Library

Performance analysis of algorithms for frequent pattern generation

**Author(s):** Islam, Md. Rafiqul; Chowdhury, Morshed; Khan, Safwan Mahmood
Contributor(s)	Stonier, Russel; Han, Qinglong; Li, Wei
Date(s)	01/01/2004
Abstract	Data mining refers to extracting or "mining" knowledge from large amounts of data. It is also called a method of "knowledge presentation" where visualization and knowledge representation techniques are used to present the mined knowledge to the user. Efficient algorithms to mine frequent patterns are crucial to many tasks in data mining. Since the Apriori algorithm was proposed in 1994, several methods have been proposed to improve its performance. However, most still adopt its candidate set generation-and-test approach. In addition, many methods do not generate all frequent patterns, making them inadequate for deriving association rules. The Pattern Decomposition (PD) algorithm significantly reduces the size of the dataset on each pass, which makes it more efficient to mine all frequent patterns in a large dataset. This algorithm avoids the costly process of candidate set generation and saves a large amount of counting time by evaluating support against the reduced datasets. In this paper, some existing frequent pattern generation algorithms are explored and compared. The results show that the PD algorithm outperforms an improved version of Apriori named Direct Count of candidates & Prune transactions (DCP) by one order of magnitude and is faster than an improved FP-tree approach named Predictive Item Pruning (PIP). Further, PD is also more scalable than both DCP and PIP.
Identifier	http://hdl.handle.net/10536/DRO/DU:30005388
Language(s)	eng
Publisher	Central Queensland University
Relation	http://dro.deakin.edu.au/eserv/DU:30005388/chowdhury-performanceanalysis-2004.pdf http://www.complexsystems.net.au/content/about_us
Keywords	#data mining #association rules #frequent Pattern #DCP algorithm #PIP algorithm #PD algorithm
Type	Conference Paper
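
The candidate set generation-and-test approach that the abstract attributes to Apriori can be illustrated with a minimal sketch. This is a generic textbook-style version, not the DCP, PIP, or PD implementation evaluated in the paper. The function name `apriori`, the parameter `min_support` (taken here as an absolute transaction count), and the sample transactions are assumptions made for illustration only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    Illustrative sketch only: min_support is assumed to be an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count the support of every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Generate: join frequent (k-1)-itemsets into k-itemset candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Test: one full pass over the dataset to count each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical sample data, not taken from the paper.
sample = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
frequent_itemsets = apriori(sample, min_support=3)
```

Every iteration of the loop rescans the full, unreduced dataset to count candidates. This is the cost that, according to the abstract, PD avoids by shrinking the dataset on each pass and skipping candidate generation.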
