1 resultado para Warehouses
Resumo:
Association rule mining is an indispensable tool for discovering
insights from large databases and data warehouses.
The data in a warehouse being multi-dimensional, it is often
useful to mine rules over subsets of data defined by selections
over the dimensions. Such interactive rule mining
over multi-dimensional query windows is difficult since rule
mining is computationally expensive. Current methods using
pre-computation of frequent itemsets require counting
of some itemsets by revisiting the transaction database at
query time, which is very expensive. We develop a method
(RMW) that identifies the minimal set of itemsets to compute
and store for each cell, so that rule mining over any
query window may be performed without going back to the
transaction database. We give formal proofs that the set of
itemsets chosen by RMW is sufficient to answer any query
and also prove that it is the optimal set to be computed
for 1 dimensional queries. We demonstrate through an extensive
empirical evaluation that RMW achieves extremely
fast query response time compared to existing methods, with
only moderate overhead in pre-computation and storage