An Accelerated Chow and Liu Algorithm: Fitting Tree Distributions to High Dimensional Sparse Data


Author(s): Meila, Marina
Date(s)

08/10/2004

08/10/2004

01/01/1999

Abstract

Chow and Liu introduced an algorithm for fitting a multivariate distribution with a tree, i.e. a density model that assumes that there are only pairwise dependencies between variables and that the graph of these dependencies is a spanning tree. The original algorithm is quadratic in the dimension of the domain and linear in the number of data points that define the target distribution $P$. This paper shows that for sparse, discrete data, fitting a tree distribution can be done in time and memory that are jointly subquadratic in the number of variables and the size of the data set. The new algorithm, called the acCL algorithm, takes advantage of the sparsity of the data to accelerate the computation of pairwise marginals and the sorting of the resulting mutual informations, achieving speedups of up to 2-3 orders of magnitude in the experiments.
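
For orientation, the quadratic baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of the classic Chow and Liu procedure (empirical pairwise mutual information followed by a maximum-weight spanning tree), not the acCL algorithm itself. The function names `mutual_information` and `chow_liu_tree`, the row-tuple data layout, and the use of Kruskal's algorithm for the spanning tree are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(data, u, v):
    """Empirical mutual information (in nats) between columns u and v."""
    n = len(data)
    joint = Counter((row[u], row[v]) for row in data)
    pu = Counter(row[u] for row in data)
    pv = Counter(row[v] for row in data)
    # sum over observed pairs of p(a,b) * log(p(a,b) / (p(a) * p(b)))
    return sum(
        (c / n) * log(c * n / (pu[a] * pv[b]))
        for (a, b), c in joint.items()
    )


def chow_liu_tree(data):
    """Return the edges of a maximum-weight spanning tree over the variables."""
    n_vars = len(data[0])
    # Quadratic step: one mutual information value per pair of variables,
    # sorted so the most informative pairs are considered first.
    weights = sorted(
        ((mutual_information(data, u, v), u, v)
         for u, v in combinations(range(n_vars), 2)),
        reverse=True,
    )
    parent = list(range(n_vars))  # union-find forest for Kruskal's algorithm

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = []
    for _, u, v in weights:
        ru, rv = find(u), find(v)
        if ru != rv:  # keep the edge only if it joins two components
            parent[ru] = rv
            edges.append((u, v))
    return edges


# Hypothetical toy data: columns 0 and 1 always agree, column 2 is unrelated.
data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
edges = chow_liu_tree(data)
```

The loop over `combinations(range(n_vars), 2)` is the step that makes this baseline quadratic in the number of variables, and each call scans every data point. According to the abstract, these are the costs that acCL reduces by exploiting sparsity.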

Format

1375477 bytes

434859 bytes

application/postscript

application/pdf

Identifier

AIM-1652

http://hdl.handle.net/1721.1/6676

Language(s)

en_US

Relation

AIM-1652