Efficient learning of decomposable models with a bounded clique size


Author(s): Pérez Martínez, Aritz; Inza Cano, Iñaki; Lozano Alonso, José Antonio
Date(s)

08/05/2014

Abstract

Learning probability distributions from data is a ubiquitous problem in Statistics and Artificial Intelligence. Over the last decades, several algorithms have been proposed for learning probability distributions based on decomposable models, owing to their advantageous theoretical properties. Some of these algorithms can be used to search for a maximum likelihood decomposable model with a given maximum clique size, k, which controls the complexity of the model. Unfortunately, the problem of learning a maximum likelihood decomposable model given a maximum clique size is NP-hard for k > 2. In this work, we propose a family of algorithms that approximately solve this problem with a worst-case computational complexity of O(k · n^2 log n), where n is the number of random variables involved. The structures of the decomposable models that solve the maximum likelihood problem are called maximal k-order decomposable graphs. Our proposals, called fractal trees, construct a sequence of maximal i-order decomposable graphs, for i = 2, ..., k, in k − 1 steps. At each step, the algorithms follow a divide-and-conquer strategy based on the particular features of this type of structure. Additionally, we propose a prune-and-graft procedure that transforms a maximal k-order decomposable graph into another one with higher likelihood. We have implemented two particular fractal tree algorithms, called parallel fractal tree and sequential fractal tree. These algorithms can be considered a natural extension of Chow and Liu’s algorithm from k = 2 to arbitrary values of k. Both algorithms have been compared against other efficient approaches in artificial and real domains, and they have shown competitive performance on the maximum likelihood problem. Due to their low computational complexity, they are especially recommended for high-dimensional domains.
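
For orientation, the k = 2 base case that the abstract refers to, Chow and Liu's algorithm, can be sketched as follows. This is only an illustrative sketch of the classical tree-learning procedure (pairwise empirical mutual information followed by a maximum-weight spanning tree); it is not the fractal tree algorithms of the report. The function names `mutual_information` and `chow_liu_tree`, and the representation of the data as a list of tuples of discrete values, are assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(data, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(data)
    count_i = Counter(row[i] for row in data)
    count_j = Counter(row[j] for row in data)
    count_ij = Counter((row[i], row[j]) for row in data)
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_i[a] * count_j[b]))
    return mi


def chow_liu_tree(data):
    """Edges of a maximum-weight spanning tree under pairwise mutual information.

    Uses Kruskal's algorithm with a simple union-find structure.
    """
    n_vars = len(data[0])
    weighted_pairs = sorted(
        ((mutual_information(data, i, j), i, j)
         for i, j in combinations(range(n_vars), 2)),
        reverse=True,
    )
    parent = list(range(n_vars))

    def find(x):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []
    for _, i, j in weighted_pairs:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding (i, j) does not create a cycle
            parent[root_i] = root_j
            edges.append((i, j))
    return edges
```

In this sketch, calling `chow_liu_tree(data)` on a list of equally long tuples returns n − 1 edges over the column indices. A tree is a decomposable model whose cliques have size 2, which is why the abstract describes the proposed algorithms as an extension of this procedure to larger clique sizes.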

Identifier

http://hdl.handle.net/10810/12361

Language(s)

eng

Relation

EHU-KZAA-TR;2014-07

Rights

info:eu-repo/semantics/openAccess

Keywords

#approximating probability distributions #decomposable models #bounded clique size #maximum likelihood problem #efficient algorithms #Chow and Liu's algorithm

Type

info:eu-repo/semantics/report