Signal propagation in Bayesian networks and its relationship with intrinsically multivariate predictive variables


Author(s): Martins Junior, David C.; Oliveira, Evaldo A. de; Braga-Neto, Ulisses M.; Hashimoto, Ronaldo Fumio; Cesar Junior, Roberto Marcondes
Contributor(s)

UNIVERSIDADE DE SÃO PAULO

Date(s)

05/11/2013

05/11/2013

2013

Abstract

A set of predictor variables is said to be intrinsically multivariate predictive (IMP) for a target variable if all properly contained subsets of the predictor set are poor predictors of the target but the full set predicts the target with great accuracy. In a previous article, the main properties of IMP Boolean variables were analytically described, including the introduction of the IMP score, a metric based on the coefficient of determination (CoD) as a measure of predictiveness with respect to the target variable. It was shown that the IMP score depends on four main properties: logic of connection, predictive power, covariance between predictors, and marginal predictor probabilities (biases). This paper extends that work to a broader context, in an attempt to characterize properties of discrete Bayesian networks that contribute to the presence of variables (network nodes) with high IMP scores. We have found that there is a relationship between the IMP score of a node and its territory size, i.e., its position along a pathway with one source: nodes far from the source display larger IMP scores than those closer to the source, and longer pathways display larger maximum IMP scores. This appears to be a consequence of the fact that nodes with small territory have a larger probability of having highly covariate predictors, which leads to smaller IMP scores. In addition, a larger number of XOR and NXOR predictive logic relationships has a positive influence on the maximum IMP score found in the pathway. This work presents analytical results based on a simple structure network and an analysis involving random networks constructed by computational simulations. Finally, results from a real Bayesian network application are provided. (C) 2012 Elsevier Inc. All rights reserved.
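
The IMP definition in the abstract can be illustrated with a small sketch. This record does not give the formulas, so the code below rests on two assumptions: the CoD of a predictor set is taken as (e0 - eX) / e0, where e0 is the error of the best constant guess of the target and eX is the error of the optimal predictor given the set; and the IMP score is taken as the CoD of the full set minus the largest CoD among its proper nonempty subsets. The function names and the two toy distributions are hypothetical and are not taken from the article.

```python
from itertools import combinations, product


def bayes_error(joint, subset):
    """Error of the optimal predictor of binary y from the predictors in `subset`.

    `joint` maps (x_tuple, y) -> probability; `subset` is a tuple of indices.
    """
    mass = {}
    for (x, y), p in joint.items():
        key = tuple(x[i] for i in subset)
        mass.setdefault(key, {0: 0.0, 1: 0.0})[y] += p
    # For each observed pattern the optimal predictor picks the likelier
    # target value, so the error contribution is the smaller of the two masses.
    return sum(min(m[0], m[1]) for m in mass.values())


def cod(joint, subset):
    """Coefficient of determination (assumed form): relative error reduction."""
    e0 = bayes_error(joint, ())  # best guess with no predictors observed
    if e0 == 0:
        return 0.0  # convention chosen here: a constant target leaves nothing to explain
    return (e0 - bayes_error(joint, subset)) / e0


def imp_score(joint, n):
    """Assumed IMP score for n >= 2 predictors: CoD(full) - max CoD(proper subset)."""
    full = tuple(range(n))
    proper = [s for k in range(1, n) for s in combinations(full, k)]
    return cod(joint, full) - max(cod(joint, s) for s in proper)


# Two independent, unbiased Boolean predictors; target is their XOR.
xor_joint = {((a, b), a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}
# Same predictors; target simply copies the first predictor.
copy_joint = {((a, b), a): 0.25 for a, b in product((0, 1), repeat=2)}

print(imp_score(xor_joint, 2))   # 1.0: neither predictor alone helps, both together are exact
print(imp_score(copy_joint, 2))  # 0.0: one predictor alone already determines the target
```

Under these assumptions, XOR with unbiased independent predictors is the extreme IMP case, which is consistent with the abstract's remark that XOR and NXOR logics favor high IMP scores.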

This work was supported by the Fundação de Amparo e Apoio à Pesquisa do Estado de São Paulo (FAPESP), the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Microsoft Research, and the U.S. National Science Foundation, through NSF award CCF-0845407 (Braga-Neto).

Identifier

INFORMATION SCIENCES, NEW YORK, v. 225, pp. 18-34, MAR 10, 2013

0020-0255

http://www.producao.usp.br/handle/BDPI/41051

10.1016/j.ins.2012.10.027

http://dx.doi.org/10.1016/j.ins.2012.10.027

Language(s)

eng

Publisher

ELSEVIER SCIENCE INC

NEW YORK

Relation

INFORMATION SCIENCES

Rights

restrictedAccess

Copyright ELSEVIER SCIENCE INC

Keywords #BAYESIAN NETWORK #FEATURE SELECTION #INTRINSICALLY MULTIVARIATE PREDICTION #COEFFICIENT #COMPUTER SCIENCE, INFORMATION SYSTEMS
Type

article

original article

publishedVersion