Finding the Most Descriptive Substructures in Graphs with Discrete and Numeric Labels


Autoria(s): Davis, Michael; Liu, Weiru; Miller, Paul
Data(s)

01/04/2014

Resumo

Many graph datasets are labelled with discrete and numeric attributes. Most frequent substructure discovery algorithms ignore numeric attributes; in this paper we show how they can be used to improve search performance and discrimination. Our thesis is that the most descriptive substructures are those which are normative both in terms of their structure and in terms of their numeric values. We explore the relationship between graph structure and the distribution of attribute values and propose an outlier-detection step, which is used as a constraint during substructure discovery. By pruning anomalous vertices and edges, more weight is given to the most descriptive substructures. Our method is applicable to multi-dimensional numeric attributes; we outline how it can be extended for high-dimensional data. We support our findings with experiments on transaction graphs and single large graphs from the domains of physical building security and digital forensics, measuring the effect on runtime, memory requirements and coverage of discovered patterns, relative to the unconstrained approach.

Formato

application/pdf

Identificador

http://pure.qub.ac.uk/portal/en/publications/finding-the-most-descriptive-substructures-in-graphs-with-discrete-and-numeric-labels(e727de67-9aef-42a0-b2e1-b4349b1f0653).html

http://dx.doi.org/10.1007/s10844-013-0299-7

http://pure.qub.ac.uk/ws/files/12759429/jiis_post_print.pdf

Idioma(s)

eng

Direitos

info:eu-repo/semantics/openAccess

Fonte

Davis , M , Liu , W & Miller , P 2014 , ' Finding the Most Descriptive Substructures in Graphs with Discrete and Numeric Labels ' Journal of Intelligent Information Systems , vol 42 , no. 2 , pp. 307-332 . DOI: 10.1007/s10844-013-0299-7

Palavras-Chave #Graph mining #Anomaly detection
Tipo

article