Efficiency Improvements for Parallel Subgraph Miners

Last modified: 2012-05-16

#### Abstract

Algorithms for finding frequent and/or interesting subgraphs in a single large graph are computationally intensive because they must solve the graph isomorphism and subgraph isomorphism problems. The cost is compounded by the size of most real-world datasets, which is on the order of 10^5 or 10^6. The SUBDUE algorithm, developed by Cook and Holder, finds the most compressing subgraph in a large graph. To perform the same task efficiently on real-world datasets, Cook et al. developed SP-SUBDUE, a parallel approach to SUBDUE based on the MPI framework. This paper extends the work of Cook et al. and improves the efficiency of MPI SUBDUE by modifying the evaluation phase. Our experiments show an improvement in speed-up while retaining the quality of the results of serial SUBDUE. The techniques used in this study can also be applied to similar algorithms that statically partition the data and re-evaluate locally interesting patterns over all the nodes of the cluster.

#### Keywords

parallel graph mining; SUBDUE; highly compressing subgraph
