Jianhua Ruan, Weixiong Zhang
Identifying intrinsic structures in large networks is a fundamental problem in many fields, such as engineering, social science and biology. In this paper, we are concerned with communities, which are densely connected sub-graphs in a network, and address two critical issues for finding community structures from large experimental data. First, most existing network clustering methods assume sparse networks and networks with strong community structures. In contrast, we consider sparse and dense networks with weak community structures. We introduce a set of simple operations that capture local neighborhood information of a node to identify weak communities. Second, we consider the issue of automatically determining the most appropriate number of communities, a crucial problem for all clustering methods. This requires to properly evaluate the quality of community structures. Built atop a function for network cluster evaluation by Newman and Girvan, we extend their work to weighted graphs. We have evaluated our methods on many networks of known structures, and applied them to analyze a collaboration network and a genetic network. The results showed that our methods can find superb community structures and correct numbers of communities. Comparing to the existing approaches, our methods performed significantly better on networks with weak community structures and equally well on networks with strong community structures.
Subjects: 12. Machine Learning and Discovery; 1.6 Engineering And Science