Adam Anthony, Marie desJardins
Clustering social information is challenging when both attributes and relations are present. Many approaches commonly used today ignore one aspect of the data or the other. Relation-only algorithms are typically limited because the sparseness of relations in social information makes it difficult to find strong patterns. Feature-only algorithms are limited because they ignore the most interesting (and useful) aspect of social information: the set of relations between objects. Recently, there has been a surge of interest in clustering algorithms that consider features and relations simultaneously. Several of these approaches are based on a generative model, which corresponds to an assumption that the data are drawn from some unobserved probability distribution. Each of the approaches discussed in this paper has shown good initial results. After reviewing each in turn, we discuss in broad terms what has been accomplished thus far with generative models, what open problems remain, and how the development of generative models for relational data can contribute to the field of social information processing.
Subjects: 12. Machine Learning and Discovery
Submitted: Jan 23, 2008