Qiankun Zhao, Prasenjit Mitra, Bi Chen
Recently, social text streams (e.g., blogs, web forums, and emails) have become ubiquitous with the evolution of the web. In some sense, social text streams are sensors of the real world. Often, it is desirable to extract real world events from the social text streams. However, existing event detection research mainly focused only on the stream properties of social text streams but ignored the contextual, temporal, and social information embedded in the streams. In this paper, we propose to detect events from social text streams by exploring the content as well as the temporal, and social dimensions. We define the term event as the information flow between a group of social actors on a specific topic over a certain time period. We represent social text streams as multi-graphs, where each node represents a social actor and each edge represents the information flow between two actors. The content and temporal associations within the flow of information are embedded in the corresponding edge. Events are detected by combining text-based clustering, temporal segmentation, and information flow-based graph cuts of the dual graph of the social networks. Experiments conducted with the Enron email dataset and the political blog dataset from Dailykos show the proposed event detection approach outperforms the other alternatives.
Subjects: 1.10 Information Retrieval; 3.4 Probabilistic Reasoning
Submitted: Apr 24, 2007