AAAI Publications, Twenty-Sixth AAAI Conference on Artificial Intelligence

Predicting Disease Transmission from Geo-Tagged Micro-Blog Data
Adam Sadilek, Henry Kautz, Vincent Silenzio

Last modified: 2012-07-12


Researchers have begun to mine social network data in order to predict a variety of social, economic, and health-related phenomena. While previous work has focused on predicting aggregate properties, such as the prevalence of seasonal influenza in a given country, we consider the task of fine-grained prediction of the health of specific people from noisy and incomplete data. We construct a probabilistic model that can predict, with high precision and good recall, if and when an individual will fall ill, based on that individual's social ties and co-locations with other people, as revealed by their Twitter posts. Our model is highly scalable and can be used to predict general dynamic properties of individuals in large real-world social networks. These results provide a foundation for research on fundamental questions of public health, including the identification of non-cooperative disease carriers ("Typhoid Marys"), adaptive vaccination policies, and our understanding of the emergence of global epidemics from day-to-day interpersonal interactions.


machine learning; class imbalance; location-based reasoning; text classification; disease spread; public health
