AAAI Publications, Fifth International AAAI Conference on Weblogs and Social Media

Hierarchical Bayesian Models for Latent Attribute Detection in Social Media
Delip Rao, Michael Paul, Clay Fink, David Yarowsky, Timothy Oates, Glen Coppersmith

Last modified: 2011-07-05


We present several novel minimally supervised models for detecting latent attributes of social media users, with a focus on ethnicity and gender. Previous work on ethnicity detection has used coarse-grained, widely separated ethnicity classes and assumed the existence of large amounts of training data, such as the US census, which simplifies the problem. Instead, we examine content generated by users, in addition to name morpho-phonemics, to detect ethnicity and gender. Further, we address this problem in a challenging setting where the ethnicity classes are more fine-grained (ethnicity classes in Nigeria) and the training data is very limited.
