Scott Nowson, Jon Oberlander
Work has recently been completed on a PhD thesis concerning individual differences in the language of personal weblogs (Nowson, 2005). This paper highlights some of the results. Blogs are increasingly used as a resource for academic study, as evidenced by this symposium. Bloggers are not, however, representative of the population as a whole: they are more likely to be teenage or 20-something females, and appear to be highly Open to Experience. Following our linguistic analysis of personal blog entries, we are constructing a feature set for the automatic detection of gender: features that capture a small amount of n-gram context account for variance better than dictionary-based analysis does.
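
To make the contrast between the two kinds of feature concrete, the following is a minimal sketch, not the feature extraction used in the thesis. It assumes whitespace tokenisation, word bigrams as the n-gram context, and a toy dictionary whose category names and word lists are hypothetical and invented purely for illustration.

```python
from collections import Counter

PUNCTUATION = ".,;:!?\"'()"

def tokenize(text):
    # Assumption: lowercase, split on whitespace, strip surrounding punctuation.
    stripped = (t.strip(PUNCTUATION) for t in text.lower().split())
    return [t for t in stripped if t]

def ngram_counts(tokens, n=2):
    # Count every run of n adjacent tokens, so local word context is kept.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hypothetical dictionary: categories and word lists are illustrative only.
HYPOTHETICAL_DICTIONARY = {
    "positive_emotion": {"happy", "love", "great"},
    "first_person": {"i", "me", "my"},
}

def dictionary_counts(tokens, dictionary):
    # Count tokens per category, ignoring the context each word appears in.
    return {
        category: sum(1 for t in tokens if t in words)
        for category, words in dictionary.items()
    }

entry = "I love my new bike. I love riding it!"
tokens = tokenize(entry)
bigrams = ngram_counts(tokens, n=2)
categories = dictionary_counts(tokens, HYPOTHETICAL_DICTIONARY)
```

In this sketch the dictionary features record only that "love" occurs twice, whereas the bigram features also record what it occurs next to ("i love", "love my", "love riding"). Either set of counts could then be passed to a classifier or regression model of the reader's choosing.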