AAAI Publications, Sixth International AAAI Conference on Weblogs and Social Media

What Kobe Bryant and Britney Spears Have in Common: Mining Wikipedia for Characteristics of Notable Individuals
Pauline Crystal Ng

Last modified: 2012-05-20

Abstract


This paper proposes a statistical methodology for mining Wikipedia to discover characteristics associated with life outcomes. The methodology is demonstrated using first names and childhood environment. By comparing over 35,000 Wikipedia biographies against spatially and temporally matched census data, we show that individuals with rare names are more than twice as likely to appear in Wikipedia (relative risk RR=2.43 for females; RR=2.30 for males). This result is supported by past studies. Birth location also plays a role in success: individuals born in New York and California are ~2x more likely to become entertainers, and those born in the South are ~1.5x more likely to become athletes. These results validate the proposed methodology of using Wikipedia to study life outcomes.
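
To make the RR figures concrete, here is a minimal sketch of a relative risk calculation, assuming the standard epidemiological definition: the rate of the outcome in one group divided by the rate in a comparison group. The abstract does not spell out the paper's exact estimator or matching procedure, so the function name `relative_risk` and all counts below are hypothetical and chosen only for illustration.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of the outcome rate in the exposed group to the rate in the unexposed group.

    Assumed mapping (not taken from the paper): "exposed" = people with rare
    names, "cases" = people with a Wikipedia biography, and the totals come
    from matched census counts.
    """
    if exposed_total <= 0 or unexposed_total <= 0:
        raise ValueError("group totals must be positive")
    if unexposed_cases <= 0:
        raise ValueError("unexposed group needs at least one case")
    exposed_rate = exposed_cases / exposed_total
    unexposed_rate = unexposed_cases / unexposed_total
    return exposed_rate / unexposed_rate


# Hypothetical counts, NOT data from the paper:
# 30 biographies among 100,000 people with rare names,
# 120 biographies among 900,000 people with common names.
rr = relative_risk(30, 100_000, 120, 900_000)
print(f"RR = {rr:.2f}")
```

With these made-up counts the rare-name group appears at roughly 2.25 times the rate of the comparison group. An RR of 1 would mean no association, which is why values such as 2.43 and 2.30 are read as "more than twice as likely".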
