Victoria L. Rubin, Jeffrey M. Stanton, and Elizabeth D. Liddy
We present an empirically verified model of discernable emotions, Watson and Tellegen’s Circumplex Theory of Affect from social and personality psychology, and suggest its usefulness in NLP as a potential model for an automation of an eight-fold categorization of emotions in written English texts. We developed a data collection tool based on the model, collected 287 responses from 110 non-expert informants based on 50 emotional excerpts (min=12, max=348, average=86 words), and analyzed the inter-coder agreement per category and per strength of ratings per sub-category. The respondents achieved an average 70.7% agreement in the most commonly identified emotion categories per text. The categories of high positive affect and pleasantness were most common in our data. Within those categories, the affective terms enthusiastic, active, excited, pleased, and satisfied had the most consistent ratings of strength of presence in the texts. The textual clues the respondents chose had comparable length and similar key words. Watson and Tellegen’s model appears to be usable as a guide for development of an NLP algorithm for automated identification of emotion in English texts, and the non-expert informants (with college degree and higher) provided sufficient information for future creation of a gold standard of clues per category.