Empirical Study of Dimensional and Categorical Emotion Descriptors in Emotional Speech Perception
Rui Sun, Elliot Moore II

The dynamic between speaker intent and listener perception plays out in the speaker's variation of acoustic cues, which the listener must interpret appropriately. Emotional speech research must rely on either acted intent (i.e., an actor attempting to express an emotion) or listener perception (i.e., listening tests that assign emotional categories to non-acted data) to define ground-truth labels for analysis. These emotion labels are described using either emotion dimensions or emotion categories. This study examines the two emotion characterization strategies (dimension and category) in the communication of emotion embedded in speech, as expressed through acted intent and as perceived by a group of listeners. The results reveal that, without contextual information, listeners perceived the intended emotion categories with an average accuracy five times the chance rate. In addition, the trends in listener ratings of emotion dimensions (valence/arousal) and of emotional word categories were well correlated. Furthermore, while listeners confused the specific identity of certain emotional expressions, they were generally very accurate at identifying the actor's intended affective space, as determined by intended valence and arousal.

