Structured and Sparse Annotations for Image Emotion Distribution Learning
Label distribution learning methods effectively address the label ambiguity problem and have achieved great success in image emotion analysis. However, these methods ignore structured and sparse information naturally contained in the annotations of emotions. For example, emotions can be grouped and ordered due to their polarities and degrees. Meanwhile, emotions have the character of intensity and are reflected in different levels of sparse annotations. Motivated by these observations, we present a convolutional neural network based framework called Structured and Sparse annotations for image emotion Distribution Learning (SSDL) to tackle two challenges. In order to utilize structured annotations, the Earth Mover’s Distance is employed to calculate the minimal cost required to transform one distribution to another for ordered emotions and emotion groups. Combined with Kullback-Leibler divergence, we design the loss to penalize the mispredictions according to the dissimilarities of same emotions and different emotions simultaneously. Moreover, in order to handle sparse annotations, sparse regularization based on emotional intensity is adopted. Through combined loss and sparse regularization, SSDL could effectively leverage structured and sparse annotations for predicting emotion distribution. Experiment results demonstrate that our proposed SSDL significantly outperforms the state-of-the-art methods.