Multi-Interactive Memory Network for Aspect Based Multimodal Sentiment Analysis
As a fundamental task of sentiment analysis, aspect-level sentiment analysis aims to identify the sentiment polarity of a specific aspect in the context. Previous work on aspect-level sentiment analysis is text-based. With the prevalence of multimodal user-generated content (e.g. text and image) on the Internet, multimodal sentiment analysis has attracted increasing research attention in recent years. In the context of aspect-level sentiment analysis, multimodal data are often more important than text-only data, and have various correlations including impacts that aspect brings to text and image as well as the interactions associated with text and image. However, there has not been any related work carried out so far at the intersection of aspect-level and multimodal sentiment analysis. To fill this gap, we are among the first to put forward the new task, aspect based multimodal sentiment analysis, and propose a novel Multi-Interactive Memory Network (MIMN) model for this task. Our model includes two interactive memory networks to supervise the textual and visual information with the given aspect, and learns not only the interactive influences between cross-modality data but also the self influences in single-modality data. We provide a new publicly available multimodal aspect-level sentiment dataset to evaluate our model, and the experimental results demonstrate the effectiveness of our proposed model for this new task.