Recurrent Attention Model for Pedestrian Attribute Recognition
Pedestrian attribute recognition aims to predict the attribute labels of pedestrians from surveillance images, a challenging computer vision task because of poor imaging quality and small training datasets. Many of the semantic pedestrian attributes to be recognised show spatial locality and semantic correlations by which they can be grouped, yet previous works mostly ignore this phenomenon. Inspired by the ability of Recurrent Neural Networks (RNNs) to learn context correlations and the ability of attention models to highlight regions of interest on a feature map, this paper proposes end-to-end Recurrent Convolutional (RC) and Recurrent Attention (RA) models, which are complementary to each other. The RC model mines the correlations among different attribute groups with a convolutional LSTM unit, while the RA model exploits intra-group spatial locality and inter-group attention correlation to improve pedestrian attribute recognition. The RA method combines recurrent learning with an attention model to highlight spatial positions on the feature map and to mine the attention correlations among different attribute groups, yielding more precise attention. Extensive empirical evidence shows that our recurrent model frameworks achieve state-of-the-art results on the standard PETA and RAP pedestrian attribute datasets.
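To make the recurrent attention idea concrete, the following is a minimal NumPy sketch and not the architecture proposed in the paper, whose details the abstract does not give. It assumes a single feature map of shape (C, H, W), processes attribute groups one after another, and carries a state from group to group so that each group's spatial attention can depend on the previous ones. A plain tanh recurrence stands in for the LSTM, all weights are random rather than learned, and every name (`recurrent_attention`, `group_sizes`, `hidden_dim`) is hypothetical.

```python
import numpy as np

def spatial_softmax(scores):
    """Softmax over all H*W positions of a 2-D score map."""
    flat = scores.reshape(-1)
    flat = flat - flat.max()          # subtract max for numerical stability
    exp = np.exp(flat)
    return (exp / exp.sum()).reshape(scores.shape)

def recurrent_attention(features, group_sizes, hidden_dim=8, seed=0):
    """Toy recurrent attention over attribute groups (illustrative only).

    features:    array of shape (C, H, W), e.g. a CNN feature map
    group_sizes: number of attributes in each group, processed in order
    Returns (list of attention maps, list of per-group probabilities).
    """
    rng = np.random.default_rng(seed)
    C, H, W = features.shape
    h = np.zeros(hidden_dim)          # state carried from group to group
    # Randomly initialised parameters; a real model would learn these.
    w_feat = rng.normal(scale=0.1, size=C)
    w_hid = rng.normal(scale=0.1, size=(hidden_dim, C))
    w_rec = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
    w_in = rng.normal(scale=0.1, size=(hidden_dim, C))

    attn_maps, probs = [], []
    for n_attr in group_sizes:
        # The query mixes a fixed projection with the recurrent state,
        # so earlier groups influence where this group attends.
        query = w_feat + h @ w_hid                               # (C,)
        scores = np.tensordot(query, features, axes=([0], [0]))  # (H, W)
        attn = spatial_softmax(scores)
        # Attention-weighted pooling of the feature map.
        pooled = (features * attn).sum(axis=(1, 2))              # (C,)
        # Per-group sigmoid classifier (random weights, assumption).
        w_cls = rng.normal(scale=0.1, size=(n_attr, C))
        probs.append(1.0 / (1.0 + np.exp(-(w_cls @ pooled))))
        attn_maps.append(attn)
        # Simple tanh recurrence standing in for an LSTM update.
        h = np.tanh(w_rec @ h + w_in @ pooled)
    return attn_maps, probs

# Example: a 16-channel 6x3 feature map and three attribute groups.
feats = np.random.default_rng(1).normal(size=(16, 6, 3))
maps, group_probs = recurrent_attention(feats, group_sizes=[3, 2, 4])
```

Each returned attention map sums to one over the spatial positions, and each group yields one probability per attribute. The sketch only illustrates how attention for a later group can depend on earlier groups; it does not model the convolutional LSTM used by the RC model.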