Generating Character Descriptions for Automatic Summarization of Fiction
Summaries of fictional stories allow readers to quickly decide whether or not a story catches their interest. A major challenge in automatic summarization of fiction is the lack of standardized evaluation methodology or high-quality datasets for experimentation. In this work, we take a bottomup approach to this problem by assuming that story authors are uniquely qualified to inform such decisions. We collect a dataset of one million fiction stories with accompanying author-written summaries from Wattpad, an online story sharing platform. We identify commonly occurring summary components, of which a description of the main characters is the most frequent, and elicit descriptions of main characters directly from the authors for a sample of the stories. We propose two approaches to generate character descriptions, one based on ranking attributes found in the story text, the other based on classifying into a list of pre-defined attributes. We find that the classification-based approach performs the best in predicting character descriptions.