Topic Segmentation Algorithms for Text Summarization and Passage Retrieval: An Exhaustive Evaluation

Gael Dias, Elsa Alves, Jose Gabriel Pereira Lopes

To address the unreliability of systems based on lexical repetition and the limited adaptability of language-dependent systems, we present a context-based topic segmentation system built on a new informative similarity measure that relies on word co-occurrence. In particular, our evaluation against the state-of-the-art algorithms in the domain, namely c99 and TextTiling, shows improved results both with and without the identification of multiword units.

Subjects: 1.10 Information Retrieval; 13. Natural Language Processing

Submitted: Apr 24, 2007
