Finding Similar Content within Different Documents

John O. Everett, Daniel G. Bobrow, Cleo Condoravdi, Richard Crouch, Valeria de Paiva, and Reinhard Stolle

We are developing a layered approach to automatically identifying similar document content. The first layer extracts semantically normalized entities from the text (in our case, things like parts, e.g., photoreceptor belt) and relevant activities (in our case, higher-level concepts representing domain-specific actions, such as cleaning). The set of normalized entities can be used as a signature for identifying tips likely to contain information about the same topic.
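
To make the signature idea concrete, here is a minimal Python sketch. It is not the authors' implementation: the lookup table NORMALIZED_ENTITIES, the substring matching, the function names, and the choice of Jaccard overlap as the comparison measure are all assumptions made for illustration. The abstract says only that the set of normalized entities serves as a signature.

```python
# Hypothetical table mapping surface phrases to normalized entity names.
# A real system would derive these through linguistic analysis; this
# hard-coded dictionary is an assumption for illustration only.
NORMALIZED_ENTITIES = {
    "photoreceptor belt": "photoreceptor_belt",
    "p/r belt": "photoreceptor_belt",
    "fuser roll": "fuser_roll",
}


def entity_signature(text):
    """Return the set of normalized entities whose phrases occur in text."""
    lowered = text.lower()
    return {
        entity
        for phrase, entity in NORMALIZED_ENTITIES.items()
        if phrase in lowered
    }


def signature_overlap(sig_a, sig_b):
    """Jaccard overlap of two signatures (an assumed similarity measure)."""
    if not sig_a and not sig_b:
        return 0.0
    return len(sig_a & sig_b) / len(sig_a | sig_b)


# Two made-up tips that mention the same part under different names.
tip_a = "Clean the photoreceptor belt before replacing the fuser roll."
tip_b = "Streaks appear when the P/R belt is dirty."

sig_a = entity_signature(tip_a)
sig_b = entity_signature(tip_b)
overlap = signature_overlap(sig_a, sig_b)
```

Because both tips normalize to the same photoreceptor belt entity, their signatures share an element even though the surface wording differs. A higher overlap would mark the pair as candidates for closer comparison.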


This page is copyrighted by AAAI. All rights reserved.