Knowledge Representation for Multilingual Text Categorization

Natalia V. Loukachevitch

The approach to text categorization described here is based on a thematic representation of a text. A thematic representation consists of nodes of thematically related terms that simulate the topics of the text, and each node is assigned a class reflecting its importance for the text. The representation is built from a detailed description of the domain. It makes it possible to process different types of texts, to use different systems of categories (in various languages), to adapt the system quickly to other formats and types of texts and/or other systems of categories, and to categorize texts using several systems of categories simultaneously. Most of the algorithm is language-independent.
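
As a rough illustration of the idea only, and not the algorithm of the paper, the following Python sketch groups concepts found in a text into thematic nodes, ranks the nodes by importance, and maps them onto two category systems. The term dictionary, the relation set, the category systems, the greedy grouping, the frequency-based weighting, and the two importance classes ("main", "secondary") are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical domain description: surface terms (any language) -> concept id.
TERM_TO_CONCEPT = {
    "tax": "TAX", "taxation": "TAX", "налог": "TAX",
    "budget": "BUDGET", "бюджет": "BUDGET",
    "football": "FOOTBALL", "футбол": "FOOTBALL",
}

# Hypothetical thematic relations between concepts (treated as symmetric).
RELATED = {("TAX", "BUDGET")}

# Two hypothetical category systems over the same concepts.
CATEGORY_SYSTEMS = {
    "en": {"TAX": "Economy", "BUDGET": "Economy", "FOOTBALL": "Sport"},
    "ru": {"TAX": "Экономика", "BUDGET": "Экономика", "FOOTBALL": "Спорт"},
}


def related(a, b):
    return (a, b) in RELATED or (b, a) in RELATED


def thematic_nodes(tokens):
    """Return a list of (concepts, importance_class, weight), heaviest first."""
    counts = Counter(TERM_TO_CONCEPT[t] for t in tokens if t in TERM_TO_CONCEPT)
    nodes = []
    # Greedy grouping (a simplification): attach each concept to the first
    # node that already contains a related concept, else open a new node.
    for concept, _ in counts.most_common():
        for node in nodes:
            if any(related(concept, c) for c in node):
                node.append(concept)
                break
        else:
            nodes.append([concept])
    # Node weight is the summed frequency of its concepts (an assumption).
    weighted = sorted(
        ((sum(counts[c] for c in node), node) for node in nodes),
        key=lambda pair: -pair[0],
    )
    return [
        (node, "main" if i == 0 else "secondary", weight)
        for i, (weight, node) in enumerate(weighted)
    ]


def categorize(tokens, system):
    """Map each category of the chosen system to its best importance class."""
    mapping = CATEGORY_SYSTEMS[system]
    result = {}
    for concepts, importance, _ in thematic_nodes(tokens):
        for concept in concepts:
            # Nodes arrive heaviest first, so the first class assigned wins.
            result.setdefault(mapping[concept], importance)
    return result


if __name__ == "__main__":
    text = "the tax and budget налог football".split()
    print(categorize(text, "en"))
    print(categorize(text, "ru"))
```

Because terms in different languages resolve to the same concept identifiers, only the term dictionary and the category labels depend on language in this sketch; the grouping and ranking steps do not.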

