Using Statistical Techniques and WordNet to Reason with Noisy Data

Rakesh Gupta and Mykel J. Kochenderfer

We collected data from non-experts over the web to create a common sense knowledge base for indoor home and office environments. In this paper, we discuss how we use statistical dimensionality reduction and clustering techniques to determine consensus in the knowledge base, and we explain the role of Latent Semantic Indexing in finding that consensus. These statistical techniques make our system robust to noisy data in the knowledge base. Our work contrasts with traditional AI systems, which are typically brittle and difficult to extend because their knowledge bases are built from handcrafted pieces of knowledge. We then discuss how the WordNet hypernym hierarchy is used to generalize knowledge and perform inference about objects not in the knowledge base. WordNet's synonym information also makes the reasoning system robust to vocabulary differences among people.
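
As a rough illustration of the hypernym-based generalization described above, the following minimal sketch walks up a tiny hand-built hierarchy until it reaches a term the knowledge base knows about. The hierarchy, the synonym table, the stored facts, and the function name `lookup` are all hypothetical stand-ins chosen for this example; they are not taken from WordNet or from the system described in the paper.

```python
# Toy stand-in for a slice of a hypernym hierarchy (child -> parent).
# These links are illustrative assumptions, not actual WordNet data.
HYPERNYM = {
    "mug": "cup",
    "teacup": "cup",
    "cup": "container",
    "bottle": "container",
    "container": "object",
}

# Hypothetical synonym table mapping a variant word to a canonical term.
SYNONYM = {"beaker": "cup"}

# Hypothetical knowledge base: object -> room where it is typically found.
KNOWLEDGE = {"cup": "kitchen", "container": "storage room"}


def lookup(word):
    """Return (matched_term, fact) for the nearest ancestor with a known fact.

    Returns None when neither the word nor any of its ancestors is known.
    """
    term = SYNONYM.get(word, word)  # normalize vocabulary differences first
    seen = set()                    # guard against accidental cycles
    while term is not None and term not in seen:
        if term in KNOWLEDGE:
            return term, KNOWLEDGE[term]
        seen.add(term)
        term = HYPERNYM.get(term)   # climb one level; None at the top
    return None


print(lookup("mug"))     # not stored directly; inherits the fact for "cup"
print(lookup("beaker"))  # resolved through the synonym table
print(lookup("sofa"))    # unknown word with no ancestors in the toy data
```

In this sketch, a query for an object missing from the knowledge base falls back to its closest known generalization, and a synonym is mapped to a canonical term before the search begins. A real system would presumably query WordNet itself rather than hard-coded dictionaries.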


This page is copyrighted by AAAI. All rights reserved.