Automated Population of Cyc: Extracting Information about Named-entities from the Web

Purvesh Shah, David Schneider, Cynthia Matuszek, Robert C. Kahlert, Bjoern Aldag, David Baxter, John Cabral, Michael Witbrock, Jon Curtis

Populating the Cyc Knowledge Base (KB) has been a manual process until very recently. However, Cyc now contains enough knowledge to make it feasible to attempt to acquire additional knowledge autonomously. This paper describes a system that can collect and validate formally represented, fully integrated knowledge about various entities of interest (e.g., famous people and organizations) from the Web or any other electronically available text corpus. Experimental results and lessons learned from their analysis are presented.

Subjects: 10. Knowledge Acquisition; 13. Natural Language Processing

Submitted: Feb 13, 2006
