From HTML to Usable Data: Problems in Heterogeneous Knowledge Source Integration

Terrence Harvey

In this paper we focus on one aspect of the value-driven process: taking raw information from web sites and converting it into a form usable by our decision model. The decision model is an influence diagram, which uses information passed to it from an extraction engine to instantiate its nodes. At present we rely on hand-coded extraction algorithms that convert web sites into a list of feature/value tuples. For our prototype system this approach works well, and we obtained good results when testing the system in the domain of deciding which digital camera to purchase. Numerous other groups are developing much more open-ended extraction engines (Doorenbos, Etzioni, and Weld 1997; Ashish and Knoblock 1997a; 1997b; Konopnicki and Shmueli 1995; Genesereth, Keller, and Mueller 1996). Figure 1 shows the influence diagram used by VDIG to evaluate digital cameras.
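To make the idea of a hand-coded extractor concrete, the following is a minimal sketch in Python, not the extraction code used by VDIG. It assumes a hypothetical product page whose specifications sit in a two-column HTML table; the names SpecTableExtractor, extract_features, and SAMPLE_PAGE are illustrative only. It uses the standard-library html.parser module to turn each two-cell row into a feature/value tuple.

```python
from html.parser import HTMLParser


class SpecTableExtractor(HTMLParser):
    """Hypothetical hand-coded extractor: collects (feature, value)
    tuples from table rows that contain exactly two cells."""

    def __init__(self):
        super().__init__()
        self.pairs = []        # extracted (feature, value) tuples
        self._cells = None     # cell texts of the current row, None outside a row
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
        elif tag in ("td", "th") and self._cells is not None:
            self._cells.append("")
            self._in_cell = True

    def handle_data(self, data):
        if self._in_cell:
            self._cells[-1] += data

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._cells is not None:
            # Keep only rows shaped like "feature | value"; skip everything else.
            if len(self._cells) == 2:
                feature, value = (c.strip() for c in self._cells)
                self.pairs.append((feature.lower(), value))
            self._cells = None


def extract_features(html):
    """Return the list of feature/value tuples found in the page."""
    parser = SpecTableExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Invented sample page; a real site would need its own extractor.
SAMPLE_PAGE = """
<table>
<tr><td>Resolution</td><td>640 x 480</td></tr>
<tr><td>Price</td><td>$499</td></tr>
<tr><td colspan="2">Order now</td></tr>
</table>
"""

features = extract_features(SAMPLE_PAGE)
# Evidence keyed by feature name, ready to assign to decision-model nodes.
evidence = dict(features)
```

The sketch also shows why hand coding scales poorly: the extractor is tied to one page layout, so each new site with a different structure would need its own parsing rules.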
