Eric Lambrecht and Subbarao Kambhampati
The most costly aspect of gathering information over the Internet is transferring data over the network to answer the user’s query. We make two contributions in this paper that alleviate this problem. First, we present an algorithm for reducing the number of information sources in an information gathering (IG) plan by reasoning with localized closed world (LCW) statements. In contrast to previous work on this problem, our algorithm can handle the recursive information gathering plans that commonly arise in practice. Second, we present a method for reducing the network traffic generated while executing an information gathering plan by reordering the sequence in which queries are sent to remote information sources. We explain why a direct application of traditional distributed database methods to this problem does not work, and we present a novel and cheap way of adorning source descriptions to assist in ordering the queries.