Naveen Ashish, Craig Knoblock, and Cyrus Shahabi
We present an approach for optimizing the performance of information agents by materializing useful information. A critical problem with information agents, particularly those that gather and integrate information from Web sources, is high query response time. The data needed to answer user queries is spread across several different Web sources (and across several pages within a source), and retrieving, extracting, and integrating that data is time-consuming. We address this problem by materializing useful classes of information and defining them as auxiliary data sources for the information agent. The key challenge is to identify the content and schema of the classes of information that would be useful to materialize. We present an algorithm that identifies such classes by analyzing patterns in user queries. We describe an implementation of our approach and experiments in progress. We also discuss other important problems that we will address in optimizing information agents.
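
To make the idea of identifying classes from query patterns concrete, the following is a minimal sketch and not the algorithm presented in this paper. It assumes a simplified, hypothetical query representation (a source class, the requested attributes, and at most one selection constraint) and uses plain frequency counting as a stand-in for pattern analysis. The names Query, candidate_classes, and min_support are illustrative assumptions.

```python
from collections import Counter, defaultdict
from typing import NamedTuple, Optional


class Query(NamedTuple):
    """Hypothetical, simplified form of a user query (an assumption)."""
    source_class: str            # class being queried, e.g. "country"
    attributes: frozenset        # attributes the user asked for
    constraint: Optional[tuple]  # e.g. ("region", "Europe"), or None


def candidate_classes(log, min_support):
    """Return (class, constraint, attributes, count) tuples for query
    patterns that occur at least min_support times in the log.

    The content of a candidate is given by its class and constraint;
    its schema is the union of attributes requested under that pattern.
    """
    counts = Counter()
    schema = defaultdict(set)
    for q in log:
        key = (q.source_class, q.constraint)
        counts[key] += 1
        schema[key] |= q.attributes
    return [
        (cls, constraint, frozenset(schema[(cls, constraint)]), n)
        for (cls, constraint), n in counts.most_common()
        if n >= min_support
    ]


# Example log: three queries about European countries, one about Asian ones.
log = [
    Query("country", frozenset({"name", "population"}), ("region", "Europe")),
    Query("country", frozenset({"name", "capital"}), ("region", "Europe")),
    Query("country", frozenset({"population"}), ("region", "Europe")),
    Query("country", frozenset({"name"}), ("region", "Asia")),
]
frequent = candidate_classes(log, min_support=2)
```

In this sketch, each returned tuple describes one class that might be worth materializing as an auxiliary source: the class and constraint fix its content, and the attribute set fixes its schema. A practical system would also need to weigh retrieval cost, storage, and data freshness, which this example ignores.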