Chun-Nan Hsu and Craig A. Knoblock
The query reformulation approach (also called semantic query optimization) uses semantic knowledge about the contents of databases for optimization. The basic idea is to use this knowledge to reformulate a query into a less expensive yet equivalent query. Previous work on semantic query optimization has shown the cost reduction that reformulation can achieve. We further point out that, when applied to distributed multidatabase queries, the reformulation approach can also reduce the cost of moving intermediate data from one site to another. However, a robust and efficient method to discover the required knowledge has not yet been developed. This paper presents an example-guided, data-driven learning approach to acquire the knowledge needed for reformulation. We use example queries to guide the learning so that it captures the database usage pattern. In contrast to the heuristic-driven approach proposed by Siegel, the data-driven approach is more likely to learn the knowledge required for the various reformulation needs of the example queries. Since this learning approach minimizes the dependency on database structure and implementation, it is applicable to heterogeneous multidatabase systems. With this learning capability, query reformulation becomes more effective and feasible in real-world database applications.
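To make the basic idea concrete, the following is a minimal sketch of one kind of reformulation: dropping a constraint that is implied by another constraint in the same query. It is an illustration under stated assumptions, not the method presented in this paper. The table `SHIPS`, the rule in `RULES`, and the helpers `rule_holds`, `reformulate`, and `run` are all hypothetical, and queries are restricted to conjunctions of equality constraints.

```python
# Hypothetical toy table; the data and attribute names are assumptions.
SHIPS = [
    {"name": "A", "cargo": "oil", "ship_class": "tanker"},
    {"name": "B", "cargo": "grain", "ship_class": "bulk"},
    {"name": "C", "cargo": "oil", "ship_class": "tanker"},
    {"name": "D", "cargo": "water", "ship_class": "tanker"},
]

# A semantic rule is a pair (antecedent, consequent) of equality constraints:
# any row that satisfies the antecedent also satisfies the consequent.
RULES = [(("cargo", "oil"), ("ship_class", "tanker"))]


def rule_holds(rule, rows):
    """Check that a rule is consistent with every row of the table."""
    (a_attr, a_val), (c_attr, c_val) = rule
    return all(r[c_attr] == c_val for r in rows if r[a_attr] == a_val)


def reformulate(query, rules):
    """Drop constraints that are implied by another constraint in the query."""
    constraints = set(query)
    for antecedent, consequent in rules:
        # Membership is tested against the current set, so two mutually
        # implying constraints are never both removed.
        if antecedent in constraints and consequent in constraints:
            constraints.discard(consequent)
    return constraints


def run(query, rows):
    """Evaluate a conjunctive equality query and return the matching names."""
    return [r["name"] for r in rows if all(r[a] == v for a, v in query)]


original = {("cargo", "oil"), ("ship_class", "tanker")}
reformulated = reformulate(original, RULES)

# The reformulated query checks one constraint instead of two, yet returns
# the same rows because the rule holds on the data.
same_answer = run(original, SHIPS) == run(reformulated, SHIPS)
```

The reformulated query is equivalent only as long as the rule holds for the current database contents, which is why the sketch includes `rule_holds` as a consistency check.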