Mary Fernandez, Daniela Florescu, Alon Levy and Dan Suciu
Building large Web sites is similar in many ways to building knowledge and database systems. In particular, by providing a declarative, logical view of a Web site’s data and structure, many of a site builder’s tasks, such as creating complex sites, modifying a site’s structure, and creating multiple versions of a site, are simplified significantly. New systems, such aS STRUDEL, support logical views of Web sites by allowing site builders to construct a site declaratively. In this paper, we address an important problem for site builders: verifying that a Web site’s structure conforms to certain constraints. Specifically, we consider the problem of verifying that a Web site created declaratively by STRUDEL satisfies certain integrity constraints, such as 'all pages are reachable from the root' and 'every organization page points to its suborganizations', etc. Our contributions are (1) formulating the verification problem as an entailment problem in a logical setting, and (2) presenting a sound and complete algorithm for verifying large classes of integrity constraints that occur in practice. Our algorithm uses a novel data structure, the site schema, which enables us to identify cases in which the general reasoning problem reduces to a decidable problem.