Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini
The problem of modeling semi-structured data is important in many application areas such as multimedia data management, biological databases, digital libraries, and data integration. Graph schemas (Buneman et al. 1997) have been proposed recently as a simple and elegant formalism for representing semistructured data. In this model, schemas are represented as graphs whose edges are labeled with unary formulae of a theory, and the notions of conformance of a database to a schema and of subsumption between two schemas are defined in terms of a simulation relation. Several authors have stressed the need of extending graph schemas with various types of constraints, such as edge existence and constraints on the number of outgoing edges. In this paper we analyze the appropriateness of various knowledge representation formalisms for representing and reasoning about graph schemas extended with constraints. We argue that neither First Order Logic, nor Logic Programming nor Frame-based languages are satisfactory for this purpose, and present a solution based on very expressive Description Logics. We provide techniques and complexity analysis for the problem of deciding schema subsumption and conformance in various interesting cases, that differ by the expressive power in the specification of constraints.