Catherine Baudin, Smadar Kadar, Jody Gevins Underwood, Vinod Baya
Information retrieval systems that use conceptual indexing to describe information content perform better than systems that rely on syntactic indexing based on the words in a text. However, because conceptual indices represent the semantics of a piece of information, they are difficult to extract automatically from a document and tedious to build manually. We implemented an information retrieval system that acquires conceptual indices of text, graphics and videotaped documents. Our approach is to use an underlying model of the domain covered by the documents to constrain the user's queries. This facilitates question-based acquisition of conceptual indices: user queries are converted into indices that accurately model the content of the documents and can be reused. We discuss Dedal, a system that facilitates the indexing and retrieval of design documents in the mechanical engineering domain. A user formulates a query to the system; if there is no corresponding index, Dedal uses the underlying domain model and a set of retrieval heuristics to approximate the retrieval, and asks the user for confirmation. If the user finds the retrieved information relevant, Dedal acquires a new index based on the query. We demonstrate the relevance and coverage of the acquired indices through experimentation.
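
The query, approximate, confirm, acquire loop described above can be illustrated with a minimal sketch. This is not Dedal's implementation: the `Query` fields (`topic`, `subject`), the `Retriever` class, the child-to-parent concept map standing in for the domain model, and the single heuristic (fall back to a more general concept) are all hypothetical names and simplifications chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """A conceptual query (hypothetical shape): what is asked, and about which concept."""
    topic: str
    subject: str


@dataclass
class Index:
    """A conceptual index: a query pattern attached to a document reference."""
    query: Query
    document: str


class Retriever:
    def __init__(self, parents, indices):
        # Assumed stand-in for the domain model: child concept -> parent concept.
        self.parents = dict(parents)
        self.indices = list(indices)

    def exact(self, query):
        """Documents whose index matches the query exactly."""
        return [i.document for i in self.indices if i.query == query]

    def approximate(self, query):
        """Assumed heuristic: retry with progressively more general concepts."""
        subject, seen = query.subject, set()
        while subject in self.parents and subject not in seen:
            seen.add(subject)  # guards against cycles in the concept map
            subject = self.parents[subject]
            documents = self.exact(Query(query.topic, subject))
            if documents:
                return documents
        return []

    def retrieve(self, query, confirm):
        """Return matching documents; acquire a new index for each confirmed guess."""
        documents = self.exact(query)
        if documents:
            return documents
        accepted = [d for d in self.approximate(query) if confirm(query, d)]
        for document in accepted:
            # The confirmed query becomes a reusable index.
            self.indices.append(Index(query, document))
        return accepted
```

In this sketch, `confirm` is a callback that plays the role of the user judging relevance. Once a guess is confirmed, the same query is answered by an exact match on later calls, which is the reuse property the abstract refers to.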