Jeannette G. Neal, Zuzana Dobes, Keith E. Bettinger, Jong S. Byoun
Multi-modal communication is common among humans. People frequently supplement natural language (NL) communication with simultaneous, coordinated pointing gestures and drawing on ancillary visual aids. Similar multi-modal communication can facilitate human interaction with modern, sophisticated information-processing and decision-aiding computer systems. In this paper, we focus on the use of deictic pointing gestures with simultaneous, coordinated NL in both user input and system-generated output. Key knowledge sources and a methodology for referent resolution are presented. The synergistic mutual disambiguation of simultaneous NL and pointing is discussed, as is a methodology for handling inconsistent NL/pointing expressions and expressions that have an apparent null referent. This work is part of the Intelligent Multi-Media Interface Project (Neal and Shapiro, 1988), which is devoted to the development of intelligent interface technology that integrates speech, NL text, graphics, and pointing gestures for human-computer dialogues.