AAAI Publications, Twenty-First International Joint Conference on Artificial Intelligence

Combining Speech and Sketch to Interpret Unconstrained Descriptions of Mechanical Devices
David Tyler Bischel, Thomas F. Stahovich, Randall Davis, Aaron Adler, Eric J. Peterson

Last modified: 2009-06-26


Mechanical design tools would be considerably more useful if we could interact with them in the way that human designers communicate design ideas to one another, i.e., using crude sketches and informal speech. Those crude sketches frequently contain pen strokes of two different sorts, one type portraying device structure, the other denoting gestures, such as arrows used to indicate motion. We report here on techniques we developed that use information from both sketch and speech to distinguish gesture strokes from non-gestures -- a critical first step in understanding a sketch of a device. We collected and analyzed unconstrained device descriptions, which revealed six common types of gestures. Guided by this knowledge, we developed a classifier that uses both sketch and speech features to distinguish gesture strokes from non-gestures. Experiments with our techniques indicate that the sketch and speech modalities alone produce equivalent classification accuracy, but combining them produces higher accuracy.


Keywords: Multimodal Interfaces; Sketch Understanding; Speech Understanding; Classification
