Learning to Perform Actions through Multimodal Interaction

Chen Yu, Xue Gu, and Dana H. Ballard

We present a multimodal learning system that is trained in an unsupervised mode in which users perform everyday tasks while providing natural language descriptions of their behaviors. In addition to recognizing hand motion types and associating them with action verbs, the system also learns how to regenerate actions based on inverse kinematics. Using a real-time speech dictation system, the virtual human can interact with users and perform actions according to spoken commands.
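To make the verb-to-action pipeline sketched in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation: a recognized action verb is looked up in a learned verb-to-target table, and the joint angles needed to regenerate the action are recovered with analytic inverse kinematics for a planar two-link arm. The link lengths, the `VERB_TARGETS` table, and the function names are assumptions introduced only for illustration.

```python
import math

LINK1, LINK2 = 0.30, 0.25  # assumed arm segment lengths (metres)

# Hypothetical association of action verbs with end-effector targets,
# standing in for the verb-action associations learned from demonstrations.
VERB_TARGETS = {
    "reach": (0.40, 0.20),
    "lift":  (0.25, 0.45),
    "lower": (0.30, -0.10),
}

def two_link_ik(x, y, l1=LINK1, l2=LINK2):
    """Analytic inverse kinematics for a planar two-link arm."""
    d = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(d) > 1.0:
        raise ValueError("target out of reach")
    elbow = math.atan2(math.sqrt(1.0 - d * d), d)
    shoulder = math.atan2(y, x) - math.atan2(l2 * math.sin(elbow),
                                             l1 * math.cos(elbow))
    return shoulder, elbow

def perform(verb):
    """Map a recognized action verb to joint angles for the virtual human."""
    target = VERB_TARGETS.get(verb)
    if target is None:
        return None  # unknown spoken command
    return two_link_ik(*target)

if __name__ == "__main__":
    for command in ("reach", "lift"):
        shoulder, elbow = perform(command)
        print(f"{command}: shoulder={math.degrees(shoulder):.1f} deg, "
              f"elbow={math.degrees(elbow):.1f} deg")
```

In the described system the targets would come from observed hand motions rather than a fixed table, and the kinematic chain of the virtual human has more degrees of freedom than this two-link example.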
