Multimodal Prediction and Classification of Audio-Visual Features

Vladimir Pavlovic and Thomas S. Huang

The surge of interest in multimedia and multimodal interfaces has created a need for novel estimation, prediction, and classification techniques for data from different but coupled modalities. Unimodal techniques ported to this domain have exhibited only limited success. We propose a new framework for feature estimation, prediction, and classification based on multimodal knowledge-constrained hidden Markov models (HMMs). The classical role of HMMs as statistical classifiers is enhanced by their new role as multimodal feature predictors. Moreover, by fusing the multimodal formulation with higher-level knowledge, we allow the influence of such knowledge to be reflected in feature prediction and tracking as well as in feature classification.

