AAAI Publications, Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Language Models for Semantic Extraction and Filtering in Video Action Recognition
Evelyne Tzoukermann, Jan Neumann, Jana Kosecka, Cornelia Fermuller, Ian Perera, Frank Ferraro, Ben Sapp, Rizwan Chaudhry, Gautam Singh

Last modified: 2011-08-24


This paper addresses two questions: (a) how can semantic information from natural language be represented so that a vision model can use it, and (b) how can the salient textual information relevant to vision be extracted? For a given domain, we present a new model of semantic extraction that takes into account both word relatedness and word disambiguation so that its output can be applied to a vision model. We automatically process text transcripts and perform syntactic analysis to extract dependency relations. We then perform semantic extraction on the parser output to filter for semantic entities related to actions. The resulting data are used to populate a co-occurrence matrix that is used by the vision processing modules. Results show that explicitly modeling the co-occurrence of actions and tools significantly improves performance.
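
The last two steps of the pipeline described above, filtering dependency relations for action-related entities and populating a co-occurrence matrix, can be illustrated with a minimal sketch. This is not the authors' implementation: the dependency triples, the relation labels, the action and tool vocabularies, and the function names `build_cooccurrence` and `row_normalize` are all hypothetical, and the parser and word disambiguation stages are assumed to have already run.

```python
from collections import Counter

# Hypothetical dependency triples (head verb, relation, dependent noun), as a
# parser might emit them for a transcript. Invented for illustration only.
TRIPLES = [
    ("cut", "dobj", "onion"),
    ("cut", "prep_with", "knife"),
    ("stir", "prep_with", "spoon"),
    ("stir", "dobj", "soup"),
    ("cut", "prep_with", "knife"),
    ("pour", "prep_from", "bowl"),
    ("say", "dobj", "thing"),
]

# Assumed domain vocabularies; the abstract does not list the real ones.
ACTIONS = ["cut", "stir", "pour"]
TOOLS = ["knife", "spoon", "bowl"]


def build_cooccurrence(triples, actions, tools):
    """Count (action, tool) pairs; rows follow `actions`, columns follow `tools`."""
    counts = Counter(
        (head, dep)
        for head, _rel, dep in triples
        # Semantic filter: keep only relations linking a known action to a known tool.
        if head in actions and dep in tools
    )
    return [[counts[(a, t)] for t in tools] for a in actions]


def row_normalize(matrix):
    """Turn each row of counts into relative frequencies; all-zero rows stay zero."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


MATRIX = build_cooccurrence(TRIPLES, ACTIONS, TOOLS)
```

Row-normalizing the counts gives, for each action, a relative frequency over tools that a vision module could consult when scoring candidate actions. How the vision processing modules actually consume the matrix is not specified in the abstract.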
