Answering Questions about Moving Objects in Surveillance Videos

Boris Katz, Jimmy Lin, Chris Stauffer, and Eric Grimson

Current question answering systems perform well on many kinds of questions about textual documents. However, information also exists in other media, which presents both opportunities and challenges for question answering. We present results on extending question answering capabilities to video footage captured in a surveillance setting. Our prototype system, called Spot, can answer questions about moving objects that appear within the video. We situate this novel application of vision and language technology within a larger framework designed to integrate language and vision systems under a common representation. We believe that our framework will support the next generation of multimodal natural language information access systems.

