Discovering Underlying Similarities in Video

Arman M Garakani

The concept of interrogating similarities within a data set has a long history in fields ranging from medicinal chemistry to image analysis. We define a descriptor as an entropic measure of similarity between an image and the neighborhood of images surrounding it. Differential changes in descriptor values imply differential changes in the structure underlying the images. For example, at a zero crossing in the descriptor values, the corresponding image is a watershed image that sits between two dissimilar groups of images.

This paper describes a fast algorithm for image sequence clustering based on this concept. Developed initially for an adaptive system for the capture, analysis and storage of lengthy dynamic visual processes, the algorithm uncovers underlying spatio-temporal structures without a priori information or segmentation.

The algorithm estimates the average amount of information that each image in an ordered set conveys about the structure underlying that set. Such signatures enable the capture of relevant and salient time periods, directly reducing the cost of follow-up analysis and storage. As part of the video capture system, this characterization may provide predictive feedback to an adaptive capture sub-system controlling temporal sampling, frame rate and exposure. Details of the algorithm and examples of its application to the quantification of biological motion and to video identification and recognition are presented.
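
As a rough illustration of what a per-image entropic signature could look like, the following minimal sketch computes, for each frame, the normalized Shannon entropy of its similarities to neighboring frames. It is not the algorithm presented in the paper, whose details are not given here. The similarity measure (Pearson correlation mapped to [0, 1]), the fixed neighborhood `radius`, and the function names `correlation` and `entropic_signature` are all assumptions made for this example.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equal-length lists of pixel values."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a constant frame is treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def entropic_signature(frames, radius=2):
    """Return one normalized entropy value per frame (hypothetical descriptor).

    For frame i, similarities to frames within `radius` positions are
    normalized into a probability distribution, and its Shannon entropy is
    divided by the maximum possible entropy for that neighborhood size.
    """
    signature = []
    for i in range(len(frames)):
        lo = max(0, i - radius)
        hi = min(len(frames), i + radius + 1)
        # Map correlation from [-1, 1] to a similarity in [0, 1] (assumption).
        sims = [(1.0 + correlation(frames[i], frames[j])) / 2.0
                for j in range(lo, hi)]
        total = sum(sims)
        probs = [s / total for s in sims]
        entropy = -sum(p * math.log(p) for p in probs if p > 0)
        if len(sims) > 1:
            signature.append(entropy / math.log(len(sims)))
        else:
            signature.append(0.0)
    return signature


# Toy sequence: three frames of one pattern followed by three of its inverse.
group_a = [0.0, 1.0, 0.0, 1.0]
group_b = [1.0, 0.0, 1.0, 0.0]
frames = [group_a] * 3 + [group_b] * 3
signature = entropic_signature(frames, radius=2)
```

In this toy sequence, frames whose neighborhood is uniform receive the maximum value, while frames whose neighborhood straddles the boundary between the two groups receive lower values. This is the kind of change in a signature that could be used to flag a transition.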

Prior to the workshop, an efficient implementation will be posted as a web service that generates characterizations of unknown videos online.

Submitted: Sep 4, 2008