Comprehensible Video Thumbnails

Abstract

We present the Comprehensible Video Thumbnail, an automatically generated visual précis that summarizes the salient objects in a video clip and their dynamics. Salient moving objects are detected using a novel stochastic sampling technique that identifies, clusters and then tracks regions exhibiting affine motion coherence within the clip. Tracks are analyzed to determine salient instants at which motion and/or appearance changes significantly, and the resulting objects are arranged in a stylized composition optimized to reduce visual clutter and to enhance understanding of scene content through classification and depiction of motion type and trajectory. The result is an object-level visual gist of the clip, obtained with full automation and depicting content and motion with greater descriptive power than prior approaches. We demonstrate these benefits through a user study in which the comprehension of our video thumbnails is compared to the state of the art over a wide variety of sports footage.