VideoStory: A New Multimedia Embedding for Few-Example Recognition and Translation of Events

Authors
Publication date 2014
Book title MM '14: Proceedings of the 2014 ACM Conference on Multimedia, November 3-7, 2014, Orlando, Florida, USA
ISBN
  • 9781450330633
Event 22nd ACM International Conference on Multimedia
Pages (from-to) 17-26
Publisher New York: ACM
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
This paper proposes a new video representation for few-example event recognition and translation. Unlike existing representations, which rely on either low-level features or pre-specified attributes, we propose to learn an embedding from videos and their descriptions. In our embedding, which we call VideoStory, correlated term labels are combined if their combination improves the video classifier prediction. Our proposed algorithm prevents the combination of correlated terms that are visually dissimilar by optimizing a joint objective balancing descriptiveness and predictability. The algorithm learns from textual descriptions of video content, which we obtain for free from the web by a simple spidering procedure. We use our VideoStory representation for few-example recognition of events on more than 65K challenging web videos from the NIST TRECVID event detection task and the Columbia Consumer Video collection. Our experiments establish that i) VideoStory outperforms an embedding without the joint objective as well as alternatives without any embedding, ii) the varying quality of input video descriptions from the web is compensated by harvesting more data, and iii) VideoStory sets a new state-of-the-art for few-example event recognition, outperforming very recent attribute and low-level motion encodings. What is more, VideoStory translates a previously unseen video to its most likely description from visual content only.
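The record above does not include the algorithm itself, but the abstract describes a joint objective that balances descriptiveness (the embedding should reconstruct the description terms) and predictability (the embedding should be predictable from visual features). As a rough illustration only, the NumPy sketch below assumes a squared-loss formulation of these two parts, optimized by alternating least squares. All names (videostory_fit, S, A, W, lam, k) and the exact losses are assumptions for this sketch; the paper's actual objective and optimizer may differ.

import numpy as np

def videostory_fit(X, Y, k=100, lam=1e-3, n_iters=20, seed=0):
    """Alternating least-squares sketch of a VideoStory-style joint objective.

    X : (n, d) visual features for n training videos
    Y : (n, m) binary term occurrences from the videos' web descriptions
    Returns W (d, k), projecting features into the embedding, and
    A (k, m), decoding the embedding back into description terms.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = Y.shape[1]
    S = 0.01 * rng.standard_normal((n, k))  # latent "story" embedding per video
    A = 0.01 * rng.standard_normal((k, m))  # embedding -> terms (descriptiveness)
    W = np.zeros((d, k))                    # features -> embedding (predictability)
    I_k = np.eye(k)
    for _ in range(n_iters):
        # S-step: minimize ||Y - S A||^2 + ||S - X W||^2 over S (closed form)
        S = (Y @ A.T + X @ W) @ np.linalg.inv(A @ A.T + I_k)
        # A-step: ridge regression of the term matrix Y onto S
        A = np.linalg.solve(S.T @ S + lam * I_k, S.T @ Y)
        # W-step: ridge regression of the embedding S onto the features X
        W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ S)
    return W, A

Under the same assumptions, the two uses mentioned in the abstract would then look like this: few-example event recognition trains a classifier on x @ W instead of on raw features, and translation scores every description term for an unseen video from its visual content alone:

s = x @ W                              # x: (d,) features of a previously unseen video
term_scores = s @ A                    # score each of the m description terms
top_terms = np.argsort(term_scores)[::-1][:10]  # most likely description terms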
Document type Conference contribution
Language English
DOI https://doi.org/10.1145/2647868.2654913