Qualcomm Research and University of Amsterdam at TRECVID 2015: Recognizing Concepts, Objects, and Events in Video

C.G.M. Snoek; S. Cappallo; D. Fontijne; D. Julian; D.C. Koelma; P. Mettes; K.E.A. van de Sande; A. Sarah; H. Stokman; R.B. Towal

Authors	C.G.M. Snoek; S. Cappallo; D. Fontijne; D. Julian; D.C. Koelma; P. Mettes; K.E.A. van de Sande; A. Sarah; H. Stokman; R.B. Towal
Publication date	2015
Book title	2015 TREC Video Retrieval Evaluation: notebook papers and slides
Event	TRECVID Workshop 2015
Number of pages	5
Publisher	Gaithersburg, MD: National Institute of Standards and Technology
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	In this paper we summarize our TRECVID 2015 video recognition experiments. We participated in three tasks: concept detection, object localization, and event recognition, where Qualcomm Research focused on concept detection and object localization and the University of Amsterdam focused on event detection. For concept detection we start from the very deep networks that excelled in the ImageNet 2014 competition and redesign them for the purpose of video recognition, emphasizing training data augmentation as well as video fine-tuning. Our entry in the localization task is based on classifying a limited number of boxes in each frame using deep learning features. The boxes are proposed by an improved version of selective search. At the core of our multimedia event detection system is an Inception-style deep convolutional neural network that is trained on the full ImageNet hierarchy with 22k categories. We propose several operations that combine and generalize the ImageNet categories to form a desirable set of (super-)categories, while still being able to train a reliable model. Our participation in the 2015 edition of the TRECVID benchmark was fruitful, resulting in the best overall result for concept detection, object localization, and event detection.
Document type	Conference contribution
Language	English
Published at	http://www-nlpir.nist.gov/projects/tvpubs/tv15.papers/mediamill.pdf (Final published version)
Downloads	mediamill15 (Final published version)