Qualcomm Research and University of Amsterdam at TRECVID 2015: Recognizing Concepts, Objects, and Events in Video
| Authors |
|
|---|---|
| Publication date | 2015 |
| Book title | 2015 TREC Video Retrieval Evaluation: notebook papers and slides |
| Event | TRECVID Workshop 2015 |
| Number of pages | 5 |
| Publisher | Gaithersburg, MD: National Institute of Standards and Technology |
| Organisations |
|
| Abstract |
In this paper we summarize our TRECVID 2015 video recognition experiments. We participated in three tasks: concept detection, object localization, and event recognition. Qualcomm Research focused on concept detection and object localization, and the University of Amsterdam focused on event detection. For concept detection, we start from the very deep networks that excelled in the ImageNet 2014 competition and redesign them for video recognition, emphasizing training data augmentation and video fine-tuning. Our entry in the localization task classifies a limited number of boxes in each frame using deep learning features; the boxes are proposed by an improved version of selective search. At the core of our multimedia event detection system is an Inception-style deep convolutional neural network trained on the full ImageNet hierarchy with 22k categories. We propose several operations that combine and generalize the ImageNet categories to form a desirable set of (super-)categories, while still being able to train a reliable model. Our participation in the 2015 edition of the TRECVID benchmark was fruitful, resulting in the best overall result for concept detection, object localization, and event detection.
|
| Document type | Conference contribution |
| Language | English |
| Published at | http://www-nlpir.nist.gov/projects/tvpubs/tv15.papers/mediamill.pdf |
| Downloads |
mediamill15
(Final published version)
|
