Interactive Multimodal Learning on 100 Million Images
| Authors | |
|---|---|
| Publication date | 2016 |
| Book title | ICMR'16 |
| Book subtitle | Proceedings of the 2016 ACM International Conference on Multimedia Retrieval: June 6-9, 2016, New York, NY, USA |
| ISBN (electronic) | |
| Event | ACM International Conference on Multimedia Retrieval 2016 |
| Pages (from-to) | 333-337 |
| Publisher | New York, NY: Association for Computing Machinery |
| Organisations | |
| Abstract | This paper presents Blackthorn, an efficient interactive multimodal learning approach facilitating analysis of multimedia collections of 100 million items on a single high-end workstation. This is achieved by efficient data compression and optimizations to the interactive learning process. The compressed i-I64 data representation costs tens of bytes per item yet preserves most of the visual and textual semantic information. The optimized interactive learning model scores the i-I64-compressed data directly, greatly reducing the computational requirements. The experiments show that Blackthorn is up to 105x faster than the conventional relevance feedback baseline. Blackthorn is shown to vastly outperform the baseline with respect to recall over time. Blackthorn reaches up to 92% of the precision achieved by the baseline, validating the efficacy of the i-I64 representation. On the YFCC100M dataset, Blackthorn performs one complete interaction round in 0.7 seconds. Blackthorn thus opens multimedia collections comprising 100 million items to learning-based analysis in fully interactive time. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.1145/2911996.2912062 |
| Other links | https://ivi.fnwi.uva.nl/isis/publications/2016/ZahalkaICMR2016 |
| Downloads | p333-zahalka (Final published version) |
| Permalink to this page | |
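The abstract's core idea, updating a learned model from user feedback and scoring compact compressed item representations directly each round, can be illustrated with a minimal sketch. This is not the paper's actual i-I64 format or Blackthorn implementation; the quantized byte vectors, the Rocchio-style update, and all names and parameters below are hypothetical stand-ins, at a toy scale rather than 100 million items.

```python
import numpy as np

# Hypothetical illustration of a relevance-feedback round over compressed
# (here: byte-quantized) item vectors; NOT the paper's i-I64 method.
rng = np.random.default_rng(0)
N_ITEMS, DIM = 10_000, 64  # toy scale; Blackthorn targets 100M items

# "Compressed" collection: one byte per dimension, i.e. tens of bytes per item.
collection = rng.integers(0, 256, size=(N_ITEMS, DIM), dtype=np.uint8)

def interaction_round(weights, pos_idx, neg_idx, lr=0.01, k=10):
    """One round: update a linear model from feedback, rescore all items."""
    pos = collection[pos_idx].astype(np.float32).mean(axis=0)
    neg = collection[neg_idx].astype(np.float32).mean(axis=0)
    weights = weights + lr * (pos - neg)            # Rocchio-style update
    scores = collection.astype(np.float32) @ weights  # score compressed data
    return weights, np.argsort(-scores)[:k]          # top-k suggestions

w = np.zeros(DIM, dtype=np.float32)
w, top10 = interaction_round(w, pos_idx=[1, 2, 3], neg_idx=[4, 5, 6])
print(len(top10))  # 10 suggestions returned per round
```

The sketch keeps the full collection scan in the loop; the paper's contribution is making such per-round scoring cheap enough (via compression and an optimized learner) to stay interactive at the 100-million-item scale.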
