Context-Infused Visual Grounding for Art

Selina Khan; Nanne van Noord

doi:https://doi.org/10.1007/978-3-031-91572-7_8

Authors	Selina Khan; Nanne van Noord
Publication date	2025
Host editors	Alessio Del Bue; Cristian Canton; Jordi Pont-Tuset; Tatiana Tommasi
Book title	Computer Vision – ECCV 2024 Workshops
Book subtitle	Milan, Italy, September 29–October 4, 2024 : proceedings
ISBN	9783031915710
ISBN (electronic)	9783031915727
Series	Lecture Notes in Computer Science
Event	Workshops that were held in conjunction with the 18th European Conference on Computer Vision, ECCV 2024
Volume | Issue number	VI
Pages (from-to)	118-136
Publisher	Cham: Springer
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	Many artwork collections contain textual attributes that provide rich and contextualised descriptions of artworks. Visual grounding offers the potential for localising subjects within these descriptions on images; however, existing approaches are trained on natural images and generalise poorly to art. In this paper, we present CIGAr (Context-Infused GroundingDINO for Art), a visual grounding approach which utilises the artwork descriptions during training as context, thereby enabling visual grounding on art. In addition, we present a new dataset, Ukiyo-eVG, with manually annotated phrase-grounding annotations, and we set a new state-of-the-art for object detection on two artwork datasets.
Document type	Conference contribution
Note	With supplementary file
Language	English
Published at	https://doi.org/10.1007/978-3-031-91572-7_8 (Final published version)
Other links	https://www.scopus.com/pages/publications/105006904027
Downloads	Context-Infused Visual Grounding for Art (Final published version)
Supplementary materials	632436_1_En_8_MOESM1_ESM