Context-Infused Visual Grounding for Art

Open Access
Publication date 2025
Host editors
  • Alessio Del Bue
  • Cristian Canton
  • Jordi Pont-Tuset
  • Tatiana Tommasi
Book title Computer Vision – ECCV 2024 Workshops
Book subtitle Milan, Italy, September 29–October 4, 2024 : proceedings
ISBN
  • 9783031915710
ISBN (electronic)
  • 9783031915727
Series Lecture Notes in Computer Science
Event Workshops that were held in conjunction with the 18th European Conference on Computer Vision, ECCV 2024
Volume | Issue number VI
Pages (from-to) 118-136
Publisher Cham: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Many artwork collections contain textual attributes that provide rich and contextualised descriptions of artworks. Visual grounding offers the potential for localising subjects within these descriptions on images; however, existing approaches are trained on natural images and generalise poorly to art. In this paper, we present CIGAr (Context-Infused GroundingDINO for Art), a visual grounding approach which utilises the artwork descriptions during training as context, thereby enabling visual grounding on art. In addition, we present a new dataset, Ukiyo-eVG, with manually annotated phrase-grounding annotations, and we set a new state-of-the-art for object detection on two artwork datasets.
Document type Conference contribution
Note With supplementary file
Language English
Published at https://doi.org/10.1007/978-3-031-91572-7_8
Other links https://www.scopus.com/pages/publications/105006904027