Tackling Attribute Fine-grainedness in Cross-modal Fashion Search with Multi-level Features

Open Access
Authors
Publication date 2021
Book title Proceedings of the 2021 SIGIR Workshop on eCommerce (SIGIR eCom’21)
Book subtitle July 15, 2021, Virtual Event, Montreal, Canada
Event SIGIR 2021 Workshop on eCommerce
Article number workshop paper 3
Number of pages 8
Publisher New York, NY: ACM
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Leveraging information across modalities can support customers throughout their shopping journey, especially in the fashion domain, where the visual modality plays an important role. Fashion products exhibit a variety of visual attribute categories, such as shape, color, and pattern. Every category is fine-grained, i.e., attributes within a category may be visually very similar, e.g., v-neck vs. round-neck. This fine-grainedness of fashion attributes makes cross-modal fashion retrieval more challenging. In this paper, we address the problem of attribute fine-grainedness in cross-modal fashion retrieval by leveraging multi-level feature representations. In particular, we replace the commonly used spatial segmentation approach with a multi-level feature approach. We compare our approach with state-of-the-art models for both general and fashion cross-modal retrieval, and evaluate it on the Fashion200K and Fashion-Gen datasets. We record a 43.4% relative increase in text-to-image retrieval and a 57.8% relative increase in image-to-text retrieval on Fashion200K, and a 48.6% relative increase in text-to-image retrieval and a 67.2% relative increase in image-to-text retrieval on Fashion-Gen, while reducing the number of model parameters by 70% compared with the baselines.
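To illustrate the general idea of a multi-level feature approach described in the abstract, the sketch below pools feature maps from several depths of a CNN backbone and projects each into a shared embedding space, where images can be matched against text embeddings by cosine similarity. This is a minimal illustration, not the authors' implementation: the ResNet-18 backbone, the choice of levels, the average-pool fusion, and the 256-dimensional joint space are all assumptions made for the example.

```python
# Minimal sketch (not the paper's code): multi-level image features for
# cross-modal retrieval. Backbone, level choice, pooling, and the 256-d
# joint space are illustrative assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

class MultiLevelImageEncoder(nn.Module):
    def __init__(self, embed_dim=256):
        super().__init__()
        resnet = models.resnet18(weights=None)  # backbone choice is an assumption
        self.stem = nn.Sequential(resnet.conv1, resnet.bn1, resnet.relu,
                                  resnet.maxpool, resnet.layer1)
        self.level2 = resnet.layer2   # mid-level features (e.g., patterns)
        self.level3 = resnet.layer3
        self.level4 = resnet.layer4   # high-level features (e.g., shapes)
        self.pool = nn.AdaptiveAvgPool2d(1)
        # one linear projection per level into the shared joint space
        self.proj = nn.ModuleList([nn.Linear(c, embed_dim) for c in (128, 256, 512)])

    def forward(self, images):
        x = self.stem(images)
        feats = []
        for level, proj in zip((self.level2, self.level3, self.level4), self.proj):
            x = level(x)
            v = self.pool(x).flatten(1)  # global pooling per level, no spatial segmentation
            feats.append(proj(v))
        # fuse levels by averaging; concatenation or attention are alternatives
        z = torch.stack(feats).mean(0)
        return nn.functional.normalize(z, dim=-1)

# Usage: score a batch of images against (hypothetical) text embeddings.
encoder = MultiLevelImageEncoder()
img_emb = encoder(torch.randn(2, 3, 224, 224))            # batch of 2 images
txt_emb = nn.functional.normalize(torch.randn(2, 256), dim=-1)
scores = img_emb @ txt_emb.t()                            # retrieval score matrix
print(scores.shape)  # torch.Size([2, 2])
```

Pooling each level globally rather than segmenting the feature map spatially is what lets low-level cues (useful for fine-grained attributes like neckline shape) and high-level cues coexist in one embedding; the fusion strategy shown here is one possible choice.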
Document type Conference contribution
Language English
Published at https://sigir-ecom.github.io/ecom21Papers/paper16.pdf
Other links https://sigir-ecom.github.io/