Multimodal Temporal Fusion Transformers are Good Product Demand Forecasters

M. Sukel; S. Rudinac; M. Worring

doi:https://doi.org/10.1109/MMUL.2024.3373827

Authors	M. Sukel; S. Rudinac; M. Worring
Publication date	2024
Journal	IEEE Multimedia
Volume | Issue number	31 | 2
Pages (from-to)	48-60
Organisations	Faculty of Economics and Business (FEB) - Amsterdam Business School Research Institute (ABS-RI); Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	Multimodal demand forecasting aims at predicting product demand utilizing visual, textual, and contextual information. This article proposes a method for such forecasting using an integrated architecture composed of convolutional, graph-based, and transformer-based networks. Since traditional forecasting methods depend on historical demand and factors like manually generated categorical information, they face challenges such as the cold start problem and handling of category dynamics. To address these challenges, our architecture allows for incorporating multimodal information, such as geographical information, product images, and textual descriptions. Experiments with the multimodal approach are performed on a real-world dataset of more than 50 million data points of article demand. The pipeline presented in this work enhances the reliability of the predictions, demonstrating the potential of leveraging multimodal information in product demand forecasting.
Document type	Article
Language	English
Published at	https://doi.org/10.1109/MMUL.2024.3373827 (Final published version)