Asking Multimodal Clarifying Questions in Mixed-Initiative Conversational Search

Y. Yuan; C. Siro; M. Aliannejadi; M. de Rijke; W. Lam

doi:https://doi.org/10.48550/arXiv.2402.07742

Authors	Y. Yuan; C. Siro; M. Aliannejadi; M. de Rijke; W. Lam
Publication date	2024
Book title	WWW '24
Book subtitle	Proceedings of the ACM Web Conference 2024 : May 13-17, 2024, Singapore, Singapore
ISBN (electronic)	9798400701719
Event	WWW '24: The ACM Web Conference 2024
Pages (from-to)	1474-1485
Number of pages	12
Publisher	New York, NY: The Association for Computing Machinery
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	In mixed-initiative conversational search systems, clarifying questions aid users who struggle to express their intentions in a single query. These questions aim to uncover users' information needs and resolve query ambiguities. We hypothesize that in scenarios where multimodal information is pertinent, the clarification process can be improved by using non-textual information. Therefore, we propose to add images to clarifying questions and formulate the novel task of asking multimodal clarifying questions in open-domain, mixed-initiative conversational search systems. To facilitate research into this task, we collect a dataset named Melon that contains over 4k multimodal clarifying questions, enriched with over 14k images. We also propose a multimodal query clarification model named Marto and adopt a prompt-based, generative fine-tuning strategy to perform the training of different stages with different prompts. Several analyses are conducted to understand the importance of multimodal content during the query clarification phase. Experimental results indicate that the addition of images leads to significant improvements of up to 90% in retrieval performance when selecting the relevant images. Extensive analyses are also performed to show the superiority of Marto compared with discriminative baselines.
Document type	Conference contribution
Note	With supplemental videos
Language	English
Published at	https://doi.org/10.48550/arXiv.2402.07742 (Accepted author manuscript) https://doi.org/10.1145/3589334.3645483 (Final published version)
Downloads	2402.07742v1 (Accepted author manuscript) 3589334.3645483 (Final published version)
Supplementary materials	Asking Multimodal Clarifying Questions in Mixed-Initiative Conversational Search rfp1033