CEQE to SQET: A study of contextualized embeddings for query expansion
| | |
|---|---|
| Authors | |
| Publication date | 06-2022 |
| Journal | Information Retrieval Journal |
| Event | European Conference on Information Retrieval (ECIR) 2021 |
| Volume | 25 |
| Issue number | 2 |
| Pages (from-to) | 184–208 |
| Organisations | |
| Abstract |
In this work, we study recent advances in context-sensitive language models for the task of query expansion. We study the behavior of existing and new approaches for lexical word-based expansion in both unsupervised and supervised contexts. For unsupervised models, we study the behavior of the Contextualized Embeddings for Query Expansion (CEQE) model. We introduce a new model, Supervised Contextualized Query Expansion with Transformers (SQET), that performs expansion as a supervised classification task and leverages context in pseudo-relevant results. We study the behavior of these expansion approaches for the tasks of ad-hoc document and passage retrieval. We conduct experiments combining expansion with probabilistic retrieval models as well as neural document ranking models. We evaluate expansion effectiveness on three standard TREC collections: Robust, Complex Answer Retrieval, and Deep Learning. We analyze extrinsic retrieval effectiveness and the intrinsic ability to rank expansion terms, and perform a qualitative analysis of the differences between the methods. We find that CEQE statistically significantly outperforms static embeddings across all three datasets on Recall@1000. Moreover, CEQE outperforms static embedding-based expansion methods on multiple collections (by up to 18% on Robust and 31% on Deep Learning in average precision) and also improves over proven probabilistic pseudo-relevance feedback (PRF) models. SQET outperforms CEQE by 6% in P@20 on the intrinsic term ranking evaluation and is approximately as effective in retrieval performance. Models incorporating neural and CEQE-based expansion scores achieve gains of up to 5% in P@20 and 2% in AP on Robust over the state-of-the-art transformer-based re-ranking model, Birch.
| Document type | Article |
| Note | In Special Issue on ECIR 2021. |
| Language | English |
| Published at | https://doi.org/10.1007/s10791-022-09405-y |
| Downloads | s10791-022-09405-y (Final published version) |