Looking at the Overlooked: An Analysis on the Word-Overlap Bias in Natural Language Inference

Open Access
Authors
Publication date 2022
Host editors
  • Y. Goldberg
  • Z. Kozareva
  • Y. Zhang
Book title Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Book subtitle December 7-11, 2022, Abu Dhabi, United Arab Emirates
Event The 2022 Conference on Empirical Methods in Natural Language Processing
Pages (from-to) 10605-10616
Number of pages 12
Publisher Association for Computational Linguistics
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
It has been shown that NLI models are usually biased with respect to the word overlap between the premise and the hypothesis, as they take this feature as a primary cue for predicting the entailment label. In this paper, we focus on an overlooked aspect of the overlap bias in NLI models: the reverse word-overlap bias. Our experimental results demonstrate that current NLI systems are also highly biased towards the non-entailment label on instances with low overlap, and that existing debiasing methods, which are reportedly successful on challenge datasets, are generally ineffective in addressing this category of bias. Through a set of analyses, we investigate the reasons for the emergence of the overlap bias and the role of minority examples in mitigating it. For the former, we find that the word-overlap bias does not stem from pre-training; for the latter, we observe that, in contrast to the accepted assumption, eliminating minority examples does not affect the generalizability of debiasing methods with respect to the overlap bias.
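The word-overlap feature discussed in the abstract can be illustrated with a minimal sketch: the fraction of premise tokens that also appear in the hypothesis. The tokenization below is naive whitespace splitting, chosen only for illustration; the paper's actual preprocessing may differ.

```python
def word_overlap(premise: str, hypothesis: str) -> float:
    """Fraction of premise tokens that also occur in the hypothesis."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(p & h) / len(p) if p else 0.0

# Full overlap with reversed roles: a biased model tends to predict
# "entailment" here even though the meaning changes.
high = word_overlap("the doctor saw the lawyer",
                    "the lawyer saw the doctor")  # 1.0

# Low overlap: the reverse bias described in the abstract pushes
# predictions toward "non-entailment", even for valid paraphrases.
low = word_overlap("the doctor saw the lawyer",
                   "a physician met an attorney")  # 0.0
```

Heuristics like this are exactly the shortcut that challenge sets such as HANS were designed to expose; the paper's contribution is examining the low-overlap (reverse) side of this bias.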
Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/2022.emnlp-main.725