An Empirical Analysis of Machine Translation for Expanding Multilingual Benchmarks

Open Access
Authors
Publication date 2025
Host editors
  • B. Haddow
  • T. Kocmi
  • P. Koehn
  • C. Monz
Book title Tenth Conference on Machine Translation : Proceedings of the Conference
Book subtitle WMT 2025 : November 8-9, 2025
ISBN (electronic)
  • 9798891763418
Event 10th Conference on Machine Translation, WMT 2025
Pages (from-to) 1-30
Publisher Kerrville, TX: Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
The rapid advancement of large language models (LLMs) has introduced new challenges in their evaluation, particularly in multilingual settings. The shortage of evaluation data is most pronounced for low-resource languages, where professional annotators are scarce, and this hinders fair progress across languages. In this work, we systematically investigate the viability of using machine translation (MT) as a proxy for evaluation in scenarios where human-annotated test sets are unavailable. Leveraging a state-of-the-art translation model, we translate datasets from four tasks into 198 languages and employ these translations to assess the quality and robustness of MT-based multilingual evaluation under different setups. We analyze task-specific error patterns, identifying when MT-based evaluation is reliable and when it produces misleading results. Our translated benchmark reveals that current language selections in multilingual datasets tend to overestimate LLM performance on low-resource languages. We conclude that although machine translation is not yet a fully reliable method for evaluating multilingual models, overlooking its potential means missing a valuable opportunity to track progress in non-English languages.
Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/2025.wmt-1.1
Downloads
2025.wmt-1.1 (Final published version)