UvA-MT's Participation in the WMT25 General Translation Shared Task

Open Access
Authors
Publication date 2025
Host editors
  • Barry Haddow
  • Tom Kocmi
  • Philipp Koehn
  • Christof Monz
Book title Tenth Conference on Machine Translation: Proceedings of the Conference
Book subtitle WMT 2025: November 8-9, 2025
ISBN (electronic)
  • 9798891763418
Event 10th Conference on Machine Translation, WMT 2025
Pages (from-to) 688-694
Number of pages 7
Publisher Kerrville, TX: Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract

This paper presents UvA-MT's submission to the WMT 2025 shared task on general machine translation, competing in the unconstrained track across all 16 translation directions. Unusually, this year we use only WMT25's blind test set (source sentences only) to generate synthetic data for LLM training, and translations are produced with pure beam search for submission. Overall, our approach can be seen as a special variant of data distillation, motivated by two key considerations: (1) perfect domain alignment, since the training and test domains are distributionally identical; and (2) a strong teacher model, GPT-4o-mini, which offers high-quality outputs as both a reliable reference and a fallback in case of mere memorization. Interestingly, the outputs of the resulting model, obtained by training Gemma3-12B on Best-of-N (BoN) outputs from GPT-4o-mini, outperform both the original BoN outputs of GPT-4o-mini and those of Gemma3-12B in some high-resource languages across various metrics. We attribute this to a successful model ensemble, in which the student model (Gemma3-12B) retains the strengths of the teacher (GPT-4o-mini) while implicitly avoiding its flaws.
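The Best-of-N step mentioned in the abstract can be sketched as follows: sample N candidate translations from a teacher model, score each with a quality metric, and keep the highest-scoring candidate as the distillation target. This is a minimal illustration only; the sampling function, scoring metric, and candidate strings below are stand-ins, not the paper's actual teacher (GPT-4o-mini) or evaluation setup.

```python
import itertools


def best_of_n(source, sample_fn, score_fn, n=8):
    """Return the highest-scoring of n sampled candidate translations.

    sample_fn(source) -> one candidate string (e.g. one teacher sample);
    score_fn(source, candidate) -> a quality score (higher is better).
    """
    candidates = [sample_fn(source) for _ in range(n)]
    return max(candidates, key=lambda cand: score_fn(source, cand))


# Toy deterministic stand-ins for illustration (hypothetical, not the
# paper's models or metrics).
_toy_candidates = itertools.cycle(["hello world", "hi world", "hello, world!"])


def toy_sample(source):
    # Pretend each call draws a fresh sample from the teacher model.
    return next(_toy_candidates)


def toy_score(source, candidate):
    # Pretend quality metric: here, longer candidates simply score higher.
    return len(candidate)


best = best_of_n("Hallo Welt", toy_sample, toy_score, n=3)  # "hello, world!"
```

In the submission pipeline described above, the selected BoN output would then serve as the synthetic reference for fine-tuning the student model.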

Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/2025.wmt-1.45
Other links https://www.scopus.com/pages/publications/105028885833