Stochastic Beams and Where To Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement

Open Access
Authors
Publication date 2019
Journal Proceedings of Machine Learning Research
Event 36th International Conference on Machine Learning, ICML 2019
Volume | Issue number 97
Pages (from-to) 3499-3508
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
The well-known Gumbel-Max trick for sampling from a categorical distribution can be extended to sample k elements without replacement. We show how to implicitly apply this ’Gumbel-Top-k’ trick on a factorized distribution over sequences, allowing to draw exact samples without replacement using a Stochastic Beam Search. Even for exponentially large domains, the number of model evaluations grows only linear in k and the maximum sampled sequence length. The algorithm creates a theoretical connection between sampling and (deterministic) beam search and can be used as a principled intermediate alternative. In a translation task, the proposed method compares favourably against alternatives to obtain diverse yet good quality translations. We show that sequences sampled without replacement can be used to construct low-variance estimators for expected sentence-level BLEU score and model entropy.
Document type Article
Note 36th International Conference on Machine Learning (ICML 2019) : Long Beach, California, USA, 9-15 June 2019. . - With supplementary file. - In print proceedings pp. 6133-6145.
Language English
Published at http://proceedings.mlr.press/v97/kool19a.html
Other links https://github.com/wouterkool/stochastic-beam-search http://www.proceedings.com/48979.html
Downloads
kool19a (Final published version)
Supplementary materials
Permalink to this page
Back