Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning

Open Access
Authors
Publication date 2020
Host editors
  • B. Webber
  • T. Cohn
  • Y. He
  • Y. Liu
Book title 2020 Conference on Empirical Methods in Natural Language Processing
Book subtitle EMNLP 2020: Proceedings of the Conference: November 16–20, 2020
ISBN (electronic)
  • 9781952148606
Event 2020 Conference on Empirical Methods in Natural Language Processing
Pages (from-to) 2186–2202
Publisher Stroudsburg, PA: The Association for Computational Linguistics
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Latent structure models are a powerful tool for modeling language data: they can mitigate the error propagation and annotation bottleneck in pipeline systems, while simultaneously uncovering linguistic insights about the data. One challenge with end-to-end training of these models is the argmax operation, which has a null gradient. In this paper, we focus on surrogate gradients, a popular strategy to deal with this problem. We explore latent structure learning through the lens of pulling back the downstream learning objective. In this paradigm, we discover a principled motivation for both the straight-through estimator (STE) and the recently proposed SPIGOT, a variant of STE for structured models. Our perspective leads to new algorithms in the same family. We empirically compare the known and the novel pulled-back estimators against the popular alternatives, yielding new insight for practitioners and revealing intriguing failure cases.
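
To make the null-gradient problem concrete, below is a minimal sketch (not the paper's implementation) of the straight-through estimator for a categorical argmax, written in PyTorch. The forward pass outputs the discrete one-hot argmax; the backward pass pretends argmax was the identity, so gradients reach the scores. The function name `ste_argmax` is illustrative, not from the paper.

```python
import torch


def ste_argmax(scores: torch.Tensor) -> torch.Tensor:
    """One-hot argmax with straight-through gradients.

    Forward:  one_hot(argmax(scores))
    Backward: d(output)/d(scores) is treated as the identity (STE surrogate).
    """
    hard = torch.nn.functional.one_hot(
        scores.argmax(dim=-1), num_classes=scores.size(-1)
    ).to(scores.dtype)
    # Value equals `hard`, but the gradient w.r.t. `scores` flows through
    # the `+ scores` term, since `scores.detach()` is cut from the graph.
    return hard + scores - scores.detach()


scores = torch.randn(4, 5, requires_grad=True)
z = ste_argmax(scores)              # discrete one-hot latent choices
loss = (z * torch.randn(4, 5)).sum()
loss.backward()                     # gradients reach `scores` despite argmax
print(scores.grad)
```

SPIGOT extends this idea to structured argmax, where the output is a combinatorial structure rather than a single category, and additionally projects the straight-through update back onto the feasible set; the sketch above covers only the unstructured categorical case.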
Document type Conference contribution
Language English
Published at https://aclanthology.org/2020.emnlp-main.171/