Improving Neural Response Diversity with Frequency-Aware Cross-Entropy Loss
| Authors | |
|---|---|
| Publication date | 2019 |
| Book title | The Web Conference 2019 |
| Book subtitle | proceedings of the World Wide Web Conference WWW 2019 : May 13-17, 2019, San Francisco, CA, USA |
| ISBN (electronic) | |
| Event | 2019 World Wide Web Conference, WWW 2019 |
| Pages (from-to) | 2879-2885 |
| Publisher | New York: Association for Computing Machinery |
| Organisations | |
| Abstract | Sequence-to-Sequence (Seq2Seq) models have achieved encouraging performance on the dialogue response generation task. However, existing Seq2Seq-based response generation methods suffer from a low-diversity problem: they frequently generate generic responses, which make the conversation less interesting. In this paper, we address the low-diversity problem by investigating its connection with model overconfidence reflected in predicted distributions. Specifically, we first analyze the influence of the commonly used Cross-Entropy (CE) loss function, and find that CE prefers high-frequency tokens, which results in low-diversity responses. We propose a Frequency-Aware Cross-Entropy (FACE) loss function that improves over the CE loss by incorporating a weighting mechanism conditioned on token frequency. Extensive experiments on benchmark datasets show that FACE is able to substantially improve the diversity of existing state-of-the-art Seq2Seq response generation methods, in terms of both automatic and human evaluations. |
| Document type | Conference contribution |
| Note | © 2019 International World Wide Web Conference Committee. |
| Language | English |
| Published at | https://doi.org/10.1145/3308558.3313415 |
| Downloads | p2879-jiang (Final published version) |
| Permalink to this page | |
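
The abstract describes FACE as a cross-entropy loss with a weighting mechanism conditioned on token frequency, but this record does not give the formula. The sketch below only illustrates the general idea of a frequency-weighted cross-entropy. The weighting scheme (linearly decreasing in relative frequency, normalised to mean 1) and the names `frequency_weights` and `weighted_cross_entropy` are assumptions made for illustration, not the paper's definition.

```python
import math
from collections import Counter


def frequency_weights(token_ids, vocab_size):
    """Turn corpus token counts into per-token loss weights.

    Hypothetical scheme: the raw weight falls linearly with relative
    frequency, then all weights are rescaled so their mean is 1.
    Assumes token_ids is non-empty.
    """
    counts = Counter(token_ids)
    total = len(token_ids)
    rel_freq = [counts.get(i, 0) / total for i in range(vocab_size)]
    max_rf = max(rel_freq)
    # The small 1/vocab_size offset keeps the most frequent token's weight above zero.
    raw = [1.0 - rf / max_rf + 1.0 / vocab_size for rf in rel_freq]
    mean_raw = sum(raw) / vocab_size
    return [w / mean_raw for w in raw]


def weighted_cross_entropy(probs, targets, weights):
    """Mean of -weights[y] * log p(y) over target positions.

    probs:   one predicted distribution (list of floats) per position
    targets: one ground-truth token id per position
    """
    losses = [-weights[y] * math.log(p[y]) for p, y in zip(probs, targets)]
    return sum(losses) / len(losses)


# Toy corpus: token 0 is frequent, token 2 is rare.
corpus = [0, 0, 0, 0, 1, 1, 2]
weights = frequency_weights(corpus, vocab_size=3)

# Two positions with the same predicted distribution, one frequent and one rare target.
probs = [[0.7, 0.2, 0.1], [0.7, 0.2, 0.1]]
targets = [0, 2]
plain_ce = weighted_cross_entropy(probs, targets, [1.0, 1.0, 1.0])
weighted_ce = weighted_cross_entropy(probs, targets, weights)
```

With uniform weights the function reduces to ordinary cross-entropy. With the frequency-derived weights, errors on the rare token contribute more to the loss than errors on the frequent one, which is the kind of pressure against high-frequency tokens the abstract refers to.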
