Conversations Powered by Cross-Lingual Knowledge
| Authors | |
|---|---|
| Publication date | 2021 |
| Book title | SIGIR '21 |
| Book subtitle | proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval : July 11-15, 2021, virtual event, Canada |
| ISBN (electronic) | |
| Event | 44th International ACM SIGIR Conference on Research and Development in Information Retrieval |
| Pages (from-to) | 1442-1451 |
| Publisher | New York, NY: Association for Computing Machinery |
| Organisations | |
| Abstract | Today's open-domain conversational agents increase the informativeness of generated responses by leveraging external knowledge. Most existing approaches work only in scenarios with massive monolingual knowledge sources; for languages where knowledge sources are scarce, using same-language knowledge to generate informative responses is ineffective. To address this problem, we propose the task of cross-lingual knowledge-grounded conversation (CKGC), which leverages large-scale knowledge sources in another language to generate informative responses. The task poses two main challenges: (1) knowledge selection and response generation in a cross-lingual setting; and (2) the lack of a test dataset for evaluation. To tackle the first challenge, we propose a curriculum self-knowledge distillation (CSKD) scheme, which uses a large-scale dialogue corpus in an auxiliary language to improve cross-lingual knowledge selection and knowledge expression in the target language via knowledge distillation. To tackle the second challenge, we collect a cross-lingual knowledge-grounded conversation test dataset to facilitate future research. Extensive experiments on the newly created dataset verify the effectiveness of the proposed CSKD method, and our unsupervised method significantly outperforms state-of-the-art baselines in cross-lingual knowledge selection. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.1145/3404835.3462883 |
| Permalink to this page | |
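The abstract describes transferring knowledge-selection ability from an auxiliary language to the target language via knowledge distillation. As a minimal sketch of the generic mechanism (not the paper's actual model): a temperature-scaled KL divergence pulls a student's knowledge-selection distribution toward a teacher's. The function names and the idea of per-candidate logits here are illustrative assumptions, not details from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between knowledge-selection distributions.

    Hypothetical framing: the teacher scores knowledge candidates using
    the auxiliary-language corpus, and the student is trained to match
    those scores in the target language. A higher temperature softens
    both distributions so the student also learns relative preferences
    among non-top candidates.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Matching logits give (near-)zero loss; disagreement gives positive loss.
same = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
diff = distillation_loss([3.0, 1.0, 0.0], [0.0, 1.0, 3.0])
```

In an actual training loop this term would be added to the usual generation loss, weighted by a hyperparameter, with the teacher's parameters frozen.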
