ZS-NMT-Variations, EC40 Multilingual Machine Translation Dataset/Benchmark
| Creators | |
|---|---|
| Publication date | 2023 |
| Description | EC40 is a multilingual neural machine translation (MNMT) training dataset intended to support the study of MNMT and zero-shot NMT. It contains 66 million English-centric sentences covering 40 languages (excluding English) across 5 language families, sampled from the OPUS corpus. |
| Publisher | GitHub |
| Organisations | |
| Document type | Dataset |
| Related publication | Towards a Better Understanding of Variations in Zero-Shot Neural Machine Translation Performance |
| Other links | https://github.com/Smu-Tan/ZS-NMT-Variations.git |
| Permalink to this page | |