ChapGTP, ILLC’s Attempt at Raising a BabyLM: Improving Data Efficiency by Automatic Task Formation
| Authors | |
|---|---|
| Publication date | 2023 |
| Host editors | |
| Book title | Findings of the BabyLM Challenge: Sample-efficient pretraining on developmentally plausible corpora |
| ISBN (electronic) | |
| Event | BabyLM Challenge at the 27th Conference on Computational Natural Language Learning |
| Pages (from-to) | 74-85 |
| Publisher | Stroudsburg, PA: Association for Computational Linguistics |
| Organisations | |
| Abstract | We present the submission of the ILLC at the University of Amsterdam to the BabyLM challenge (Warstadt et al., 2023), in the strict-small track. Our final model, ChapGTP, is a masked language model that was trained for 200 epochs, aided by a novel data augmentation technique called Automatic Task Formation. We discuss in detail the performance of this model on the three evaluation suites: BLiMP, (Super)GLUE, and MSGS. Furthermore, we present a wide range of methods that were ultimately not included in the model, but may serve as inspiration for training LMs in low-resource settings. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.18653/v1/2023.conll-babylm.6 |
| Downloads | 2023.conll-babylm.6 (Final published version) |
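The abstract notes that ChapGTP is a masked language model trained for 200 epochs. For readers unfamiliar with that setup, the sketch below shows generic masked-LM pretraining with Hugging Face Transformers. It is not the authors' pipeline: the tokenizer, model size, and stand-in corpus (`wikitext`) are illustrative assumptions, and the paper's Automatic Task Formation augmentation is not included.

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    RobertaConfig,
    RobertaForMaskedLM,
    Trainer,
    TrainingArguments,
)

# Tokenizer reused from an existing model purely for convenience;
# low-resource submissions often train their own tokenizers instead.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")

# "wikitext" is a stand-in corpus; the actual BabyLM strict-small data
# must be obtained from the challenge organisers.
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

dataset = raw.map(tokenize, batched=True, remove_columns=["text"])

# A small randomly initialised encoder; the size is an illustrative choice.
config = RobertaConfig(
    vocab_size=tokenizer.vocab_size,
    num_hidden_layers=6,
    hidden_size=256,
    num_attention_heads=4,
    intermediate_size=1024,
)
model = RobertaForMaskedLM(config)

# Standard MLM objective: 15% of input tokens are randomly masked.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="mlm-sketch",
        num_train_epochs=200,  # the record reports 200 epochs of training
        per_device_train_batch_size=32,
    ),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```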
