We present the submission of the ILLC at the University of Amsterdam to the BabyLM challenge (Warstadt et al., 2023), in the strict-small track. Our final model, ChapGTP, is a masked language model that was trained for 200 epochs, aided by a novel data augmentation technique called Automatic Task Formation. We discuss in detail the performance of this model on the three evaluation suites: BLiMP, (Super)GLUE, and MSGS. Furthermore, we present a wide range of methods that were ultimately not included in the model, but may serve as inspiration for training LMs in low-resource settings.