Curriculum learning (CL) aims to increase the performance of a learner on a given task by applying a specialized learning strategy. This strategy focuses on either the dataset, the task, or the model. Little to no work has analysed how CL can be applied to model capacity in natural language processing. To close this gap, we propose the cup curriculum. In the first phase of training, we use a variation of iterative magnitude pruning to reduce model capacity. The pruned weights are reintroduced in the second phase, so that model capacity follows a cup-shaped curve over the training iterations. We empirically evaluate different strategies of the cup curriculum and show that it reliably outperforms early stopping while exhibiting high resilience to overfitting.
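The two-phase capacity schedule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `magnitude_prune` and `cup_schedule`, the pruning fraction of 0.2, the number of cycles, and the choice to reintroduce weights in the reverse order of their removal are all assumptions made for illustration. The training that would happen between capacity changes is omitted.

```python
import numpy as np


def magnitude_prune(weights, mask, fraction):
    """Return a copy of `mask` with the smallest-magnitude `fraction`
    of the currently active weights switched off (hypothetical helper)."""
    active = np.flatnonzero(mask)              # flat indices of unpruned weights
    n_prune = int(len(active) * fraction)
    new_mask = mask.copy()
    if n_prune == 0:
        return new_mask
    magnitudes = np.abs(weights.ravel()[active])
    drop = active[np.argsort(magnitudes)[:n_prune]]
    flat = new_mask.ravel()                    # view onto new_mask (contiguous copy)
    flat[drop] = False
    return new_mask


def cup_schedule(weights, prune_cycles=3, fraction=0.2):
    """Build the sequence of masks for a cup-shaped capacity curve.

    Phase 1 prunes iteratively by magnitude; phase 2 replays the masks in
    reverse, which is one assumed way of reintroducing the pruned weights.
    A real run would train the masked model between consecutive masks.
    """
    mask = np.ones(weights.shape, dtype=bool)
    history = [mask]
    for _ in range(prune_cycles):              # phase 1: capacity goes down
        mask = magnitude_prune(weights, mask, fraction)
        history.append(mask)
    return history + history[-2::-1]           # phase 2: capacity goes back up


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(10, 10))
    capacities = [int(m.sum()) for m in cup_schedule(w)]
    print(capacities)
```

Plotting `capacities` against the step index shows the number of active weights falling during the pruning phase and rising back to the full count during reintroduction, which is the cup shape the name refers to.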