Language models are often pre-trained on large unsupervised text corpora and then fine-tuned on additional task-specific data. However, typical fine-tuning schemes do not prioritize the examples they train on. We show that prioritizing informative training data yields better performance while using fewer labels. To do this, we augment a language model with an epinet: a small additional network that helps to estimate model uncertainty and forms an \textit{epistemic neural network} (ENN). ENNs are neural networks that can know what they don't know. Using an epinet to prioritize uncertain data, we can fine-tune BERT on GLUE tasks to the same performance with half as much data as training without prioritization. We also investigate performance in synthetic neural network generative models designed to build understanding. In each setting, using an epinet outperforms heuristic active learning schemes.
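The idea of prioritizing uncertain data can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy candidate pool, and the choice of priority score (variance of predicted class probabilities across sampled epistemic indices) are all assumptions made for illustration. The sketch assumes that the model can produce several predictions per input, one per sampled epistemic index, and that disagreement between those predictions signals uncertainty that more labels could resolve.

```python
from statistics import pvariance


def variance_priority(prob_samples):
    """Score one input by how much its predictions disagree.

    prob_samples: list of class-probability lists, one per sampled
    epistemic index (hypothetical model outputs for a single input).
    Returns the per-class variance across samples, summed over classes.
    """
    num_classes = len(prob_samples[0])
    return sum(
        pvariance([sample[c] for sample in prob_samples])
        for c in range(num_classes)
    )


def select_batch(candidates, k):
    """Return the ids of the k candidates with the highest priority score."""
    ranked = sorted(
        candidates,
        key=lambda ex_id: variance_priority(candidates[ex_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical unlabeled pool: three inputs, three index samples each.
pool = {
    "a": [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]],  # confident, samples agree
    "b": [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]],  # samples disagree
    "c": [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],  # ambiguous, samples agree
}

chosen = select_batch(pool, k=1)
print(chosen)
```

In this toy pool, only input "b" has a nonzero score, so it is the one selected for labeling. Input "c" is maximally ambiguous on average, yet its samples agree, so the variance score treats it as noise that additional labels would not reduce. A heuristic based on the entropy of the averaged prediction would instead rank "c" at the top.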