Machine learning algorithms typically assume that training and test samples are independent and identically distributed. Much work has shown that high-performing ML classifiers can degrade significantly and produce overconfident, wrong predictions, particularly on out-of-distribution (OOD) inputs. Conditional language models (CLMs) are predominantly trained to classify the next token in an output sequence, and may degrade even more on OOD inputs because prediction is performed autoregressively over many steps. Furthermore, the space of potential low-quality outputs is larger, since arbitrary text can be generated, so it is important to know when to trust the generated output. We present a highly accurate and lightweight OOD detection method for CLMs, and demonstrate its effectiveness on abstractive summarization and translation. We also show how our method can be used for selective generation (analogous to selective prediction for classification) under the common and realistic setting of distribution shift: the model emits high-quality outputs and automatically abstains from low-quality ones, enabling safer deployment of generative language models.
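To make the idea of selective generation concrete, the following is a minimal sketch, not the method presented in this work. It assumes that some per-input OOD score is already available, where a higher score means the input looks more out-of-distribution, and it abstains on a fixed fraction of the highest-scoring inputs. The names `selective_generate`, `generate`, `ood_score`, and `abstain_fraction` are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Optional, Sequence


def selective_generate(
    inputs: Sequence[str],
    generate: Callable[[str], str],      # hypothetical stand-in for a CLM
    ood_score: Callable[[str], float],   # hypothetical: higher = more OOD
    abstain_fraction: float,
) -> List[Optional[str]]:
    """Return one output per input, or None where the system abstains."""
    if not 0.0 <= abstain_fraction <= 1.0:
        raise ValueError("abstain_fraction must be in [0, 1]")

    scores = [ood_score(x) for x in inputs]
    # Number of inputs to abstain on (truncated toward zero).
    n_abstain = int(abstain_fraction * len(inputs))

    # Rank input indices from most to least OOD-looking.
    ranked = sorted(range(len(inputs)), key=lambda i: scores[i], reverse=True)
    abstained = set(ranked[:n_abstain])

    # Generate only for inputs that were not abstained on.
    return [None if i in abstained else generate(x) for i, x in enumerate(inputs)]


# Toy usage: input length stands in for an OOD score and upper-casing
# stands in for generation. Both are assumptions for this demo only.
demo = selective_generate(
    ["a", "bbbb", "cc", "dddddddd"],
    generate=str.upper,
    ood_score=lambda s: float(len(s)),
    abstain_fraction=0.5,
)
```

Abstaining on a fraction rather than on a fixed score threshold keeps the sketch independent of the score's scale; a deployed system would more likely calibrate a threshold on held-out data.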