Recent models for natural language understanding are inclined to exploit simple patterns in datasets, commonly known as shortcuts. These shortcuts hinge on spurious correlations between labels and latent features present in the training data. At inference time, shortcut-dependent models are likely to generate erroneous predictions under distribution shifts, particularly when some latent features are no longer correlated with the labels. To avoid this, previous studies have trained models to eliminate their reliance on shortcuts. In this study, we explore a different direction: pessimistically aggregating the predictions of a mixture-of-experts, assuming each expert captures relatively different latent features. The experimental results demonstrate that our post-hoc control over the experts significantly enhances the model's robustness to distribution shifts in shortcuts. In addition, we show that our approach offers practical advantages. We also analyze our model and provide results to support the assumption.
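To make the idea of pessimistic aggregation concrete, the sketch below shows one plausible instantiation: take the per-class minimum of the experts' predicted probabilities and renormalize, so a class receives a high score only when every expert assigns it a high score. The function name and the choice of minimum as the pessimistic operator are illustrative assumptions, not the paper's specified method.

```python
import numpy as np

def pessimistic_aggregate(expert_probs):
    """Pessimistically aggregate per-expert class probabilities.

    expert_probs: array of shape (n_experts, n_classes), each row a
    probability distribution from one expert.

    Illustrative choice (an assumption): the per-class minimum over
    experts, renormalized. A class scores high only if all experts
    agree on it, which discourages relying on a shortcut that only
    some experts have latched onto.
    """
    floor = expert_probs.min(axis=0)  # worst-case score per class
    return floor / floor.sum()        # renormalize to a distribution

# Three hypothetical experts over two classes: all favor class 0,
# but with different confidence, so class 0 still wins after
# pessimistic aggregation.
probs = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.6, 0.4]])
print(pessimistic_aggregate(probs))
```

Under this scheme, a single overconfident expert cannot dominate the aggregate prediction; the aggregate is bounded by the most skeptical expert for each class.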