The reliance of text classifiers on spurious correlations can lead to poor generalization at deployment, raising concerns about their use in safety-critical domains such as healthcare. In this work, we propose to use counterfactual data augmentation, guided by knowledge of the causal structure of the data, to simulate interventions on spurious features and to learn more robust text classifiers. We show that this strategy is appropriate in prediction problems where the label is spuriously correlated with an attribute. Under the assumptions of such problems, we discuss the favorable sample complexity of counterfactual data augmentation relative to importance re-weighting. Pragmatically, we match examples using auxiliary data, based on difference-in-differences methodology, and use a large language model (LLM) to represent a conditional probability of text. Through extensive experimentation on learning caregiver-invariant predictors of clinical diagnoses from medical narratives and on semi-synthetic data, we demonstrate that our method for simulating interventions improves out-of-distribution (OOD) accuracy compared to baseline invariant learning algorithms.