Neural image classifiers often learn to make predictions by relying too heavily on non-predictive features that are spuriously correlated with the class labels in the training data. This leads to poor performance in atypical real-world scenarios where such features are absent. Supplementing the training dataset with images that lack these spurious features can improve generalization and thus robustness to spurious correlations. This paper presents ASPIRE (Language-guided data Augmentation for SPurIous correlation REmoval), a simple yet effective solution for expanding the training dataset with synthetic images without spurious features. Guided by language, ASPIRE generates these images without requiring any additional supervision or existing examples. Specifically, we first employ LLMs to extract foreground and background features from textual descriptions of an image, and then use language-guided image editing to discover which features are spuriously correlated with the class label. Finally, we personalize a text-to-image generation model to generate diverse in-domain images without spurious features. We demonstrate the effectiveness of ASPIRE on 4 datasets, including the very challenging Hard ImageNet dataset, and with 9 baselines, and show that ASPIRE improves the classification accuracy of prior methods by 1% to 38%. Code will be released at: https://github.com/Sreyan88/ASPIRE.