Despite the recent success of deep neural networks, effective methods for improving the domain generalization of vision transformers are still needed. In this paper, we propose a novel domain generalization technique called Robust Representation Learning with Self-Distillation (RRLD), comprising i) intermediate-block self-distillation and ii) augmentation-guided self-distillation, to improve the generalization capabilities of transformer-based models on unseen domains. This approach enables the network to learn robust, general features that are invariant to different augmentations and domain shifts while mitigating overfitting to the source domains. To evaluate the proposed method, we perform extensive experiments on the PACS and OfficeHome benchmark datasets, as well as on an industrial wafer semiconductor defect dataset. The results demonstrate that RRLD achieves robust and accurate generalization performance: across the three datasets, we observe an average accuracy improvement of 1.2% to 2.3% over the state of the art.
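To make the two self-distillation components concrete, the following is a minimal sketch of how such losses are commonly formed, not the exact RRLD formulation, which the abstract does not specify. It assumes that the final block's prediction acts as the teacher for intermediate blocks, that the prediction on the clean image acts as the teacher for the augmented view, and that both terms use a temperature-softened KL divergence. All function names, the temperature, and the weights `lambda_block` and `lambda_aug` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def intermediate_block_loss(block_logits, final_logits, temperature=3.0):
    """Assumed form: the final block is the teacher, and each intermediate
    block's classifier output is pulled toward it. Returns the mean KL."""
    teacher = softmax(final_logits, temperature)
    losses = [kl_divergence(teacher, softmax(b, temperature)) for b in block_logits]
    return sum(losses) / len(losses)


def augmentation_guided_loss(clean_logits, augmented_logits, temperature=3.0):
    """Assumed form: the clean-view prediction is the teacher for the
    augmented-view prediction, encouraging augmentation invariance."""
    teacher = softmax(clean_logits, temperature)
    student = softmax(augmented_logits, temperature)
    return kl_divergence(teacher, student)


def total_loss(ce_loss, block_logits, final_logits, augmented_logits,
               lambda_block=0.5, lambda_aug=0.5):
    """Hypothetical weighted sum of the supervised loss and both distillation terms."""
    return (ce_loss
            + lambda_block * intermediate_block_loss(block_logits, final_logits)
            + lambda_aug * augmentation_guided_loss(final_logits, augmented_logits))


if __name__ == "__main__":
    final = [2.0, 0.5, -1.0]                            # logits from the last block
    blocks = [[1.0, 0.8, -0.2], [1.8, 0.4, -0.9]]       # logits from two intermediate blocks
    augmented = [1.5, 0.9, -0.7]                        # logits for an augmented view
    print(total_loss(0.7, blocks, final, augmented))
```

Both distillation terms are zero when the student distribution matches the teacher, so under these assumptions they only penalize disagreement between blocks or between views, which is one way to encourage features that are stable under augmentation.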