Deep models are susceptible to learning spurious correlations, even during post-processing. We take a closer look at knowledge distillation, a popular post-processing technique for model compression, and find that distilling with biased training data yields a biased student, even when the teacher is debiased. To address this issue, we propose a simple knowledge distillation algorithm, coined DeTT (Debiasing by Teacher Transplanting). Inspired by a recent observation that the last neural network layer plays an overwhelmingly important role in debiasing, DeTT directly transplants the teacher's last layer to the student. The remaining layers are distilled by matching the feature map outputs of the student and the teacher, with samples reweighted to mitigate the dataset bias. Importantly, DeTT does not rely on extensive annotations of the bias-related attribute, which are typically unavailable during the post-processing phase. In our experiments, DeTT successfully debiases the student model, consistently outperforming the baselines in terms of worst-group accuracy.
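The two ingredients named in the abstract, transplanting the teacher's last layer and matching features under a reweighted loss, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the dictionary layout of the last layer, the squared-L2 feature distance, and the toy sample weights are all assumptions made for illustration, and the abstract does not say how the weights are obtained.

```python
import copy


def transplant_last_layer(teacher_head):
    """Return an independent copy of the teacher's last layer for the student.

    The layer is assumed (hypothetically) to be a dict with a weight
    matrix "W" (list of rows) and a bias vector "b".
    """
    return copy.deepcopy(teacher_head)


def weighted_feature_loss(student_feats, teacher_feats, weights):
    """Weighted mean of squared L2 distances between per-sample features.

    Squared L2 is an assumed choice of distance; larger weights make a
    sample count more in the feature-matching objective.
    """
    total = 0.0
    for s, t, w in zip(student_feats, teacher_feats, weights):
        total += w * sum((si - ti) ** 2 for si, ti in zip(s, t))
    return total / sum(weights)


def apply_head(features, head):
    """Compute logits = W @ features + b for a single sample."""
    return [
        sum(wi * fi for wi, fi in zip(row, features)) + b
        for row, b in zip(head["W"], head["b"])
    ]


# Toy teacher head; the student reuses it unchanged.
teacher_head = {"W": [[1.0, -1.0], [-1.0, 1.0]], "b": [0.0, 0.0]}
student_head = transplant_last_layer(teacher_head)

# Penultimate-layer features for two samples (made-up numbers).
teacher_feats = [[1.0, 0.0], [0.0, 1.0]]
student_feats = [[1.0, 0.0], [0.5, 0.5]]

# Hypothetical weights: the second sample is upweighted, standing in for
# a sample that conflicts with the dataset bias.
weights = [1.0, 3.0]

loss = weighted_feature_loss(student_feats, teacher_feats, weights)
logits = apply_head(student_feats[0], student_head)
```

In this sketch only the student's feature extractor would be trained to reduce `loss`; the transplanted head stays fixed, so the student's predictions depend on how closely its features match the teacher's.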