As in school, a single teacher covering all subjects cannot convey equally robust knowledge in each of them to a student. Hence, each subject is taught by a highly specialized teacher. Following a similar philosophy, we propose a framework that distills knowledge from multiple specialized teachers into a student network. In our approach, directed at face recognition use cases, we train four teachers, each on one specific ethnicity, leading to four highly specialized and biased teachers. Our strategy learns a projection of these four teachers into a common space and distills that information to a student network. Our results show increased performance and reduced bias in all our experiments. In addition, we show that having biased/specialized teachers is crucial: our approach achieves better results than distilling knowledge from four teachers trained on balanced datasets. Our approach represents a step toward understanding the importance of ethnicity-specific features.
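The data flow described above (four teacher embeddings, a projection into a common space, and a distillation target for the student) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the teacher embeddings are concatenated before projection, the projection is a fixed random matrix standing in for a learned one, the loss is a cosine distance, and the names `project_teachers` and `distillation_loss` are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit Euclidean length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def project_teachers(teacher_embeddings, projection):
    """Concatenate per-teacher embeddings and map them into a common space.

    teacher_embeddings: list of arrays, each of shape (batch, teacher_dim)
    projection: array of shape (num_teachers * teacher_dim, common_dim)
    Assumption: concatenation followed by a linear map; the paper may differ.
    """
    stacked = np.concatenate(teacher_embeddings, axis=1)
    return l2_normalize(stacked @ projection)

def distillation_loss(student_embeddings, target_embeddings):
    """Mean cosine distance between student and projected teacher embeddings.

    Assumption: target_embeddings are already unit-normalized.
    """
    student = l2_normalize(student_embeddings)
    cosine = np.sum(student * target_embeddings, axis=1)
    return float(np.mean(1.0 - cosine))

# Toy data standing in for real face embeddings (hypothetical sizes).
rng = np.random.default_rng(0)
batch, teacher_dim, common_dim = 8, 16, 16

# One embedding matrix per ethnicity-specific teacher.
teachers = [rng.normal(size=(batch, teacher_dim)) for _ in range(4)]

# Placeholder for the learned projection; random here, trained in practice.
projection = rng.normal(size=(4 * teacher_dim, common_dim))

target = project_teachers(teachers, projection)
student = rng.normal(size=(batch, common_dim))
loss = distillation_loss(student, target)
```

In a real training loop, the projection and the student would both be optimized, and `loss` would be minimized by gradient descent, typically alongside a recognition loss. The sketch only shows how the pieces fit together.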