The availability of extensive datasets with per-subject gaze annotations has significantly improved gaze estimation accuracy. However, the discrepancy between domains severely degrades the performance of a model trained explicitly on a particular domain. In this paper, we propose the Causal Representation-Based Domain Generalization on Gaze Estimation (CauGE) framework, designed on the general principle of causal mechanisms, which is consistent with domain differences. We adopt adversarial training together with an additional penalty term to extract domain-invariant features. After extracting the features, we apply an attention layer to make them sufficient for inferring the actual gaze. With these modules, CauGE ensures that the neural network learns from representations that satisfy the general principles of causal mechanisms. As a result, CauGE generalizes across domains by extracting domain-invariant features, and spurious correlations cannot influence the model. Our method achieves state-of-the-art performance on the domain generalization benchmark for gaze estimation.
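The adversarial extraction of domain-invariant features mentioned above can be illustrated with a gradient-reversal step: a domain classifier learns to identify the source domain, while the feature extractor receives the reversed gradient and is pushed toward domain-indistinguishable features. The sketch below is a minimal, hypothetical toy (linear extractor, logistic domain head, made-up sizes and learning rate), not the CauGE implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

X = rng.normal(size=(64, 8))                    # toy inputs
d = rng.integers(0, 2, size=64).astype(float)   # toy domain labels (0/1)

W = rng.normal(scale=0.1, size=(8, 4))  # feature-extractor weights (assumed linear)
v = rng.normal(scale=0.1, size=4)       # domain-classifier weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def domain_loss(W, v):
    # Cross-entropy of the domain classifier on extracted features X @ W.
    p = sigmoid((X @ W) @ v)
    return -np.mean(d * np.log(p + 1e-9) + (1 - d) * np.log(1 - p + 1e-9))

# One adversarial step: the classifier head descends its loss, while the
# extractor receives the *reversed* gradient and ascends it, pushing the
# features toward domain indistinguishability.
z = X @ W
p = sigmoid(z @ v)
g = (p - d) / len(d)                       # dL/dlogit per sample
grad_v = z.T @ g                           # classifier gradient (normal descent)
grad_W = X.T @ (g[:, None] * v[None, :])   # extractor gradient before reversal

lr = 0.01
loss_before = domain_loss(W, v)
W = W + lr * grad_W        # reversed sign: gradient *ascent* on domain loss
loss_after = domain_loss(W, v)
v = v - lr * grad_v        # head still learns to discriminate domains
```

After the reversed update, the domain loss does not decrease: the extractor is actively discarding the information the domain classifier relies on, which is the intuition behind domain-invariant feature learning.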