Uncertainty in gaze estimation manifests in two forms: 1) low-quality images caused by occlusion, blur, inconsistent eye movements, or even non-face images; 2) incorrect labels resulting from misalignment between the annotated and actual gaze points during annotation. Allowing these uncertain samples to participate in training hinders gaze estimation performance. To tackle these challenges, in this paper we propose an effective solution, named Suppressing Uncertainty in Gaze Estimation (SUGE), which introduces a novel triplet-label consistency measurement to estimate and reduce the uncertainties. Specifically, for each training sample we estimate a novel ``neighboring label,'' computed as a linearly weighted projection from its neighbors, to capture the similarity relationship between image features and their corresponding labels; this label is combined with the predicted pseudo label and the ground-truth label for uncertainty estimation. By modeling such triplet-label consistency, we can measure the quality of both images and labels, and largely reduce the negative effects of low-quality images and wrong labels through our designed sample weighting and label correction strategies. Experimental results on gaze estimation benchmarks indicate that our proposed SUGE achieves state-of-the-art performance.
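The neighboring-label idea can be illustrated with a minimal sketch. The abstract does not specify the exact projection, so the helpers below (`neighboring_labels`, `triplet_inconsistency`, the `k`-nearest-neighbor choice, and the exponential similarity weights) are hypothetical assumptions, not the paper's actual formulation: each sample's neighboring label is a similarity-weighted average of the labels of its nearest neighbors in feature space, and pairwise disagreement among the three labels serves as an uncertainty score.

```python
import numpy as np

def neighboring_labels(features, labels, k=5):
    # Hypothetical sketch: for each sample, find its k nearest neighbors in
    # feature space and form a similarity-weighted average of their labels.
    n = features.shape[0]
    # Pairwise Euclidean distances between feature vectors.
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # exclude the sample itself
    out = np.empty_like(labels, dtype=float)
    for i in range(n):
        idx = np.argsort(d[i])[:k]
        w = np.exp(-d[i, idx])   # closer neighbors receive larger weights
        w /= w.sum()
        out[i] = w @ labels[idx]  # linearly weighted projection of neighbor labels
    return out

def triplet_inconsistency(neighbor_lbl, pseudo_lbl, gt_lbl):
    # Larger pairwise disagreement among the three labels -> higher
    # estimated uncertainty for that sample (assumed scoring rule).
    return (np.linalg.norm(neighbor_lbl - gt_lbl, axis=-1)
            + np.linalg.norm(pseudo_lbl - gt_lbl, axis=-1)
            + np.linalg.norm(neighbor_lbl - pseudo_lbl, axis=-1))
```

In a training loop, such scores could then drive the sample weighting (down-weight high-inconsistency samples) and label correction (replace suspect ground truth with a consensus of the other two labels) described above.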