In this paper, we present a survey of deep learning-based methods for regressing the gaze direction vector from head and eye images. We describe numerous published methods in detail, focusing on the input data, the model architecture, and the loss function used to supervise the model. Additionally, we list datasets that can be used to train and evaluate gaze direction regression methods. We also observe that results reported in the literature are often not comparable with one another because of differences in the validation or even test subsets used. To address this problem, we re-evaluated several methods on the commonly used in-the-wild Gaze360 dataset using the same validation setup. The experimental results show that the latest methods, although claiming state-of-the-art results, significantly underperform some older methods. Finally, we show that temporal models outperform static models under static test conditions.