We present a deep learning method for accurately localizing the center of a single corneal reflection (CR) in an eye image. Unlike previous approaches, we use a convolutional neural network (CNN) trained solely on simulated data. Training only on simulated data completely sidesteps the time-consuming manual annotation that is required for supervised training on real eye images. To systematically evaluate the accuracy of our method, we first tested it on images with simulated CRs placed on different backgrounds and embedded in varying levels of noise. Second, we tested the method on high-quality videos captured from real eyes. On real eye images, our method outperformed state-of-the-art algorithmic methods, with a 35% reduction in terms of spatial precision; on simulated images, it performed on par with the state of the art in terms of spatial accuracy. We conclude that our method localizes the CR center precisely and offers a solution to the data availability problem, which is one of the important and common roadblocks in the development of deep learning models for gaze estimation. Owing to its superior CR center localization and ease of application, our method has the potential to improve the accuracy and precision of CR-based eye trackers.
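The simulated-image evaluation described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the Gaussian blob standing in for a CR, the flat background, the additive Gaussian noise, the thresholded-centroid localizer (a simple algorithmic stand-in, not the CNN), and all function names and parameter values are assumptions made for illustration. Accuracy is taken here as the mean distance to the known center, and precision as the RMS spread of the estimates across noise realizations.

```python
import numpy as np

def simulate_cr_image(center, size=64, sigma=3.0, background=0.2,
                      noise_sd=0.02, seed=0):
    """Render a Gaussian blob (hypothetical stand-in for a CR) on a flat
    background and add Gaussian pixel noise. All parameters are assumptions."""
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[0:size, 0:size]
    cx, cy = center
    blob = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
    image = background + (1.0 - background) * blob
    image = image + rng.normal(0.0, noise_sd, image.shape)
    return np.clip(image, 0.0, 1.0)

def centroid_estimate(image, threshold=0.5):
    """Intensity-weighted centroid of the pixels above a threshold.
    A simple algorithmic baseline, not the CNN from the abstract."""
    weights = np.where(image > threshold, image - threshold, 0.0)
    total = weights.sum()
    if total == 0:
        raise ValueError("no pixels above threshold")
    ys, xs = np.mgrid[0:image.shape[0], 0:image.shape[1]]
    return (xs * weights).sum() / total, (ys * weights).sum() / total

def evaluate(center=(30.3, 34.7), n_images=50):
    """Return (accuracy, precision) in pixels over n_images noise realizations."""
    estimates = np.array([
        centroid_estimate(simulate_cr_image(center, seed=s))
        for s in range(n_images)
    ])
    errors = np.linalg.norm(estimates - np.asarray(center), axis=1)
    accuracy = errors.mean()  # mean distance to the known center
    spread = np.linalg.norm(estimates - estimates.mean(axis=0), axis=1)
    precision = np.sqrt((spread ** 2).mean())  # RMS spread of the estimates
    return accuracy, precision

if __name__ == "__main__":
    acc, prec = evaluate()
    print(f"accuracy: {acc:.3f} px, precision: {prec:.3f} px")
```

Because the true center is known for every simulated image, accuracy can be measured directly, which is not possible on real eye images without manual annotation.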