Open Gaze: An Open-Source Implementation Replicating Google's Eye Tracking Paper

Eye tracking has been a pivotal tool in diverse fields such as vision research, language analysis, and usability assessment. Most prior investigations, however, have concentrated on large desktop displays and relied on specialized, costly eye tracking hardware that does not scale. Remarkably little is known about eye movement patterns on smartphones, despite their widespread adoption and heavy use. In this manuscript, we present an open-source implementation of a smartphone-based gaze tracker that emulates the methodology proposed in a Google paper (whose source code remains proprietary). Our focus is on attaining accuracy comparable to that of the Google paper's methodology without any supplementary hardware. By integrating machine learning techniques, we provide an accurate eye tracking solution that is native to smartphones. Our approach demonstrates precision comparable to that of state-of-the-art mobile eye trackers, which cost two orders of magnitude more. Using the large MIT GazeCapture dataset, which is available through registration on the dataset's website, we replicate key findings from previous studies on eye movement behavior in oculomotor tasks and on saliency analyses during natural image viewing. Furthermore, we highlight the applicability of smartphone-based gaze tracking to detecting reading comprehension difficulties. Our findings show the potential to scale eye movement research substantially, to thousands of participants who take part with explicit consent. This scalability not only fosters advances in vision research but also extends its benefits to domains such as accessibility and healthcare.
