Saccades are extremely rapid, simultaneous movements of both eyes, typically observed when an individual shifts focus from one object to another. They are among the fastest movements the human body produces, even exceeding the speed of blinks. The peak angular velocity of the eye during a saccade can reach up to 700°/s in humans, particularly for large saccades spanning a visual angle of 25°. Previous research has shown encouraging results in understanding neurological conditions through the study of saccades. A necessary step in saccade detection is accurately locating the pupil within the eye, from which additional information such as gaze angles can be inferred. Conventional frame-based cameras often lack the temporal precision needed to track such fast movements, suffering from motion blur and latency. Event cameras offer a promising alternative: they record changes in the visual scene asynchronously, with high temporal resolution and low latency. To bridge the gap between traditional computer vision and event-based vision, we represent events as frames that standard deep learning algorithms can readily consume. Our approach applies YOLOv8, a state-of-the-art object detector, to these frames for pupil tracking on the publicly available Ev-Eye dataset. Experimental results demonstrate the framework's effectiveness, highlighting its potential applications in neuroscience, ophthalmology, and human-computer interaction.
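The event-to-frame conversion described above can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: it assumes events are given as (x, y, t, polarity) tuples and uses a hypothetical signed-count encoding centered at gray level 128; the window length, resolution, and encoding used with Ev-Eye and YOLOv8 may differ.

```python
import numpy as np

def events_to_frame(events, width, height, t_start, t_end):
    """Accumulate events within the time window [t_start, t_end) into one frame.

    events: N x 4 float array with columns (x, y, t, polarity).
    Returns a uint8 image where 128 is neutral, values above 128 indicate a
    net positive polarity at that pixel, and values below 128 a net negative
    polarity (an illustrative encoding, not taken from the paper).
    """
    frame = np.zeros((height, width), dtype=np.int32)
    # Select only the events falling inside the requested time window.
    mask = (events[:, 2] >= t_start) & (events[:, 2] < t_end)
    for x, y, _, p in events[mask]:
        frame[int(y), int(x)] += 1 if p > 0 else -1
    # Map signed counts to an 8-bit image centered at 128.
    return np.clip(frame + 128, 0, 255).astype(np.uint8)

# Usage: three synthetic events binned into a 4x4 frame.
ev = np.array([[1.0, 2.0, 0.5,  1.0],
               [1.0, 2.0, 0.6,  1.0],
               [3.0, 0.0, 0.7, -1.0]])
frame = events_to_frame(ev, width=4, height=4, t_start=0.0, t_end=1.0)
```

A sequence of such frames could then be fed to a standard detector such as YOLOv8 exactly like ordinary camera images, which is the bridge between event data and frame-based deep learning that the abstract describes.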