EyeCoD: Eye Tracking System Acceleration via FlatCam-based Algorithm & Accelerator Co-Design

Haoran You, Cheng Wan, Yang Zhao, Zhongzhi Yu, Yonggan Fu, Jiayi Yuan, Shang Wu, Shunyao Zhang, Yongan Zhang, Chaojian Li, Vivek Boominathan, Ashok Veeraraghavan, Ziyun Li, Yingyan Lin

From arXiv. Accepted by ISCA 2022; also selected as an IEEE Micro Top Pick of 2023.

Eye tracking has become an essential human-machine interaction modality for providing immersive experiences in numerous virtual and augmented reality (VR/AR) applications that demand high throughput (e.g., 240 FPS), a small form factor, and enhanced visual privacy. However, existing eye tracking systems are still limited by (1) a large form factor, largely due to the bulky lens-based cameras they adopt; and (2) the high communication cost between the camera and the backend processor, both of which prohibit more extensive application. To this end, we propose EyeCoD, a lensless FlatCam-based eye tracking algorithm and accelerator co-design framework, to enable eye tracking systems with a much reduced form factor and boosted system efficiency without sacrificing tracking accuracy, paving the way for next-generation eye tracking solutions. On the system level, we advocate the use of lensless FlatCams to meet the small form-factor need of mobile eye tracking systems. On the algorithm level, EyeCoD integrates a predict-then-focus pipeline that first predicts the region-of-interest (ROI) via segmentation and then focuses only on the ROI to estimate gaze directions, greatly reducing redundant computations and data movements. On the hardware level, we further develop a dedicated accelerator that (1) integrates a novel workload orchestration between the aforementioned segmentation and gaze estimation models, (2) leverages intra-channel reuse opportunities for depth-wise layers, and (3) utilizes input feature-wise partition to reduce the activation memory size. On-silicon measurement validates that EyeCoD consistently reduces both the communication and computation costs, leading to an overall system speedup of 10.95x, 3.21x, and 12.85x over CPUs, GPUs, and a prior-art eye tracking processor called CIS-GEP, respectively, while maintaining the tracking accuracy.
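
The predict-then-focus idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`roi_from_mask`, `predict_then_focus`), the `margin` parameter, and the toy threshold-based segmentation and centroid-based gaze estimator are all assumptions made for illustration, standing in for the paper's segmentation and gaze estimation models. The sketch only shows the control flow: segment the frame, derive an ROI from the mask, and run the gaze estimator on the cropped ROI alone.

```python
import numpy as np


def roi_from_mask(mask, margin=2):
    """Bounding box (top, bottom, left, right) of the nonzero mask pixels.

    The box is padded by `margin` pixels and clipped to the frame.
    Falls back to the whole frame when the mask is empty.
    """
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    height, width = mask.shape
    if rows.size == 0:
        return 0, height, 0, width
    top = max(int(rows[0]) - margin, 0)
    bottom = min(int(rows[-1]) + 1 + margin, height)
    left = max(int(cols[0]) - margin, 0)
    right = min(int(cols[-1]) + 1 + margin, width)
    return top, bottom, left, right


def predict_then_focus(frame, segment, estimate_gaze, margin=2):
    """Predict the ROI via segmentation, then estimate gaze on the ROI only."""
    mask = segment(frame)                      # stage 1: predict
    top, bottom, left, right = roi_from_mask(mask, margin)
    crop = frame[top:bottom, left:right]       # stage 2: focus
    return estimate_gaze(crop), (top, bottom, left, right)


def toy_segment(frame):
    """Hypothetical stand-in for a segmentation model: a fixed threshold."""
    return frame > 0.5


def toy_estimate_gaze(crop):
    """Hypothetical stand-in for a gaze model.

    Returns the intensity-weighted centroid of the crop, normalized to
    [-1, 1] along each axis, as a (vertical, horizontal) pair.
    """
    total = crop.sum()
    if total == 0:
        return np.zeros(2)
    ys, xs = np.indices(crop.shape)
    height, width = crop.shape
    cy = (ys * crop).sum() / total
    cx = (xs * crop).sum() / total
    return np.array([2 * cy / max(height - 1, 1) - 1,
                     2 * cx / max(width - 1, 1) - 1])


# Toy usage: a 64x64 frame with one bright 10x10 patch playing the eye region.
frame = np.zeros((64, 64))
frame[20:30, 40:50] = 1.0
gaze, roi = predict_then_focus(frame, toy_segment, toy_estimate_gaze)
top, bottom, left, right = roi
roi_fraction = (bottom - top) * (right - left) / frame.size
```

In this toy setup the second stage processes only the pixels inside the ROI (`roi_fraction` of the frame), which is the kind of redundancy the pipeline aims to remove; the actual savings reported in the paper depend on its models and hardware, not on this sketch.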
