Eye tracking is becoming an increasingly important task domain in emerging computing platforms such as Augmented/Virtual Reality (AR/VR). Today's eye tracking systems suffer from long end-to-end tracking latency and can easily consume half of the power budget of a mobile VR device. Most existing optimization efforts focus exclusively on the computation pipeline, optimizing the algorithm and/or designing dedicated accelerators, while largely ignoring the front-end of any eye tracking pipeline: the image sensor. This paper makes a case for co-designing the imaging system with the computing system. In particular, we propose the notion of "in-sensor sparse sampling", whereby the pixels are drastically downsampled (by 20x) within the sensor. Such in-sensor sampling enhances the overall tracking efficiency by significantly reducing 1) the power consumption of the sensor readout chain and the sensor-host communication interfaces, two major power contributors, and 2) the work done on the host, which receives and operates on far fewer pixels. With careful reuse of existing pixel circuitry, our proposed BLISSCAM requires little hardware augmentation to support the in-sensor operations. Our synthesis results show up to 8.2x energy reduction and 1.4x latency reduction over existing eye tracking pipelines.
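To make the data-volume argument concrete, the following is a minimal Python sketch, not taken from the paper: the frame size, the uniform raster-order subsampling, and the function names are assumptions for illustration only. It shows how keeping roughly 1 out of every 20 pixels shrinks what the readout chain, the sensor-host link, and the host-side tracker must handle; the actual BLISSCAM mechanism is implemented in pixel circuitry and is not a simple fixed stride.

```python
import numpy as np

# Illustrative sketch only: a back-of-envelope model of the data-volume effect
# of 20x in-sensor downsampling. The frame size and uniform subsampling pattern
# are assumptions; the paper's in-sensor mechanism is realized in pixel circuitry.

FULL_RES = (400, 400)     # hypothetical eye-camera resolution (assumption)
DOWNSAMPLE_RATIO = 20     # keep ~1 of every 20 pixels, per the abstract

def sparse_sample(frame: np.ndarray, ratio: int = DOWNSAMPLE_RATIO) -> np.ndarray:
    """Keep every `ratio`-th pixel in raster order (a stand-in for in-sensor sampling)."""
    return frame.ravel()[::ratio]

if __name__ == "__main__":
    frame = np.random.randint(0, 256, FULL_RES, dtype=np.uint8)
    sampled = sparse_sample(frame)
    # Fewer pixels traverse the sensor readout chain and the sensor-host
    # interface, and the host-side tracker operates on far fewer inputs.
    print(f"{frame.size} -> {sampled.size} pixels "
          f"({frame.size / sampled.size:.1f}x fewer to read out and transmit)")
```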