The massive amounts of data generated by camera sensors motivate data processing inside pixel arrays, i.e., at the extreme edge. Several critical developments have fueled recent interest in the processing-in-pixel-in-memory (P2M) paradigm for a wide range of visual machine intelligence tasks, including (1) advances in 3D integration technology that enable complex processing inside each pixel while maintaining pixel density, (2) analog processing circuit techniques for massively parallel, low-energy in-pixel computations, and (3) algorithmic techniques that mitigate the non-idealities of analog processing through hardware-aware training schemes. This article presents a comprehensive technology-circuit-algorithm landscape that connects technology capabilities, circuit design strategies, and algorithmic optimizations to power, performance, area, bandwidth reduction, and application-level accuracy metrics. We present our results using a co-design framework that incorporates hardware and algorithmic optimizations for various complex real-life visual intelligence tasks mapped onto our P2M paradigm.