This paper presents J3DAI, a tiny deep neural network (DNN)-based hardware accelerator integrated into the artificial intelligence (AI) chip of a 3-layer 3D-stacked CMOS image sensor. The accelerator is designed to efficiently perform neural network tasks such as image classification and segmentation. We focus on the digital system of J3DAI, highlighting its Performance-Power-Area (PPA) characteristics and showcasing advanced edge AI capabilities on a CMOS image sensor. To support the hardware, we use Aidge, a comprehensive software framework that enables programming of both the host processor and the DNN accelerator. Aidge supports post-training quantization, which significantly reduces memory footprint and computational complexity and is therefore crucial for deploying models on resource-constrained hardware like J3DAI. Our experimental results demonstrate the versatility and efficiency of the design for edge AI, showing that it can handle both simple and computationally intensive tasks. Future work will focus on further optimizing the architecture and exploring new applications to fully leverage the capabilities of J3DAI. As edge AI continues to grow in importance, innovations like J3DAI will play a crucial role in enabling real-time, low-latency, and energy-efficient AI processing at the edge.
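To illustrate why post-training quantization shrinks the memory footprint, the following is a minimal sketch of generic symmetric per-tensor int8 quantization. It is not Aidge's API or the scheme used for J3DAI, neither of which is described here; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of float weights to the int8 range.

    Assumption: one scale for the whole tensor, derived from the largest
    absolute weight. Real frameworks may use per-channel scales or
    calibration data instead.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid division by zero for an all-zero (or empty) tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to [-127, 127].
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate float values."""
    return [v * scale for v in q]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Storage cost: 4 bytes per float32 weight versus 1 byte per int8 weight.
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)
```

Under these assumptions, each weight is stored in one byte instead of four, and the reconstruction error per weight is bounded by half the scale. Integer arithmetic on the quantized values is also what reduces computational cost on an accelerator.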