EULER-ADAS: Energy-Efficient & SIMD-Unified Logarithmic-Posit Engine for Precision-Reconfigurable Approximate ADAS Acceleration

Advanced driver-assistance systems (ADAS) require neural compute engines that deliver low-latency inference under strict power and area constraints. Posit arithmetic is attractive for such accelerators because it provides high numerical fidelity at low precision, but its variable-length regime encoding increases encode/decode cost and exposes the datapath to large regime-field fault effects. This paper presents EULER-ADAS, a SIMD-enabled logarithmic bounded-Posit neural compute engine for energy-efficient and reliability-aware ADAS acceleration. The proposed datapath combines bounded-regime Posit representation, stage-adaptive logarithmic mantissa multiplication with bit truncation, and a SIMD-shared quire accumulation path supporting Posit-(8,0), Posit-(16,1), and Posit-(32,2) execution. The unified architecture enables 4xPosit-8, 2xPosit-16, or 1xPosit-32 operation without duplicating precision-specific hardware. FPGA implementation shows that the proposed configurations reduce LUT count by up to 41.4%, delay by up to 76.1%, and power by up to 71.9% relative to exact Posit neural compute engines, while achieving up to 10x lower energy-delay product than radix-4 Booth-based Posit multipliers. In 28-nm CMOS, the bounded variants occupy 0.013-0.016 mm², consume 19.8-22.1 mW, and operate at up to 1.84 GHz. Application-level evaluation across image-classification, ADAS, and edge-inference workloads shows that the evaluated Posit-16 and Posit-32 configurations remain within about 1.5 percentage points of FP32 accuracy. A TinyYOLOv3 prototype on Pynq-Z2 achieves 78 ms latency at 0.29 W and 22.6 mJ/frame, demonstrating the suitability of EULER-ADAS for low-power real-time ADAS inference.
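
To make the variable-length regime field and the 4x8 / 2x16 / 1x32 SIMD modes concrete, the following is a minimal software sketch of standard Posit decoding applied to lanes of a 32-bit word. It is not the EULER-ADAS datapath: the function names `decode_posit` and `unpack_lanes`, the table `ES_FOR_WIDTH`, and the low-lane-first packing order are assumptions made for illustration, and the paper's bounded-regime limit and logarithmic multiplier are not modeled here.

```python
from fractions import Fraction

# Exponent-field width for each format named in the abstract: Posit-(n, es).
ES_FOR_WIDTH = {8: 0, 16: 1, 32: 2}


def decode_posit(bits: int, n: int, es: int) -> Fraction:
    """Decode an n-bit standard posit with es exponent bits into an exact Fraction."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        raise ValueError("NaR (not a real) has no numeric value")

    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask  # negative posits are stored in two's complement

    body = bits & ((1 << (n - 1)) - 1)  # the n-1 bits after the sign
    first = (body >> (n - 2)) & 1       # leading regime bit

    # Count the run of identical regime bits; its length varies per value,
    # which is the source of the encode/decode cost mentioned in the abstract.
    run = 0
    i = n - 2
    while i >= 0 and ((body >> i) & 1) == first:
        run += 1
        i -= 1
    k = run - 1 if first else -run

    # i now indexes the regime-terminating bit (or is -1 if the run hit the end).
    left = max(i, 0)                    # bits remaining for exponent + fraction
    tail = body & ((1 << left) - 1)

    if left >= es:
        frac_bits = left - es
        exp = tail >> frac_bits
        frac = tail & ((1 << frac_bits) - 1)
    else:
        # Exponent bits pushed off the end are treated as zero.
        exp = tail << (es - left)
        frac_bits = 0
        frac = 0

    scale = k * (1 << es) + exp
    value = (1 + Fraction(frac, 1 << frac_bits)) * Fraction(2) ** scale
    return -value if sign else value


def unpack_lanes(word: int, n: int) -> list[int]:
    """Split a 32-bit word into 32 // n lanes of n bits, lowest lane first (assumed order)."""
    return [(word >> (n * lane)) & ((1 << n) - 1) for lane in range(32 // n)]


if __name__ == "__main__":
    word = 0x4050607F
    for n, es in ES_FOR_WIDTH.items():
        lanes = unpack_lanes(word, n)
        print(n, [float(decode_posit(lane, n, es)) for lane in lanes])
```

The same 32-bit word is interpreted as four, two, or one operand depending on the selected width, which is the software analogue of sharing one datapath across precisions rather than duplicating precision-specific hardware.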

