Ternary neural networks (TNNs) offer a superior accuracy-energy trade-off compared to binary neural networks. However, until now, they have required specialized accelerators to realize their efficiency potential, which has hindered widespread adoption. To address this, we present xTern, a lightweight extension of the RISC-V instruction set architecture (ISA) targeted at accelerating TNN inference on general-purpose cores. To complement the ISA extension, we developed a set of optimized kernels leveraging xTern, achieving 67% higher throughput than their 2-bit equivalents. Power consumption increases only marginally, by 5.2%, resulting in an energy efficiency improvement of 57.1%. We demonstrate that the proposed xTern extension, integrated into an octa-core compute cluster, incurs a minimal silicon area overhead of 0.9% with no impact on timing. In end-to-end benchmarks, we show that xTern enables the deployment of TNNs achieving up to 1.6 percentage points higher CIFAR-10 classification accuracy than 2-bit networks at equal inference latency. Our results demonstrate that xTern enables RISC-V-based ultra-low-power edge AI platforms to benefit from the efficiency potential of TNNs.