ARMAN: A Reconfigurable Monolithic 3D Accelerator Architecture for Convolutional Neural Networks

The Convolutional Neural Network (CNN) has emerged as a powerful and versatile tool for artificial intelligence (AI) applications. Conventional computing architectures face challenges in meeting the demanding processing requirements of compute-intensive CNN applications, as they suffer from limited throughput and low utilization. To this end, specialized accelerators have been developed to speed up CNN computations. However, as we demonstrate in this paper via extensive design space exploration, different neural network models have different characteristics, which calls for different accelerator architectures and configurations to match their computing demand. We show that a one-size-fits-all fixed architecture does not guarantee optimal power/energy/performance trade-off. To overcome this challenge, this paper proposes ARMAN, a novel reconfigurable systolic-array-based accelerator architecture based on Monolithic 3D (M3D) technology for CNN inference. The proposed accelerator offers the flexibility to reconfigure among different scale-up or scale-out arrangements depending on the neural network structure, providing the optimal trade-off across power, energy, and performance for various neural network models. We demonstrate the effectiveness of our approach through evaluations of multiple benchmarks. The results demonstrate that the proposed accelerator exhibits up to 2x, 2.24x, 1.48x, and 2x improvements in terms of execution cycles, power, energy, and EDP respectively, over the non-configurable architecture.

翻译：卷积神经网络（CNN）已成为人工智能（AI）应用中强大且通用的工具。传统计算架构在满足计算密集型CNN应用的高处理需求时面临挑战，因其存在吞吐量受限和利用率低的问题。为此，研究人员开发了专用加速器以加速CNN计算。然而，正如本文通过广泛的设计空间探索所证明的，不同神经网络模型具有不同特性，需要匹配不同加速器架构与配置以满足其计算需求。研究表明，单一固定架构无法保证最佳功耗/能效/性能权衡。为克服这一挑战，本文提出ARMAN——一种基于单片3D（M3D）技术的新型可重构脉动阵列加速器架构，专用于CNN推理。该加速器可根据神经网络结构灵活重构为不同缩放或扩展配置，从而为各类神经网络模型提供功耗、能效与性能之间的最优权衡。我们通过多个基准测试验证了该方法的有效性。结果表明，相较于非可配置架构，所提加速器在执行周期、功耗、能耗及能效积（EDP）方面分别实现了高达2倍、2.24倍、1.48倍和2倍的性能提升。

相关内容

Neural Networks

关注 1654

神经网络（Neural Networks）是世界上三个最古老的神经建模学会的档案期刊:国际神经网络学会(INNS)、欧洲神经网络学会(ENNS)和日本神经网络学会(JNNS)。神经网络提供了一个论坛，以发展和培育一个国际社会的学者和实践者感兴趣的所有方面的神经网络和相关方法的计算智能。神经网络欢迎高质量论文的提交，有助于全面的神经网络研究，从行为和大脑建模，学习算法，通过数学和计算分析，系统的工程和技术应用，大量使用神经网络的概念和技术。这一独特而广泛的范围促进了生物和技术研究之间的思想交流，并有助于促进对生物启发的计算智能感兴趣的跨学科社区的发展。因此，神经网络编委会代表的专家领域包括心理学，神经生物学，计算机科学，工程，数学，物理。该杂志发表文章、信件和评论以及给编辑的信件、社论、时事、软件调查和专利信息。文章发表在五个部分之一:认知科学，神经科学，学习系统，数学和计算分析、工程和应用。官网地址：http://dblp.uni-trier.de/db/journals/nn/

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日