In recent years, the field of Deep Learning has seen many disruptive and impactful advancements. As deep neural networks grow in complexity, efficient hardware accelerators have become increasingly necessary for the design of heterogeneous HPC platforms. Designing Deep Learning accelerators requires a multidisciplinary approach that combines expertise from several areas, ranging from computer architecture to approximate computing, computational models, and machine learning algorithms. Several methodologies and tools have been proposed for designing such accelerators, including hardware-software co-design approaches, high-level synthesis methods, customized compilers, and methodologies for design space exploration, modeling, and simulation. These methodologies aim to maximize exploitable parallelism and minimize data movement in order to achieve high performance and energy efficiency. This survey provides a holistic review of the most influential design methodologies and EDA tools proposed in recent years for implementing Deep Learning accelerators, offering the reader a broad perspective on this rapidly evolving field. In particular, this work complements the authors' previous survey [203], which focuses on Deep Learning hardware accelerators for heterogeneous HPC platforms.