Parameterizable machine learning (ML) accelerators are the product of recent breakthroughs in ML. To fully enable their design space exploration (DSE), we propose a physical-design-driven, learning-based prediction framework for hardware-accelerated deep neural network (DNN) and non-DNN ML algorithms. The framework adopts a unified approach that combines backend power, performance, and area (PPA) analysis with frontend performance simulation, thereby achieving a realistic estimation of both backend PPA and system metrics such as runtime and energy. In addition, it includes a fully automated DSE technique that optimizes backend and system metrics by searching over architectural and backend parameters. Experimental studies show that our approach consistently predicts backend PPA and system metrics with an average prediction error of 7% or less for the ASIC implementation of two deep learning accelerator platforms, VTA and VeriGOOD-ML, in both a commercial 12 nm process and a research-oriented 45 nm process.
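To make the predict-then-search idea concrete, the following is a minimal sketch under stated assumptions, not the framework's actual implementation. The abstract does not specify the prediction models or the search algorithm, so a k-nearest-neighbor predictor and an exhaustive search stand in for them here. The design space (`PE_COUNTS`, `BUFFER_KB`, `CLOCK_GHZ`), the `toy_flow` function that replaces a real physical-design and simulation run, and all formulas and numbers are hypothetical.

```python
import itertools
import math

# Hypothetical design space (all names and values are illustrative assumptions).
PE_COUNTS = [16, 64, 256]     # architectural: number of processing elements
BUFFER_KB = [32, 128, 512]    # architectural: on-chip buffer size
CLOCK_GHZ = [0.5, 1.0, 1.5]   # backend: target clock frequency
ALL_CONFIGS = list(itertools.product(PE_COUNTS, BUFFER_KB, CLOCK_GHZ))


def toy_flow(config):
    """Stand-in for one slow physical-design plus simulation run (made-up formulas)."""
    pe, buf_kb, clk_ghz = config
    power_mw = 0.5 * pe * clk_ghz ** 2 + 0.05 * buf_kb
    runtime_ms = 4096.0 / (pe * clk_ghz) + 64.0 / buf_kb
    return {"runtime_ms": runtime_ms, "energy_uj": power_mw * runtime_ms}


def knn_predict(train, query, k=2):
    """Predict metrics as the mean over the k nearest training configs (log-space distance)."""
    def dist(a, b):
        return sum((math.log(x) - math.log(y)) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return {name: sum(m[name] for _, m in nearest) / len(nearest)
            for name in ("runtime_ms", "energy_uj")}


def explore(train, runtime_budget_ms):
    """Search the whole space using predicted metrics only: minimize energy under a runtime cap."""
    best = None
    for config in ALL_CONFIGS:
        pred = knn_predict(train, config)
        if pred["runtime_ms"] > runtime_budget_ms:
            continue
        if best is None or pred["energy_uj"] < best[1]["energy_uj"]:
            best = (config, pred)
    return best


def mean_pct_error(train, held_out, metric):
    """Average absolute percentage error of the predictor on configs it has not seen."""
    errors = [abs(knn_predict(train, c)[metric] - toy_flow(c)[metric]) / toy_flow(c)[metric]
              for c in held_out]
    return 100.0 * sum(errors) / len(errors)


# "Run the flow" on every other config, then rely on predictions for the rest.
train = [(c, toy_flow(c)) for c in ALL_CONFIGS[::2]]
held_out = ALL_CONFIGS[1::2]
best_config, best_pred = explore(train, runtime_budget_ms=100.0)
print("chosen config:", best_config, "predicted:", best_pred)
print("held-out runtime error (%):", mean_pct_error(train, held_out, "runtime_ms"))
```

The sketch keeps the two roles separate: `knn_predict` replaces expensive flow runs with a learned estimate, and `explore` optimizes over those estimates. The error printed for this toy predictor says nothing about the accuracy reported above; it only shows how an average percentage error on held-out configurations could be computed.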