Vision Transformer (ViT) models, which apply the transformer architecture to vision tasks, have proven highly competitive and have become a popular alternative to Convolutional Neural Networks (CNNs). However, their high computational requirements limit their practical applicability, especially on low-power devices. State-of-the-art approaches employ approximate multipliers to address the sharply increased compute demands of DNN accelerators, but no prior research has explored their use in ViT models. In this work we propose TransAxx, a framework based on the popular PyTorch library that provides fast, built-in support for approximate arithmetic, making it possible to seamlessly evaluate the impact of approximate computing on DNNs such as ViT models. Using TransAxx, we analyze the sensitivity of transformer models to approximate multiplications on the ImageNet dataset and perform approximation-aware finetuning to regain accuracy. Furthermore, we propose a methodology to generate approximate accelerators for ViT models. Our approach uses a Monte Carlo Tree Search (MCTS) algorithm to efficiently search the space of possible configurations, guided by a hardware-driven, hand-crafted policy. Our evaluation demonstrates that the methodology achieves favorable trade-offs between accuracy and power, yielding substantial gains without compromising performance.
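To make the notion of evaluating approximate multiplications concrete, the following minimal sketch emulates an approximate 8-bit multiplier in software through a lookup table. It is not the TransAxx implementation: the truncation-based multiplier and the names `truncated_mul`, `build_lut`, and `approx_dot` are hypothetical and chosen for illustration only. The lookup-table structure reflects a common way to emulate an arbitrary approximate multiplier, by precomputing every operand pair once and replacing each multiplication with a table read.

```python
# Illustrative only: a toy "approximate multiplier" that clears the k least
# significant bits of each signed 8-bit operand's magnitude before multiplying.
# This design is an assumption for the sketch, not a multiplier from the paper.

def truncated_mul(a, b, k=2):
    """Approximate product of two int8 values (k=0 gives the exact product)."""
    mask = ~((1 << k) - 1)                   # e.g. k=2 -> ...11111100
    sign = -1 if (a < 0) != (b < 0) else 1   # sign of the exact product
    return sign * ((abs(a) & mask) * (abs(b) & mask))


def build_lut(k=2):
    """Precompute all 256 x 256 int8 products; index with value + 128."""
    return [[truncated_mul(a, b, k) for b in range(-128, 128)]
            for a in range(-128, 128)]


def approx_dot(x, w, lut):
    """Dot product in which every multiplication is a table lookup."""
    return sum(lut[xi + 128][wi + 128] for xi, wi in zip(x, w))


def exact_dot(x, w):
    """Reference dot product with exact multiplications."""
    return sum(xi * wi for xi, wi in zip(x, w))


if __name__ == "__main__":
    lut = build_lut(k=2)
    x = [13, -7, 100, -128, 55]    # hypothetical quantized activations
    w = [-3, 22, 9, 17, -64]       # hypothetical quantized weights
    print("exact :", exact_dot(x, w))
    print("approx:", approx_dot(x, w, lut))
```

Comparing `exact_dot` with `approx_dot` over the operands of a layer gives a rough sense of how much error a given multiplier introduces, which is the kind of sensitivity the abstract refers to. A real framework would perform the lookups in batched tensor kernels rather than in Python loops.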