Deploying complex Convolutional Neural Networks (CNNs) on FPGA-based accelerators is a promising way forward for safety-critical domains such as aeronautics. In previous work, we explored the Versatile Tensor Accelerator (VTA) and showed its suitability for avionic applications. To that end, we developed an initial stand-alone compiler designed with certification in mind. However, that compiler still had several limitations, which this paper overcomes. Our contributions consist of extending and fully automating the VTA compilation chain so that it can compile complete CNNs and support larger CNNs whose parameters do not fit in on-chip memory. We demonstrate the effectiveness of the approach through the successful compilation and simulated execution of a YOLO-NAS object detection model.