In the past few years, a growing number of AI applications have been deployed on edge devices. However, models trained by data scientists with machine learning frameworks such as PyTorch or TensorFlow cannot be executed seamlessly on edge devices. In this paper, we develop an end-to-end code generator that translates a pre-trained model into C source libraries for the target backend using MicroTVM, a machine learning compiler framework extension that addresses inference on bare-metal devices. Our analysis shows that specific compute-intensive operators can be easily offloaded to a dedicated accelerator through the Universal Modular Accelerator (UMA) interface, while the remaining operators are processed on the CPU cores. Using the automatically generated ahead-of-time C runtime, we conduct a hand gesture recognition experiment on an ARM Cortex-M4F core.
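The split between accelerator and CPU described in the abstract can be illustrated with a minimal sketch. This is plain Python and does not use the TVM or UMA API; the operator names, the `ACCELERATOR_OPS` set, and the `partition` helper are all hypothetical and chosen only for illustration. It shows the general idea of routing supported operators to an accelerator and leaving the rest on the CPU.

```python
from dataclasses import dataclass

# Hypothetical set of operators the accelerator can execute.
# These names are assumptions, not taken from the paper.
ACCELERATOR_OPS = {"conv2d", "dense"}


@dataclass
class Assignment:
    op: str      # operator name
    target: str  # "accelerator" or "cpu"


def partition(ops, supported=ACCELERATOR_OPS):
    """Assign each operator to the accelerator if supported, else to the CPU."""
    return [
        Assignment(op, "accelerator" if op in supported else "cpu")
        for op in ops
    ]


if __name__ == "__main__":
    # Hypothetical operator sequence of a small gesture-recognition network.
    plan = partition(["conv2d", "relu", "max_pool2d", "dense", "softmax"])
    for a in plan:
        print(f"{a.op:12s} -> {a.target}")
```

In a real flow the compiler would make this decision on its graph representation by pattern matching, not on a flat list of names; the sketch only conveys the partitioning step.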