Convolutional neural networks (CNNs) have exhibited state-of-the-art performance in various audio classification tasks. However, their real-time deployment remains a challenge on resource-constrained devices such as embedded systems. In this paper, we analyze how the performance of large-scale pretrained audio neural networks designed for audio pattern recognition changes when they are deployed on hardware such as the Raspberry Pi. We empirically study the effect of CPU temperature, microphone quality, and audio signal volume on performance. Our experiments reveal that continuous CPU usage raises the temperature, which can trigger an automated slowdown mechanism in the Raspberry Pi and thereby increase inference latency. Microphone quality, particularly with affordable devices such as the Google AIY Voice Kit, and audio signal volume both affect system performance. In the course of our investigation, we encountered substantial complications linked to library compatibility and the unique processor architecture requirements of the Raspberry Pi, which make deployment less straightforward than on conventional computers (PCs). Although these observations highlight challenges, they pave the way for future researchers to develop more compact machine learning models, design heat-dissipative hardware, and select appropriate microphones when deploying AI models for real-time applications on edge devices. All related assets and an interactive demo can be found on GitHub.
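The link between CPU temperature and inference latency described above can be observed by logging both values on every inference pass. The following is a minimal sketch of such a logger, not the authors' measurement code. It assumes the temperature is exposed at the Linux sysfs path `/sys/class/thermal/thermal_zone0/temp` in millidegrees Celsius, which is typical on Raspberry Pi OS but should be verified on the target device. The names `infer` and `batch` are hypothetical placeholders for a model's forward pass and its input.

```python
import time
from pathlib import Path

# Assumption: this sysfs file exists and holds the CPU temperature in
# millidegrees Celsius (typical on Raspberry Pi OS; verify on your device).
THERMAL_PATH = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millidegrees(raw: str) -> float:
    """Convert a sysfs reading such as '48312\\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_cpu_temp_c(path: Path = THERMAL_PATH) -> float:
    """Read the current CPU temperature in degrees Celsius."""
    return parse_millidegrees(path.read_text())


def timed_call(fn, *args):
    """Return (result, elapsed seconds) for a single call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def log_latency_and_temp(infer, batch, n_runs, temp_reader=read_cpu_temp_c):
    """Call infer(batch) n_runs times, recording latency and temperature.

    `infer` and `batch` are hypothetical stand-ins for a model's forward
    pass and its input; `temp_reader` can be swapped out for testing.
    """
    records = []
    for i in range(n_runs):
        _, latency = timed_call(infer, batch)
        records.append(
            {"run": i, "latency_s": latency, "temp_c": temp_reader()}
        )
    return records
```

Plotting `latency_s` against `temp_c` over a long run is one way to see whether latency rises once the device gets hot enough to throttle. The temperature at which that happens depends on the board and its firmware settings, so no threshold is hard-coded here.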