As the Internet of Things expands, embedding Artificial Intelligence algorithms in resource-constrained devices has become increasingly important to enable real-time, autonomous decision-making without relying on centralized cloud servers. However, implementing and executing complex algorithms on embedded devices poses significant challenges due to limited computational power, memory, and energy resources. This paper presents algorithmic and hardware techniques to efficiently implement two LinearUCB contextual bandit algorithms on resource-constrained embedded devices. Algorithmic modifications based on the Sherman-Morrison-Woodbury formula reduce model complexity, while vector acceleration is harnessed to speed up matrix operations. We analyze the impact of each optimization individually and then combine them in a two-pronged strategy. The results show notable improvements in execution time and energy consumption, demonstrating the effectiveness of combining algorithmic and hardware optimizations to enhance learning models for edge computing environments with low-power and real-time requirements.
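To make the algorithmic side concrete: in LinearUCB, each arm maintains a design matrix that grows by a rank-1 term per round, and the Sherman-Morrison identity (the rank-1 special case of Sherman-Morrison-Woodbury) lets the inverse be updated incrementally instead of recomputed. The sketch below is illustrative, not the paper's implementation; all variable names and the single-round structure are our own assumptions.

```python
import numpy as np

def sherman_morrison_update(A_inv, x):
    """Rank-1 update of A^{-1} after A <- A + x x^T (Sherman-Morrison).

    Replaces an O(d^3) re-inversion with an O(d^2) update -- the kind
    of complexity reduction relevant to resource-constrained devices.
    """
    Ax = A_inv @ x                       # A^{-1} x, O(d^2)
    denom = 1.0 + x @ Ax                 # scalar 1 + x^T A^{-1} x
    return A_inv - np.outer(Ax, Ax) / denom

# One illustrative LinUCB-style round for a single arm (hypothetical setup):
d = 4
A_inv = np.eye(d)                        # A starts as the identity
b = np.zeros(d)
x = np.random.default_rng(0).standard_normal(d)   # observed context
alpha, reward = 1.0, 1.0

theta = A_inv @ b                                  # ridge-regression estimate
ucb = theta @ x + alpha * np.sqrt(x @ A_inv @ x)   # upper confidence bound

# After observing the reward, update the inverse without re-inverting A:
A_inv = sherman_morrison_update(A_inv, x)
b += reward * x
```

On embedded targets, keeping only `A_inv` (never forming and inverting `A`) is what makes the per-round cost quadratic rather than cubic in the context dimension; the matrix-vector products that remain are exactly the operations amenable to the vector acceleration the paper pairs with this optimization.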