Embodied Artificial Intelligence (AI) has recently attracted significant attention because it bridges AI with the physical world. Modern embodied AI systems often combine a Large Language Model (LLM)-based planner for high-level task planning with a reinforcement learning (RL)-based controller for low-level action generation, enabling embodied agents to tackle complex tasks in real-world environments. However, deploying embodied agents remains challenging because of their high computational requirements, especially on battery-powered local devices. Techniques such as lowering the operating voltage can improve energy efficiency, but they can also introduce bit errors that lead to task failures. In this work, we propose CREATE, a general design principle that leverages the heterogeneous resilience of different layers for synergistic energy-reliability co-optimization. For the first time, we conduct a comprehensive error injection study on modern embodied AI systems and observe an inherent but heterogeneous fault tolerance. Building upon these insights, we develop an anomaly detection and clearance mechanism at the circuit level to eliminate outlier errors. At the model level, we propose a weight-rotation-enhanced planning algorithm to improve the fault tolerance of the LLM-based planner. Furthermore, we introduce an application-level technique, autonomy-adaptive voltage scaling, to dynamically adjust the operating voltage of the controllers. The voltage scaling circuit is co-designed to enable online voltage adjustment. Extensive experiments demonstrate that, without compromising task quality, CREATE achieves 40.6% computational energy savings on average over nominal-voltage baselines and 35.0% over prior-art techniques. This in turn yields 29.5% to 37.3% chip-level energy savings and an improvement in battery life of approximately 15% to 30%.