Backdoor attacks embed hidden malicious behaviors in reinforcement learning (RL) policies and activate them via triggers at test time. Most existing attacks are validated only in simulation, and their effectiveness on real-world robotic systems remains unclear. In physical deployment, safety-constrained control pipelines such as velocity limiting, action smoothing, and collision avoidance suppress abnormal actions, strongly attenuating conventional backdoor attacks. We study this previously overlooked problem and propose a diffusion-guided backdoor attack framework (DGBA) for real-world RL. We design small, printable visual patch triggers placed on the floor and generate them with a conditional diffusion model that produces diverse patch appearances under real-world visual variation. We treat the robot control stack as a black-box system. We further introduce an advantage-based poisoning strategy that injects triggers only at decision-critical training states. We evaluate our method on a TurtleBot3 mobile robot and demonstrate reliable activation of targeted attacks while preserving normal task performance. Demo videos and code are available in the supplementary material.