Deploying Reinforcement Learning (RL) algorithms for real-world robotics applications requires ensuring the safety of both the robot and its environment. Safe Robot RL (SRRL) is a crucial step towards achieving human-robot coexistence. In this paper, we envision a human-centered SRRL framework consisting of three stages: safe exploration, safety value alignment, and safe collaboration. We examine the research gaps in these areas and propose leveraging interactive behaviors for SRRL. Interactive behaviors enable bi-directional information transfer between humans and robots, as illustrated by the conversational robot ChatGPT. We argue that interactive behaviors need further attention from the SRRL community, and we discuss four open challenges related to the robustness, efficiency, transparency, and adaptability of SRRL with interactive behaviors.