Learning-based control approaches have shown great promise in performing complex tasks directly from high-dimensional perception data on real robotic systems. Nonetheless, learned controllers can behave unexpectedly if the system's trajectories deviate from the training data distribution, which can compromise safety. In this work, we propose a control filter that wraps any reference policy and encourages the system to stay in-distribution with respect to offline-collected safe demonstrations. Our methodology is inspired by Control Barrier Functions (CBFs), model-based tools from the nonlinear control literature that can be used to construct minimally invasive safe policy filters. While existing CBF-based methods require a known low-dimensional state representation, our approach learns in a latent state space and is therefore directly applicable to systems that rely solely on high-dimensional visual observations. We demonstrate the method's effectiveness on two visuomotor control tasks in simulation, covering both top-down and egocentric view settings.
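To illustrate what "minimally invasive safe policy filter" means in the classical CBF setting, the following is a minimal sketch, not the method proposed in this work. It assumes known control-affine dynamics x' = f(x) + g(x)u and a hand-written barrier h(x) >= 0, whereas the proposed approach learns these quantities in a latent space from visual observations. The function name `cbf_filter`, its arguments, and the gain `alpha` are hypothetical. With a single barrier constraint, the usual quadratic program (find the control closest to the reference that satisfies grad_h . (f + g u) + alpha * h >= 0) reduces to a closed-form projection.

```python
import numpy as np

def cbf_filter(u_ref, h, grad_h, f, g, alpha=1.0):
    """Project u_ref onto the half-space a @ u + b >= 0 (single CBF constraint).

    Hypothetical helper for illustration only.
    u_ref:  reference control, shape (m,)
    h:      barrier value h(x), scalar (h >= 0 means safe)
    grad_h: gradient of h at x, shape (n,)
    f:      drift term f(x), shape (n,)
    g:      control matrix g(x), shape (n, m)
    """
    u_ref = np.asarray(u_ref, dtype=float)
    a = grad_h @ g                  # Lie derivative of h along g, shape (m,)
    b = grad_h @ f + alpha * h      # Lie derivative along f plus class-K term
    slack = a @ u_ref + b
    if slack >= 0.0:
        return u_ref                # reference already satisfies the constraint
    denom = a @ a
    if denom < 1e-12:
        return u_ref                # control cannot affect h here; nothing to project
    # Smallest Euclidean change to u_ref that makes the constraint active.
    return u_ref - (slack / denom) * a

# Usage: 2-D single integrator (x' = u) kept inside the unit disk, h(x) = 1 - |x|^2.
x = np.array([0.9, 0.0])
h = 1.0 - x @ x
grad_h = -2.0 * x
f = np.zeros(2)
g = np.eye(2)

u_safe = cbf_filter(np.array([1.0, 0.0]), h, grad_h, f, g)   # pushed outward: modified
u_same = cbf_filter(np.array([-1.0, 0.0]), h, grad_h, f, g)  # moving inward: unchanged
```

The filter leaves the reference control untouched whenever it already satisfies the barrier condition and otherwise applies the smallest correction, which is the sense in which such filters are minimally invasive.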