The combination of behavioural cloning (BC) and neural networks has driven significant progress in robotic manipulation. However, because these algorithms can require a large number of demonstrations for each task of interest, they remain fundamentally inefficient in complex scenarios, where finite datasets can hardly cover the state space. One of the remaining challenges is thus out-of-distribution (OOD) generalisation, i.e. the ability to predict correct actions for states that have low likelihood under the state occupancy induced by the dataset. This issue is aggravated when the system to control is treated as a black box, ignoring its physical properties. This work highlights widespread properties of robotic manipulation, specifically pose equivariance and locality. We investigate how the choice of problem space affects the OOD performance of BC policies, and how transformations arising from these characteristic properties of manipulation can be used to improve it. Through controlled, simulated and real-world experiments, we empirically demonstrate that these transformations allow BC policies, using either standard MLP-based one-step action prediction or diffusion-based action-sequence prediction, to generalise better to certain OOD problem instances. Code is available at https://github.com/kirandoshi/pst_ood_gen.