Robots are essential in industrial manufacturing due to their reliability and efficiency. They excel at simple, repetitive unimanual tasks but still struggle with bimanual manipulation. This difficulty arises from the complexity of coordinating two arms and handling multi-stage processes. The recent integration of generative models into imitation learning (IL) has made progress on some of these challenges. However, few approaches explicitly consider the multi-stage nature of bimanual tasks while also emphasizing inference speed. In multi-stage tasks, a failure or delay at any stage can cascade over time, degrading the success and efficiency of subsequent sub-stages and ultimately hindering overall task performance. In this paper, we propose a novel keypose-conditioned, coordination-aware consistency policy tailored for bimanual manipulation. Our framework instantiates hierarchical imitation learning with a high-level keypose predictor and a low-level trajectory generator. The predicted keyposes serve as sub-goals for trajectory generation, indicating the target of each sub-stage. The trajectory generator is formulated as a consistency model that generates action sequences from historical observations and predicted keyposes in a single inference step. In particular, we devise an approach for identifying bimanual keyposes that considers both robot-centric action features and task-centric operation styles. Simulation and real-world experiments show that our approach significantly outperforms baseline methods in terms of success rate and operational efficiency. The implementation code is available at https://github.com/JoanaHXU/BiKC-plus.
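The two-level inference loop described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the BiKC-plus repository: the function names `predict_keypose`, `generate_actions`, and `rollout` are hypothetical, the 14-dimensional state (two 7-DoF arms) is an assumed layout, and both learned networks are replaced by hand-written stand-ins. The only point it shows is the control flow: a high-level predictor proposes a sub-goal keypose, and a low-level generator maps noise to a whole action sequence in one call, conditioned on the observation and that keypose.

```python
import random
from typing import List

Vector = List[float]


def predict_keypose(observation: Vector) -> Vector:
    """Hypothetical stand-in for the high-level keypose predictor.

    A trained model would map observations to the next bimanual keypose;
    here the sub-goal is simply the observation shifted by 1.0 per joint.
    """
    return [x + 1.0 for x in observation]


def generate_actions(noise: List[Vector], observation: Vector,
                     keypose: Vector, noise_scale: float) -> List[Vector]:
    """Hypothetical stand-in for the one-step consistency generator.

    A consistency model maps a noise sample directly to a clean action
    sequence in a single forward pass. This placeholder interpolates from
    the observation to the keypose and adds scaled noise.
    """
    horizon = len(noise)
    actions = []
    for t, step_noise in enumerate(noise):
        alpha = (t + 1) / horizon  # reaches 1.0 at the last step
        actions.append([
            o + alpha * (k - o) + noise_scale * n
            for o, k, n in zip(observation, keypose, step_noise)
        ])
    return actions


def rollout(observation: Vector, num_stages: int, horizon: int,
            noise_scale: float = 0.01, seed: int = 0) -> List[Vector]:
    """Run the hierarchy: one keypose and one generator call per sub-stage."""
    rng = random.Random(seed)
    trajectory: List[Vector] = []
    for _ in range(num_stages):
        keypose = predict_keypose(observation)
        noise = [[rng.gauss(0.0, 1.0) for _ in observation]
                 for _ in range(horizon)]
        actions = generate_actions(noise, observation, keypose, noise_scale)
        trajectory.extend(actions)
        # Assumption: the robot tracks the commanded actions perfectly,
        # so the last action becomes the next observation.
        observation = actions[-1]
    return trajectory


if __name__ == "__main__":
    start = [0.0] * 14  # assumed layout: two 7-DoF arms
    plan = rollout(start, num_stages=2, horizon=8)
    print(len(plan), len(plan[0]))
```

Because the generator is called once per sub-stage rather than once per denoising step, inference cost in this sketch grows with the number of sub-stages only, which is the efficiency argument the abstract makes for a consistency-model formulation.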