Programming by demonstration is a strategy for simplifying the robot programming process for non-experts through human demonstrations. However, its adoption for bimanual tasks remains underexplored due to the complexity of hand coordination, which also hinders data recording. This paper presents a novel one-shot method that processes a single RGB video of a bimanual task demonstration to generate an execution plan for a dual-arm robotic system. To detect hand coordination policies, we apply Shannon's information theory to analyze the information flow between scene elements and leverage scene graph properties. The generated plan is a modular behavior tree whose structure varies according to the desired arm coordination. We validated the effectiveness of this framework on video demonstrations from multiple subjects, which we collected and made open source, as well as on data from an external, publicly available dataset. Comparisons with existing methods showed significant improvements in generating a centralized execution plan for coordinating two-arm systems.