Articulated object manipulation is a fundamental yet challenging task in robotics. Due to significant geometric and semantic variations across object categories, previous manipulation models struggle to generalize to novel categories. Few-shot learning is a promising way to alleviate this issue by allowing robots to perform a few interactions with unseen objects. However, existing approaches often require costly and inefficient test-time interactions with each unseen instance. Recognizing this limitation, we observe that despite their distinct shapes, different categories often share similar local geometries essential for manipulation, such as pullable handles and graspable edges, a factor typically underutilized in previous few-shot learning work. To harness this commonality, we introduce 'Where2Explore', an affordance learning framework that effectively explores novel categories with minimal interactions on a limited number of instances. Our framework explicitly estimates the geometric similarity across different categories, identifying local areas that differ from shapes in the training categories for efficient exploration while concurrently transferring affordance knowledge to similar parts of the objects. Extensive experiments in simulated and real-world environments demonstrate our framework's capacity for efficient few-shot exploration and generalization.