Articulated object manipulation is a fundamental yet challenging task in robotics. Because geometry and semantics vary significantly across object categories, previous manipulation models struggle to generalize to novel categories. Few-shot learning is a promising way to alleviate this issue by allowing robots to perform a few interactions with unseen objects. However, existing approaches often require costly and inefficient test-time interactions with each unseen instance. Recognizing this limitation, we observe that, despite their distinct shapes, different categories often share similar local geometries essential for manipulation, such as pullable handles and graspable edges, a factor typically underutilized in previous few-shot learning work. To harness this commonality, we introduce Where2Explore, an affordance learning framework that effectively explores novel categories with minimal interactions on a limited number of instances. Our framework explicitly estimates the geometric similarity across different categories, identifying local areas that differ from shapes in the training categories for efficient exploration, while concurrently transferring affordance knowledge to similar parts of the objects. Extensive experiments in simulated and real-world environments demonstrate our framework's capacity for efficient few-shot exploration and generalization.
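The exploration idea summarized in the abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes that a per-point geometric similarity score in [0, 1] is already available for a novel object, and the function names, the interaction budget, and the 0.5 threshold are all hypothetical choices made for illustration. Low-similarity points are proposed for interaction, while high-similarity points are treated as candidates for reusing previously learned affordance knowledge.

```python
from typing import List, Sequence, Tuple


def select_exploration_points(similarity: Sequence[float], budget: int) -> List[int]:
    """Return indices of the `budget` points with the lowest similarity.

    `similarity[i]` is an assumed, precomputed score in [0, 1] describing how
    closely the local geometry at point i resembles the training categories.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Sort point indices from least to most similar, then keep the first `budget`.
    ranked = sorted(range(len(similarity)), key=lambda i: similarity[i])
    return ranked[:budget]


def split_by_similarity(
    similarity: Sequence[float], threshold: float = 0.5
) -> Tuple[List[int], List[int]]:
    """Split point indices into (explore, transfer) groups.

    Points below the hypothetical `threshold` are unfamiliar and worth
    exploring; the rest are candidates for transferred affordance knowledge.
    """
    explore = [i for i, s in enumerate(similarity) if s < threshold]
    transfer = [i for i, s in enumerate(similarity) if s >= threshold]
    return explore, transfer


# Toy scores for five points on a hypothetical novel object.
scores = [0.9, 0.2, 0.75, 0.1, 0.6]
targets = select_exploration_points(scores, budget=2)
explore_idx, transfer_idx = split_by_similarity(scores)
```

With these toy scores, the two least familiar points (indices 3 and 1) are chosen for interaction, and points 0, 2, and 4 fall into the transfer group. How the similarity scores are actually estimated, and how interactions update them, is left to the framework itself.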