Perceiving and manipulating 3D articulated objects in diverse environments is essential for home-assistant robots. Recent studies have shown that point-level affordance provides actionable priors for downstream manipulation tasks. However, existing works primarily focus on single-object scenarios with homogeneous agents, overlooking the realistic constraints imposed by the environment and the agent's morphology, e.g., occlusions and physical limitations. In this paper, we propose an environment-aware affordance framework that incorporates both object-level actionable priors and environment constraints. Unlike object-centric affordance learning, environment-aware affordance learning faces a combinatorial explosion, because occluders vary in number, geometry, position, and pose. To address this and improve data efficiency, we introduce a novel contrastive affordance learning framework that trains on scenes containing a single occluder and generalizes to scenes with complex occluder combinations. Experiments demonstrate the effectiveness of our approach in learning affordance under environment constraints. Project page: https://chengkaiacademycity.github.io/EnvAwareAfford/