Perceiving and manipulating 3D articulated objects in diverse environments is essential for home-assistant robots. Recent studies have shown that point-level affordance provides actionable priors for downstream manipulation tasks. However, existing works primarily focus on single-object scenarios with homogeneous agents, overlooking the realistic constraints imposed by the environment and the agent's morphology, e.g., occlusions and physical limitations. In this paper, we propose an environment-aware affordance framework that incorporates both object-level actionable priors and environment constraints. Unlike object-centric affordance approaches, learning environment-aware affordance faces a combinatorial explosion, because occluders vary in number, geometry, position, and pose. To address this and improve data efficiency, we introduce a novel contrastive affordance learning framework that can be trained on scenes containing a single occluder and generalize to scenes with complex occluder combinations. Experiments demonstrate the effectiveness of our approach in learning affordance under environment constraints. Project page: https://chengkaiacademycity.github.io/EnvAwareAfford/
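To make the combinatorial explosion mentioned above concrete, the following is a minimal counting sketch, not part of the proposed method. It assumes a hypothetical discretization in which each occluder is described by one of a fixed number of geometries, positions, and poses, and a scene is an unordered set of distinct occluder configurations; physical feasibility such as collisions between occluders is ignored. The function name and all numbers are illustrative assumptions, not values from the paper.

```python
from math import comb


def count_occluder_scenes(num_geometries, num_positions, num_poses, max_occluders):
    """Count scenes containing exactly k occluders, for k = 1..max_occluders.

    Hypothetical model: each occluder is one (geometry, position, pose)
    triple from a discretized set, and a scene is an unordered set of
    k distinct triples. Collisions and other feasibility checks are ignored.
    """
    per_occluder = num_geometries * num_positions * num_poses
    return {k: comb(per_occluder, k) for k in range(1, max_occluders + 1)}


# Illustrative numbers only: 5 geometries, 10 positions, 4 poses.
counts = count_occluder_scenes(5, 10, 4, 3)
for k, n in counts.items():
    print(f"{k} occluder(s): {n} scenes")
```

Under these assumed numbers, single-occluder scenes number in the hundreds while three-occluder scenes already exceed a million, which is why training only on single-occluder scenes and generalizing to combinations is attractive for data efficiency.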