Perceiving and manipulating 3D articulated objects in diverse environments is essential for home-assistant robots. Recent studies have shown that point-level affordance provides actionable priors for downstream manipulation tasks. However, existing work primarily focuses on single-object scenarios with homogeneous agents, overlooking the realistic constraints imposed by the environment and the agent's morphology, e.g., occlusions and physical limitations. In this paper, we propose an environment-aware affordance framework that incorporates both object-level actionable priors and environment constraints. Unlike object-centric affordance approaches, learning environment-aware affordance faces a combinatorial explosion, because occlusions vary in quantity, geometry, position, and pose. To address this and improve data efficiency, we introduce a novel contrastive affordance learning framework that can be trained on scenes containing a single occluder and generalize to scenes with complex occluder combinations. Experiments demonstrate the effectiveness of our proposed approach in learning affordance under environment constraints. Project page: https://chengkaiacademycity.github.io/EnvAwareAfford/