Mobile Manipulation (MM) involves long-horizon decision-making over multi-stage compositions of heterogeneous skills, such as navigation and object picking. Despite recent progress, existing MM methods still face two key limitations: (i) low sample efficiency, caused by ineffective use of the redundant data generated during long-horizon MM interactions; and (ii) poor spatial generalization, as policies trained on specific tasks struggle to transfer to new spatial layouts without additional training. In this paper, we address these challenges through Adaptive Experience Selection (AES) and model-based dynamic imagination. Specifically, AES directs MM agents toward the critical experience fragments in long trajectories that determine task success, improving skill-chain learning and mitigating skill forgetting. Building on AES, a Recurrent State-Space Model (RSSM) is introduced for Model-Predictive Forward Planning (MPFP): it captures the coupled dynamics between the mobile base and the manipulator and imagines the dynamics of future manipulations. RSSM-based MPFP reinforces MM skill learning on the current task while enabling effective generalization to new spatial layouts. Comparative studies across different experimental configurations demonstrate that our method significantly outperforms existing MM policies, and real-world experiments further validate its feasibility and practicality.
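The abstract does not specify how AES scores experience fragments, but the core idea of weighting replay toward success-critical fragments can be sketched in the style of prioritized experience replay. The function below, its name, and the `alpha` temperature parameter are illustrative assumptions, not the paper's actual algorithm:

```python
import random

def select_experiences(fragments, scores, k, alpha=0.6):
    """Sample k experience fragments, favoring those with high
    task-criticality scores (hypothetical scoring; AES's actual
    criterion is defined in the paper, not here)."""
    # Convert raw scores to sampling priorities (prioritized-replay style);
    # alpha controls how strongly high-scoring fragments are favored.
    priorities = [s ** alpha for s in scores]
    total = sum(priorities)
    weights = [p / total for p in priorities]
    # Sample indices without replacement, weighted by priority.
    chosen = []
    available = list(range(len(fragments)))
    for _ in range(min(k, len(fragments))):
        pos = random.choices(range(len(available)), weights=weights, k=1)[0]
        chosen.append(fragments[available.pop(pos)])
        weights.pop(pos)
    return chosen
```

Under this sketch, fragments near skill transitions (e.g. the hand-off from navigation to grasping) would simply receive higher scores and thus dominate the replay batch.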