We introduce a lifelong imitation learning framework that enables continual policy refinement across sequential tasks under realistic memory and data constraints. Departing from conventional experience replay, our approach operates entirely in a multimodal latent space, where compact representations of visual, linguistic, and robot state information are stored and reused to support future learning. To further stabilize adaptation, we introduce an incremental feature adjustment mechanism that regularizes the evolution of task embeddings through an angular margin constraint, preserving inter-task distinctiveness. Our method establishes a new state of the art on the LIBERO benchmarks, achieving gains of 10-17 points in AUC and up to 65% less forgetting than previous leading methods. Ablation studies confirm the effectiveness of each component, showing consistent gains over alternative strategies. The code is available at: https://github.com/yfqi/lifelong_mlr_ifa.
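The abstract does not spell out the angular margin constraint, so the following is only a minimal illustrative sketch of one common way such a constraint is imposed: penalizing any pair of task embeddings whose angular separation falls below a margin. The function name, the margin value, and the pairwise formulation are assumptions for illustration, not the paper's exact method.

```python
import torch
import torch.nn.functional as F


def angular_margin_penalty(new_emb: torch.Tensor,
                           stored_embs: torch.Tensor,
                           margin: float = 0.3) -> torch.Tensor:
    """Hypothetical regularizer: keep a new task embedding at least
    `margin` radians away from every stored task embedding.

    new_emb:     (d,) embedding of the task being learned
    stored_embs: (k, d) embeddings of previously learned tasks
    """
    # Cosine similarity between the new embedding and each stored one.
    sims = F.cosine_similarity(new_emb.unsqueeze(0), stored_embs, dim=1)
    # Convert to angles; clamp to keep acos numerically stable.
    angles = torch.acos(sims.clamp(-1 + 1e-7, 1 - 1e-7))
    # Penalize only pairs closer than the angular margin.
    return F.relu(margin - angles).sum()
```

Under this sketch, an embedding nearly collinear with a stored one incurs a penalty close to the full margin, while embeddings separated by more than the margin contribute nothing, which is one way to preserve inter-task distinctiveness as new tasks arrive.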