We introduce a lifelong imitation learning framework that enables continual policy refinement across sequential tasks under realistic memory and data constraints. Our approach departs from conventional experience replay by operating entirely in a multimodal latent space, where compact representations of visual, linguistic, and robot state information are stored and reused to support future learning. To further stabilize adaptation, we introduce an incremental feature adjustment mechanism that regularizes the evolution of task embeddings through an angular margin constraint, preserving inter-task distinctiveness. Our method establishes a new state of the art on the LIBERO benchmarks, achieving gains of 10-17 AUC points and up to 65% less forgetting than previous leading methods. Ablation studies confirm the effectiveness of each component, showing consistent gains over alternative strategies. The code is available at: https://github.com/yfqi/lifelong_mlr_ifa.
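The angular margin constraint on task embeddings can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the hinge form, the function name, and the `margin` parameter below are assumptions: the idea is simply to penalize a new task embedding whenever its angle to any stored task embedding falls below a margin, preserving inter-task distinctiveness.

```python
import numpy as np

def angular_margin_penalty(new_emb, old_embs, margin=0.5):
    """Hinge penalty that grows when the angle (in radians) between a new
    task embedding and any previously stored task embedding drops below
    `margin`. Hypothetical sketch; the paper's formulation may differ."""
    new_dir = new_emb / np.linalg.norm(new_emb)                      # unit vector, new task
    old_dirs = old_embs / np.linalg.norm(old_embs, axis=1, keepdims=True)  # unit vectors, past tasks
    cos_sim = old_dirs @ new_dir                                     # cosine similarity per past task
    # Positive only when similarity exceeds cos(margin), i.e. the angular
    # separation to some past task has shrunk below the margin.
    return np.clip(cos_sim - np.cos(margin), 0.0, None).sum()
```

Adding such a term to the imitation loss discourages embeddings of new tasks from collapsing onto those of earlier tasks, which is one way to limit interference-driven forgetting.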