This work presents Mamba Imitation Learning (MaIL), a novel imitation learning (IL) architecture that provides an alternative to state-of-the-art (SoTA) Transformer-based policies. MaIL leverages Mamba, a state-space model designed to selectively focus on key features of the data. While Transformers are highly effective in data-rich environments due to their dense attention mechanisms, they can struggle with smaller datasets, often leading to overfitting or suboptimal representation learning. In contrast, Mamba's architecture enhances representation learning efficiency by focusing on key features and reducing model complexity. This approach mitigates overfitting and enhances generalization, even when working with limited data. Extensive evaluations on the LIBERO benchmark demonstrate that MaIL consistently outperforms Transformers on all LIBERO tasks with limited data and matches their performance when the full dataset is available. Additionally, MaIL's effectiveness is validated through its superior performance in three real robot experiments. Our code is available at https://github.com/ALRhub/MaIL.