This work provides empirical evidence that Mamba, a recently proposed selective structured state space model, has in-context learning (ICL) capabilities similar to those of transformers. We evaluate Mamba on tasks involving simple function approximation as well as more complex natural language processing problems. Our results show that, across both categories of tasks, Mamba matches the performance of transformer models on ICL. Further analysis reveals that, like transformers, Mamba appears to solve ICL problems by incrementally optimizing its internal representations. Overall, our work suggests that Mamba can be an efficient alternative to transformers for ICL tasks involving longer input sequences.