We investigate whether hidden states from Structured State Space Models (SSMs) can be merged post hoc to support downstream reasoning. Inspired by model souping, we study document souping, a strategy in which documents are encoded independently and their representations are pooled into a single context state via simple operations such as averaging. This approach enables modular encoding and reuse without reprocessing the full input for each query. We demonstrate that finetuned Mamba2 models using souped representations achieve competitive or superior performance on multi-hop QA, sparse retrieval, and long-document reasoning tasks compared to standard monolithic encoding. For example, on the RACE and QuALITY long-document question answering benchmarks, this method substantially outperforms a traditional concatenation approach. Crucially, this modular design scales to hundreds of documents while delivering substantial savings in inference cost, unlocking new possibilities for large-scale corpus reasoning.
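The core idea can be illustrated with a minimal sketch. The code below assumes a toy diagonal linear SSM (not the actual Mamba2 architecture or its CUDA kernels): each document is encoded independently to a final hidden state, and the states are then "souped" by simple averaging into one context state. All names (`encode`, `soup`, the matrices `A` and `B`) are illustrative, not from the paper's implementation.

```python
import numpy as np

def encode(doc_embeddings, A, B):
    # Run a toy diagonal linear SSM over one document's token embeddings
    # and return the final hidden state:
    #   h_t = A * h_{t-1} + B @ x_t   (A is an elementwise decay vector)
    h = np.zeros(A.shape[0])
    for x in doc_embeddings:
        h = A * h + B @ x
    return h

def soup(states):
    # Document souping: pool independently computed per-document states
    # into a single context state by simple averaging.
    return np.mean(states, axis=0)

rng = np.random.default_rng(0)
d_state, d_in = 8, 4
A = rng.uniform(0.5, 0.99, size=d_state)    # stable per-channel decay
B = rng.normal(size=(d_state, d_in))

# Three "documents" of different lengths, encoded independently.
docs = [rng.normal(size=(n, d_in)) for n in (5, 7, 3)]
states = [encode(doc, A, B) for doc in docs]
context = soup(states)          # single state conditioning the query
```

Because each document is encoded separately, adding or removing a document only requires recomputing its own state and re-averaging, rather than reprocessing the concatenated corpus.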