Despite impressive performance on language modelling and complex reasoning tasks, Large Language Models (LLMs) fall short on the same tasks in uncommon settings or under distribution shift, revealing limited generalisation ability. This issue is usually alleviated by feeding more training data into the LLM. However, this approach is brittle: the scope of tasks may not be readily predictable or may evolve, and updating the model with new data generally requires extensive additional training. By contrast, systems that learn abstract variables and causal relationships, such as causal models, can be more robust to changes in the distribution. One reason for this success is the existence and use of Independent Causal Mechanisms (ICMs), which represent high-level concepts that interact only sparsely. In this work, we apply two concepts from causality to learn ICMs within LLMs. We develop a new LLM architecture composed of multiple sparsely interacting language modelling modules. We introduce a routing scheme to induce specialisation of the network into domain-specific modules. We also present a Mutual Information minimisation objective that trains a separate module to learn abstraction and domain-invariant mechanisms. We show that such causal constraints can improve out-of-distribution performance on abstract and causal reasoning tasks.
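To make the modular idea concrete, the following is a minimal sketch of routing an input to one domain-specific module and combining its output with that of a shared, domain-invariant module. The abstract does not specify the routing rule or how module outputs are combined, so the nearest-centroid routing, the element-wise sum, and all names (`route`, `forward`, `centroids`, `domain_modules`, `invariant_module`) are assumptions made for illustration only. The Mutual Information minimisation objective is not shown.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Module = Callable[[Vector], List[float]]


def route(embedding: Vector, centroids: Sequence[Vector]) -> int:
    """Hard routing (assumed rule): pick the module whose centroid is nearest."""
    distances = [math.dist(embedding, c) for c in centroids]
    return distances.index(min(distances))


def forward(
    embedding: Vector,
    centroids: Sequence[Vector],
    domain_modules: Sequence[Module],
    invariant_module: Module,
) -> Tuple[int, List[float]]:
    """Run exactly one domain-specific module plus the shared invariant module."""
    k = route(embedding, centroids)
    specific = domain_modules[k](embedding)   # only the selected module is active
    shared = invariant_module(embedding)      # always active, domain-independent
    # Assumed combination: element-wise sum of the two outputs.
    return k, [s + t for s, t in zip(specific, shared)]


# Toy stand-ins for language modelling modules (hypothetical).
centroids = [(0.0, 0.0), (10.0, 10.0)]
domain_modules = [
    lambda x: [2.0 * v for v in x],
    lambda x: [-1.0 * v for v in x],
]
invariant_module = lambda x: [v + 1.0 for v in x]

module_index, output = forward((9.0, 8.0), centroids, domain_modules, invariant_module)
print(module_index, output)
```

Because only one domain-specific module is active per input, the modules interact sparsely, which is the property the architecture aims for. A learned or soft router could replace the nearest-centroid rule without changing the overall structure.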