Question decomposition, i.e. breaking a complex query into simpler sub-queries whose answers are composed to produce a final answer, is a widely used strategy for improving LLM reasoning, yet it currently lacks a rigorous mathematical foundation. In this paper, we propose operads, mathematical structures that model many-in, one-out operations and compositions thereof, as a natural framework for describing question decomposition. We define the questions operad $Q$, in which operations correspond to question templates and composition corresponds to substitution of sub-answers, and show how QA models can be interpreted as algebras over $Q$. Beyond reframing existing practice, this operadic perspective points toward new methods, in particular a notion of operadic consistency, which measures whether a QA model's answers agree across the partial collapses of a question decomposition tree. Empirical evaluation of operadic consistency is reported in our companion paper (Bottman, Liu, and Richardson, 2026), which finds it strongly correlated with accuracy across twelve LLMs and four multi-hop QA datasets and outperforming standard temperature-based self-consistency baselines. We argue that operads are the natural mathematical home for question decomposition, and that invariants such as operadic consistency open new directions for analyzing and improving the reliability of multi-step reasoning.
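To make the notion of partial collapse concrete, the following is a minimal sketch under stated assumptions, not the construction used in the paper or its companion. It assumes that a partial collapse corresponds to choosing a subset of internal edges of the decomposition tree: along a chosen edge the sub-question's text is substituted into the parent template (composition in $Q$), while along every other edge the model's answer to the sub-question is substituted instead. The names `Node`, `render`, `collapse_answers`, and `consistency`, the toy lookup-table models, and the majority-agreement score are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Dict, Iterator, Set, Tuple

Edge = Tuple[int, ...]          # path of child indices from the root
Model = Callable[[str], str]    # a QA model: question text -> answer text


@dataclass(frozen=True)
class Node:
    """An operation in the (assumed) questions operad: a template with slots."""
    template: str                       # e.g. "birthplace of {0}"
    children: Tuple["Node", ...] = ()   # one sub-question per slot


def edges(node: Node, path: Edge = ()) -> Iterator[Edge]:
    """Enumerate the internal edges of the tree, identified by child paths."""
    for i, child in enumerate(node.children):
        yield path + (i,)
        yield from edges(child, path + (i,))


def render(node: Node, contracted: Set[Edge], model: Model, path: Edge = ()) -> str:
    """Build the question text for `node`.

    A contracted edge substitutes the sub-question's text (template
    composition); any other edge substitutes the model's answer to it.
    """
    fills = []
    for i, child in enumerate(node.children):
        edge = path + (i,)
        text = render(child, contracted, model, edge)
        fills.append(text if edge in contracted else model(text))
    return node.template.format(*fills)


def collapse_answers(root: Node, model: Model) -> Dict[Tuple[Edge, ...], str]:
    """Final answer for every subset of contracted edges (every partial collapse)."""
    all_edges = list(edges(root))
    return {
        subset: model(render(root, set(subset), model))
        for r in range(len(all_edges) + 1)
        for subset in combinations(all_edges, r)
    }


def consistency(root: Node, model: Model) -> float:
    """Assumed score: fraction of collapses that agree with the majority answer."""
    answers = list(collapse_answers(root, model).values())
    return Counter(answers).most_common(1)[0][1] / len(answers)


def make_model(facts: Dict[str, str]) -> Model:
    """Toy stand-in for an LLM: a lookup table over question strings."""
    return lambda question: facts.get(question, "unknown")


tree = Node("country containing {0}",
            (Node("birthplace of {0}", (Node("author of Hamlet"),)),))

facts = {
    "author of Hamlet": "William Shakespeare",
    "birthplace of William Shakespeare": "Stratford-upon-Avon",
    "birthplace of author of Hamlet": "Stratford-upon-Avon",
    "country containing Stratford-upon-Avon": "England",
    "country containing birthplace of William Shakespeare": "England",
    "country containing birthplace of author of Hamlet": "England",
}
good = make_model(facts)
# Same table, except the fully collapsed question gets a different answer.
flaky = make_model({**facts,
                    "country containing birthplace of author of Hamlet": "Denmark"})

print(consistency(tree, good))   # 1.0
print(consistency(tree, flaky))  # 0.75
```

In this reading, a model whose answers commute with template composition behaves like an algebra over $Q$ on this tree and scores 1.0, while the second toy model disagrees with itself on the fully collapsed question and scores lower. The tree has two internal edges, so four collapses are compared; the number of collapses grows exponentially with the number of edges, so a practical evaluation would presumably sample or restrict them.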