ADEMA: A Knowledge-State Orchestration Architecture for Long-Horizon Knowledge Synthesis with LLM Agents

Long-horizon LLM tasks often fail not because a single answer is unattainable, but because knowledge states drift across rounds, intermediate commitments remain implicit, and interruption fractures the evolving evidence chain. This paper presents ADEMA as a knowledge-state orchestration architecture for long-horizon knowledge synthesis rather than as a generic multi-agent runtime. The architecture combines explicit epistemic bookkeeping, heterogeneous dual-evaluator governance, adaptive task-mode switching, reputation-shaped resource allocation, checkpoint-resumable persistence, segment-level memory condensation, artifact-first assembly, and final-validity checking with safe fallback. Evidence is drawn entirely from existing materials: a four-scenario showcase package, a fixed 60-run mechanism matrix, targeted micro-ablation and artifact-chain supplements, and a repaired protocol-level benchmark in which code-oriented evaluation is the clearest quality-sensitive mechanism block. Across the fixed matrix, removing checkpoint/resume produced the only invalid run, and it did so in the interruption-sensitive resume condition. By contrast, dual evaluation, segment synthesis, and dynamic governance are best interpreted as supporting control mechanisms that shape trajectory discipline, explicit artifact progression, and cost-quality behavior rather than as universal binary prerequisites for completion. The contribution is therefore a knowledge-state orchestration architecture in which explicit epistemic state transition, evidence-bearing artifact progression, and recoverable continuity are the primary design commitments.
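
As an illustration of the checkpoint-resumable persistence and explicit epistemic bookkeeping named above, the following is a minimal sketch and not ADEMA's implementation. The names `KnowledgeState`, `save_checkpoint`, `load_checkpoint`, and `run_rounds`, the JSON file layout, and the `interrupt_at` parameter are all assumptions introduced here for illustration. It shows only the general idea: the knowledge state is made explicit, persisted after every round, and reloaded after an interruption so the evidence chain continues instead of restarting.

```python
import json
import os
import tempfile
from dataclasses import dataclass, field, asdict


@dataclass
class KnowledgeState:
    """Hypothetical explicit knowledge state carried across rounds."""
    round_index: int = 0
    claims: list = field(default_factory=list)      # committed intermediate claims
    artifacts: list = field(default_factory=list)   # evidence-bearing artifact ids


def save_checkpoint(state, path):
    """Write to a temp file, then rename, so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(asdict(state), handle)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the last saved state, or a fresh one if no checkpoint exists."""
    if not os.path.exists(path):
        return KnowledgeState()
    with open(path, encoding="utf-8") as handle:
        return KnowledgeState(**json.load(handle))


def run_rounds(path, total_rounds, interrupt_at=None):
    """Advance round by round, checkpointing after each completed round.

    `interrupt_at` is an illustrative hook that simulates an interruption
    before the given round starts.
    """
    state = load_checkpoint(path)
    while state.round_index < total_rounds:
        if interrupt_at is not None and state.round_index == interrupt_at:
            raise RuntimeError("simulated interruption")
        # Placeholder for real agent work: record one claim and one artifact.
        state.claims.append(f"claim-{state.round_index}")
        state.artifacts.append(f"artifact-{state.round_index}")
        state.round_index += 1
        save_checkpoint(state, path)
    return state
```

Under these assumptions, calling `run_rounds(path, 4, interrupt_at=2)` raises after two rounds have been checkpointed, and a later `run_rounds(path, 4)` resumes from round 2 rather than from scratch. Without the checkpoint file, the second call would have no record of the earlier claims and artifacts, which is the kind of broken continuity the abstract associates with the interruption-sensitive resume condition.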
