Complex reasoning tasks are typically handled by massive monolithic LLMs, which incur severe computational redundancy. Task decomposition through structured pipelines or multi-agent collaboration offers an alternative, but these approaches face a dilemma: predefined static topologies are highly vulnerable to cascading errors, whereas unconstrained dynamic agents suffer from trajectory divergence and unpredictable memory bloat. To address this, we present DynaGraph, a lightweight multi-model framework driven by dynamic topological reconfiguration. At the execution level, DynaGraph time-multiplexes PEFT adapters over a shared base model, so that both full-system training and inference deployment fit on a single consumer-grade GPU. At the routing level, the Evaluator continuously monitors execution confidence and triggers hierarchical self-healing: Fine-grained Patching for localized data gaps and Subgraph Reconstruction for severe logical ruptures. Experiments on StrategyQA, MATH, and FinQA show that our 8B model closely approximates the reasoning capabilities of a 72B monolithic model (e.g., 87.6% on StrategyQA and 82.7% on MATH). Furthermore, it reduces latency by up to 68.1% and token consumption by 68.6% compared with unconstrained dynamic architectures.
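The hierarchical self-healing decision can be pictured as a two-threshold rule on the Evaluator's confidence score. The sketch below is illustrative only: the names `Action`, `Thresholds`, and `route`, the assumption that confidence is a scalar in [0, 1], and the threshold values 0.7 and 0.4 are all hypothetical and are not taken from DynaGraph.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"                    # confidence is acceptable, keep executing
    PATCH = "fine_grained_patching"          # localized data gap
    RECONSTRUCT = "subgraph_reconstruction"  # severe logical rupture


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical values; the abstract does not report thresholds.
    patch_below: float = 0.7
    reconstruct_below: float = 0.4


def route(confidence: float, t: Thresholds = Thresholds()) -> Action:
    """Map an execution-confidence score to a self-healing action."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence is assumed to lie in [0, 1]")
    # Check the more severe condition first so it takes precedence.
    if confidence < t.reconstruct_below:
        return Action.RECONSTRUCT
    if confidence < t.patch_below:
        return Action.PATCH
    return Action.CONTINUE


if __name__ == "__main__":
    # Made-up confidence scores, one per execution step.
    for score in (0.92, 0.55, 0.12):
        print(f"{score:.2f} -> {route(score).value}")
```

Ordering the checks from most to least severe keeps the two repair levels mutually exclusive: a step is either patched locally or has its subgraph rebuilt, never both.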