Large language models (LLMs) struggle with complex rule systems because they tend to treat interdependent rules as unstructured text rather than as a logically organized framework. The result is reasoning divergence: the model overlooks rule dependencies that are essential for correct interpretation. Existing approaches such as Chain-of-Thought (CoT) reasoning have shown promise, but they offer no systematic method for structured rule processing and are prone to error propagation along sequential reasoning chains. To address these limitations, we propose the Dynamic Adjudication Template (DAT), a framework inspired by how human experts reason. DAT divides inference into three stages: qualitative analysis, evidence gathering, and adjudication. In the qualitative analysis stage, the model evaluates the overall context. In the evidence gathering stage, it extracts the relevant information for each predefined template element ([placeholder]) and verifies it against the applicable rules. In the adjudication stage, it synthesizes the verified components into a final judgment. Empirical results show that DAT consistently outperforms conventional CoT approaches on complex rule-based tasks. Notably, DAT enables smaller language models to match, and in some cases exceed, the performance of significantly larger LLMs, which highlights its efficiency and effectiveness in handling intricate rule systems.
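The three stages described above can be pictured as a small pipeline. The following is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the template wording, so the prompt strings, the names `run_dat`, `Judgment`, and `stub_model`, and the choice of one extraction call plus one verification call per placeholder are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: a "model" is any function from prompt text to completion text.
Model = Callable[[str], str]


@dataclass
class Judgment:
    analysis: str             # output of the qualitative analysis stage
    evidence: Dict[str, str]  # placeholder name -> verified finding
    verdict: str              # output of the adjudication stage


def run_dat(model: Model, rules: str, case: str, placeholders: List[str]) -> Judgment:
    """Run a three-stage, template-driven pass over one case (hypothetical prompts)."""
    context = f"Rules:\n{rules}\n\nCase:\n{case}\n\n"

    # Stage 1: qualitative analysis of the overall situation.
    analysis = model(context + "ANALYZE: describe the situation and the rules that may apply.")

    # Stage 2: evidence gathering, one template placeholder at a time.
    evidence: Dict[str, str] = {}
    for name in placeholders:
        extracted = model(context + f"EXTRACT: fill in [{name}] using only the case text.")
        # Each extracted item is checked against the rules before it is kept.
        verified = model(context + f"VERIFY: check [{name}] = {extracted} against the rules.")
        evidence[name] = verified

    # Stage 3: adjudication over the analysis and the verified evidence.
    findings = "\n".join(f"[{k}]: {v}" for k, v in evidence.items())
    verdict = model(
        context
        + f"ADJUDICATE:\nAnalysis: {analysis}\nEvidence:\n{findings}\nGive the final judgment."
    )
    return Judgment(analysis, evidence, verdict)


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call: reports which stage it was asked to perform."""
    for stage in ("ANALYZE", "EXTRACT", "VERIFY", "ADJUDICATE"):
        if f"\n\n{stage}:" in prompt:
            return stage.lower()
    return "unknown"
```

With a real model in place of `stub_model`, a call such as `run_dat(model, rules_text, case_text, ["actor", "action"])` would issue one analysis prompt, two prompts per placeholder, and one adjudication prompt. Keeping verification as a separate call per placeholder is one way to stop an extraction error from flowing unchecked into the final judgment, which is the failure mode the abstract attributes to sequential CoT chains.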