Large language models exhibit a systematic tendency toward early semantic commitment: given ambiguous input, they collapse multiple valid interpretations into a single response before sufficient context is available. We present a formal framework for text-to-state mapping ($φ: \mathcal{T} \to \mathcal{S}$) that transforms natural language into a non-collapsing state space where multiple interpretations coexist. The mapping decomposes into three stages: conflict detection, interpretation extraction, and state construction. We instantiate $φ$ with a hybrid extraction pipeline combining rule-based segmentation for explicit conflict markers (adversative conjunctions, hedging expressions) with LLM-based enumeration of implicit ambiguity (epistemic, lexical, structural). On a test set of 68 ambiguous sentences, the resulting states preserve interpretive multiplicity: mean state entropy $H = 1.087$ bits across ambiguity categories, compared to $H = 0$ for collapse-based baselines. We additionally instantiate the rule-based conflict detector for Japanese markers to illustrate cross-lingual portability. This framework extends Non-Resolution Reasoning (NRR) by providing the missing algorithmic bridge between text and the NRR state space, enabling architectural collapse deferment in LLM inference. Design principles for state-to-state transformations are detailed in the Appendix, with empirical validation on 580 test cases showing 0% collapse for principle-satisfying operators versus up to 17.8% for violating operators.
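To make the three-stage decomposition of $φ$ concrete, the following is a minimal sketch of the rule-based path only, not the implementation evaluated in this work. The marker list, the uniform weighting of interpretations, and all function names (`detect_conflict`, `extract_interpretations`, `build_state`, `state_entropy`, `phi`) are illustrative assumptions; the LLM-based enumeration of implicit ambiguity is omitted entirely.

```python
import math
import re
from dataclasses import dataclass

# Hypothetical marker inventory, chosen for illustration only.
ADVERSATIVE_MARKERS = ("but", "however", "although", "yet")


@dataclass
class Interpretation:
    text: str
    weight: float  # assumed to be a probability-like weight summing to 1


def detect_conflict(text):
    """Stage 1: return the explicit adversative markers found in the text."""
    lowered = text.lower()
    return [m for m in ADVERSATIVE_MARKERS if re.search(rf"\b{m}\b", lowered)]


def extract_interpretations(text, markers):
    """Stage 2 (rule-based only): split the text at the detected markers."""
    if not markers:
        return [text.strip()]
    pattern = r"\b(?:" + "|".join(markers) + r")\b"
    parts = re.split(pattern, text, flags=re.IGNORECASE)
    segments = [p.strip(" ,.;") for p in parts if p.strip(" ,.;")]
    # Fall back to the whole text if splitting leaves nothing behind.
    return segments or [text.strip()]


def build_state(segments):
    """Stage 3: keep every segment, with uniform weights (an assumption)."""
    w = 1.0 / len(segments)
    return [Interpretation(s, w) for s in segments]


def state_entropy(state):
    """Shannon entropy of the interpretation weights, in bits."""
    return sum(-i.weight * math.log2(i.weight) for i in state if i.weight > 0)


def phi(text):
    """Compose the three stages: text -> non-collapsed state."""
    markers = detect_conflict(text)
    return build_state(extract_interpretations(text, markers))


state = phi("I want to go, but I am tired.")
print([i.text for i in state], state_entropy(state))
```

Under these assumptions, a sentence with one adversative marker yields two equally weighted interpretations and an entropy of 1 bit, whereas a single-interpretation (collapsed) state has entropy 0. A sentence such as "The bank is closed." also comes out at 0 bits here, because its ambiguity is lexical rather than explicitly marked; that is the case the LLM-based enumeration stage is meant to cover.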