Understanding the fundamental mechanisms governing the production of meaning in the processing of natural language is critical for designing safe, thoughtful, engaging, and empowering human-agent interactions. Experiments in cognitive science and social psychology have demonstrated that human semantic processing exhibits contextuality more consistent with quantum logical mechanisms than with classical Boolean theories, and recent work has found similar results in large language models -- in particular, clear violations of the Bell inequality in contextuality experiments on the interpretation of ambiguous expressions. We explore the CHSH $|S|$ parameter -- the metric associated with the inequality -- across the inference parameter space of models spanning four orders of magnitude in scale, cross-referencing it with MMLU, hallucination rate, and nonsense detection benchmarks. We find that the interquartile range of the $|S|$ distribution -- the statistic that most sharply differentiates models from one another -- is completely orthogonal to all three external benchmarks, while the violation rate shows a weak anticorrelation with each benchmark that does not reach significance. We investigate how $|S|$ varies with sampling parameters and word order. We also discuss the information-theoretic constraints that genuine contextuality imposes on prompt injection defenses and on the human analogue of prompt injection, in which social contextuality is carefully constructed and maintained at scale: manufacturing not consent but contextuality itself, a subtler and more fundamental form of manipulation that shapes the space of possible interpretations before any particular one is reached.
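For readers unfamiliar with the quantities named above, the following is a minimal sketch of how a CHSH $|S|$ value and the two summary statistics (violation rate and interquartile range) could be computed. It uses the standard CHSH combination $S = E(a,b) - E(a,b') + E(a',b) + E(a',b')$, for which classical (non-contextual) models satisfy $|S| \le 2$. The function names, the setting labels, and the encoding of interpretations as $\pm 1$ outcomes are assumptions made for illustration; the abstract does not specify the actual experimental pipeline.

```python
from statistics import quantiles

# Hypothetical labels for the four measurement contexts (pairs of settings).
SETTINGS = ("ab", "ab'", "a'b", "a'b'")

def correlation(pairs):
    """Mean product of paired +/-1 outcomes for one context."""
    return sum(x * y for x, y in pairs) / len(pairs)

def chsh_s(outcomes):
    """Standard CHSH combination: E(a,b) - E(a,b') + E(a',b) + E(a',b').

    `outcomes` maps each label in SETTINGS to a list of (x, y) pairs,
    where x and y are +1 or -1 (an assumed encoding of two interpretations).
    """
    e = {k: correlation(outcomes[k]) for k in SETTINGS}
    return e["ab"] - e["ab'"] + e["a'b"] + e["a'b'"]

def summarize(s_values, classical_bound=2.0):
    """Violation rate and interquartile range of a sample of |S| values."""
    abs_s = sorted(abs(s) for s in s_values)
    violation_rate = sum(s > classical_bound for s in abs_s) / len(abs_s)
    # method="inclusive" interpolates linearly between sample points.
    q1, _, q3 = quantiles(abs_s, n=4, method="inclusive")
    return violation_rate, q3 - q1

# Toy data only: perfectly (anti)correlated outcomes reach the algebraic maximum.
toy = {
    "ab": [(1, 1), (-1, -1)],
    "ab'": [(1, -1), (-1, 1)],
    "a'b": [(1, 1), (-1, -1)],
    "a'b'": [(1, 1), (-1, -1)],
}
print(chsh_s(toy))
print(summarize([1.0, -2.0, 3.0, 4.0]))
```

In this sketch the violation rate is the fraction of runs with $|S| > 2$, and the interquartile range measures the spread of $|S|$ across runs. Quantum systems are limited to $|S| \le 2\sqrt{2}$, whereas the toy data above is deliberately constructed to reach the algebraic maximum of 4.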