Large Reasoning Models (LRMs) rely on explicit reasoning to address complex tasks. This explicit reasoning requires extended context lengths, resulting in substantially higher resource consumption. Prior work has shown that adversarially crafted inputs can trigger redundant reasoning processes, exposing LRMs to resource-exhaustion vulnerabilities. However, the reasoning process itself, especially its reflective component, has received limited attention, even though over-reflection can consume excessive compute. In this paper, we introduce Recursive Entropy to quantify the risk of resource consumption during reflection, thereby revealing safety issues inherent in inference itself. Building on Recursive Entropy, we propose RECUR, a resource-exhaustion attack via Recursive Entropy guided Counterfactual Utilization and Reflection, which constructs counterfactual questions to expose the inherent flaws and risks of LRMs. Extensive experiments demonstrate that, under benign inference, Recursive Entropy exhibits a pronounced decreasing trend; RECUR disrupts this trend, increasing output length by up to 11x and decreasing throughput by 90%. Our work provides a new perspective on robust reasoning.