Inducing Overthink: Hierarchical Genetic Algorithm-based DoS Attack on Black-Box Large Language Reasoning Models

Large Reasoning Models (LRMs) are increasingly integrated into systems requiring reliable multi-step inference, yet this growing dependence exposes new vulnerabilities related to computational availability. In particular, LRMs tend to "overthink", producing excessively long and redundant reasoning traces, when confronted with incomplete or logically inconsistent inputs. This behavior significantly increases inference latency and energy consumption, forming a potential vector for denial-of-service (DoS)-style resource exhaustion. In this work, we investigate this attack surface and propose an automated black-box framework that induces overthinking in LRMs by systematically perturbing the logical structure of input problems. Our method employs a hierarchical genetic algorithm (HGA) that operates on structured problem decompositions and optimizes a composite fitness function designed to maximize both response length and reflective overthinking markers. Across four state-of-the-art reasoning models, the proposed method substantially amplifies output length, achieving up to a 26.1x increase on the MATH benchmark and consistently outperforming both benign inputs and manually crafted missing-premise baselines. We further demonstrate strong transferability: adversarial inputs evolved using a small proxy model remain highly effective against large commercial LRMs. These findings highlight overthinking as a shared and exploitable vulnerability in modern reasoning systems, underscoring the need for more robust defenses.

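To make the composite fitness idea concrete, the following is a minimal sketch of a score that rewards both response length and reflective markers. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the marker list, the weights, or the way the two terms are combined, so `REFLECTION_MARKERS`, `length_weight`, `marker_weight`, and the weighted-sum form are all hypothetical. Whitespace-separated words stand in for model tokens.

```python
import re

# Hypothetical marker phrases; the source does not list the ones it uses.
REFLECTION_MARKERS = (
    "wait",
    "hmm",
    "alternatively",
    "let me re-check",
    "on second thought",
)


def count_markers(response: str, markers=REFLECTION_MARKERS) -> int:
    """Count case-insensitive, whole-phrase occurrences of reflective markers."""
    text = response.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(marker) + r"\b", text))
        for marker in markers
    )


def composite_fitness(
    response: str,
    length_weight: float = 1.0,   # assumed weight on response length
    marker_weight: float = 50.0,  # assumed weight on each reflective marker
) -> float:
    """Weighted sum of response length and reflective-marker count.

    The weighted-sum form is an assumption; the source only states that the
    fitness targets both quantities.
    """
    n_tokens = len(response.split())  # crude proxy for model token count
    return length_weight * n_tokens + marker_weight * count_markers(response)
```

In a genetic search, a score of this kind would be computed on the target model's response to each candidate prompt, and higher-scoring candidates would be favored during selection.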