Large language models (LLMs) are widely deployed, but their substantial compute demands make them vulnerable to inference cost attacks that deliberately maximize output length. In this work, we investigate a distinct attack surface: maximizing inference cost by tampering with the model parameters rather than the inputs. This approach leverages the established capability of Bit-Flip Attacks (BFAs) to persistently alter model behavior via minute weight perturbations, effectively decoupling the attack from specific input queries. To realize this, we propose BitHydra, a framework that addresses the unique optimization challenge of identifying the exact weight bits whose flips maximize generation cost. We formulate the attack as a constrained Binary Integer Programming (BIP) problem designed to systematically suppress the probability of the end-of-sequence (i.e., <eos>) token. To overcome the intractability of the discrete search space, we relax the problem into a continuous optimization task and solve it via the Alternating Direction Method of Multipliers (ADMM). We evaluate BitHydra across 10 LLMs (1.5B-16B parameters). Our results demonstrate that the proposed optimization method efficiently induces endless generation with as few as 1-4 bit flips on all tested models, verifying the effectiveness of the ADMM-based formulation against both standard models and potential defenses.
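To make the formulation concrete, below is a minimal sketch of the two ingredients named above: an <eos>-suppression objective and an ADMM loop that relaxes the binary flip mask to a continuous variable. This is not the paper's implementation; it assumes a HuggingFace-style causal LM, and the names `eos_suppression_loss`, `admm_bit_search`, and the caller-supplied `loss_fn` (a differentiable surrogate that applies the relaxed mask `x` to a pre-selected set of candidate weight bits before computing the <eos> log-probability) are hypothetical illustrations.

```python
# Hedged sketch of the BIP-relaxation idea, NOT the authors' released code.
import torch

def eos_suppression_loss(model, input_ids, eos_id, steps=32):
    """Mean log-probability of <eos> over `steps` greedy decoding steps.
    Minimizing this suppresses termination, lengthening generation."""
    ids = input_ids.clone()
    logps = []
    for _ in range(steps):
        logits = model(ids).logits[:, -1, :]          # next-token logits
        logp = torch.log_softmax(logits, dim=-1)
        logps.append(logp[:, eos_id])                  # <eos> log-prob
        ids = torch.cat([ids, logits.argmax(dim=-1, keepdim=True)], dim=-1)
    return torch.stack(logps).mean()

def admm_bit_search(loss_fn, n_bits, k=4, rho=1.0, iters=50, lr=0.1):
    """ADMM over a relaxed flip mask: minimize loss_fn(x) subject to
    x in {0,1}^n with at most k ones, via the splitting x = z."""
    x = torch.zeros(n_bits, requires_grad=True)        # relaxed mask
    z = torch.zeros(n_bits)                            # binary projection
    u = torch.zeros(n_bits)                            # scaled dual
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        # x-update: attack objective plus the ADMM quadratic penalty
        obj = loss_fn(x) + 0.5 * rho * ((x - z + u) ** 2).sum()
        obj.backward()
        opt.step()
        with torch.no_grad():
            # z-update: project x + u onto {0,1}^n with at most k ones
            v = (x + u).clamp(0.0, 1.0)
            vals, idx = v.topk(k)
            z = torch.zeros(n_bits)
            z[idx] = (vals > 0.5).float()
            # dual update
            u += x - z
    return z  # binary mask of candidate bits to flip
```

The z-update is exactly the projection onto the cardinality-constrained binary set, which is what makes the otherwise intractable discrete bit search amenable to gradient-based optimization.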