Large Language Models (LLMs) can be driven into over-generation, emitting thousands of tokens before producing an end-of-sequence (EOS) token. This degrades answer quality, inflates latency and cost, and can be weaponized as a denial-of-service (DoS) attack. Recent work has begun to study DoS-style prompt attacks, but typically focuses on a single attack algorithm or assumes white-box access, without an attack-side benchmark that compares prompt-based attackers in a black-box, query-only regime with a known tokenizer. We introduce such a benchmark and study two prompt-only attackers. The first is an Evolutionary Over-Generation Prompt Search (EOGen) that searches the token space for prefixes that suppress EOS and induce long continuations. The second is a goal-conditioned reinforcement learning attacker (RL-GOAL) that trains a network to generate prefixes conditioned on a target length. To characterize behavior, we introduce the Over-Generation Factor (OGF): the ratio of produced tokens to a model's context window, along with stall and latency summaries. EOGen discovers short-prefix attacks that raise Phi-3 to OGF = 1.39 ± 1.14 (Success@≥2: 25.2%); RL-GOAL nearly doubles severity to OGF = 2.70 ± 1.43 (Success@≥2: 64.3%) and drives budget-hit non-termination in 46% of trials.
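The OGF metric and the Success@≥2 statistic defined above can be sketched in a few lines; the function names and the 4096-token context window used here are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of the Over-Generation Factor (OGF) and Success@>=2 metrics.
# Helper names and the context-window size are assumptions for illustration.

def ogf(produced_tokens: int, context_window: int) -> float:
    """OGF: ratio of tokens the model emitted to its context window."""
    return produced_tokens / context_window

def success_at(ogf_values, threshold: float = 2.0) -> float:
    """Fraction of attack trials whose OGF meets or exceeds the threshold."""
    hits = sum(1 for v in ogf_values if v >= threshold)
    return hits / len(ogf_values)

# Example with a hypothetical 4096-token context window:
trial_lengths = [8192, 12288, 2048, 16384]   # tokens emitted per trial
factors = [ogf(t, 4096) for t in trial_lengths]
print(factors)              # [2.0, 3.0, 0.5, 4.0]
print(success_at(factors))  # 0.75
```

A trial with OGF ≥ 1 has already exceeded the model's full context window of output, so thresholds like Success@≥2 flag trials that over-generate by at least twice that budget.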