Large language models (LLMs) have demonstrated impressive capabilities in code generation by leveraging retrieval-augmented generation (RAG) methods. However, the computational costs of LLM inference, particularly latency and energy consumption, have received limited attention in the security context. This paper introduces DrainCode, the first adversarial attack targeting the computational efficiency of RAG-based code generation systems. By strategically poisoning retrieval contexts through a mutation-based approach, DrainCode forces LLMs to produce significantly longer outputs, thereby increasing GPU latency and energy consumption. We evaluate the effectiveness of DrainCode across multiple models. Our experiments show that DrainCode achieves up to an 85% increase in latency, a 49% increase in energy consumption, and more than a 3x increase in output length compared to the baseline. Furthermore, we demonstrate that the attack generalizes across different prompting strategies and remains effective against several defenses. These results highlight DrainCode as a practical method for inflating the computational overhead of LLMs, making it useful for evaluating LLM security in resource-constrained environments. We provide code and data at https://github.com/DeepSoftwareAnalytics/DrainCode.