AI-based code generators have become pivotal in helping developers write software from natural language (NL) descriptions. However, they are trained on large amounts of data, often collected from unsanitized online sources (e.g., GitHub, HuggingFace). As a consequence, AI models become an easy target for data poisoning, i.e., an attack that injects malicious samples into the training data so that the trained model generates vulnerable code. To address this threat, this work investigates the security of AI code generators by devising a targeted data poisoning strategy. We poison the training data by injecting increasing amounts of code containing security vulnerabilities and assess the attack's success on different state-of-the-art models for code generation. Our study shows that AI code generators are vulnerable to even a small amount of poison. Notably, the attack success strongly depends on the model architecture and poisoning rate, whereas it is not influenced by the type of vulnerability. Moreover, since the attack does not impact the correctness of code generated by pre-trained models, it is hard to detect. Lastly, our work offers practical insights into understanding and potentially mitigating this threat.
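To make the notion of a poisoning rate concrete, the following is a minimal sketch of how a fixed fraction of a training set of (NL description, code) pairs could be swapped for vulnerable counterparts. It is an illustration under stated assumptions, not the procedure used in this work: the function `poison`, the `unsafe_versions` mapping, and the rule that the rate is taken over the whole training set are all hypothetical, and the abstract does not specify how samples are selected.

```python
import random


def poison(samples, unsafe_versions, rate, seed=0):
    """Return a copy of `samples` with a fraction `rate` of them poisoned.

    samples:         list of (nl_description, code) pairs.
    unsafe_versions: dict mapping a sample index to a vulnerable variant
                     of that sample's code (hypothetical input; only
                     these samples can be targeted).
    rate:            fraction of the whole training set to poison
                     (assumed definition of the poisoning rate).
    """
    n_poison = round(rate * len(samples))
    if n_poison > len(unsafe_versions):
        raise ValueError("not enough targetable samples for this rate")

    # Pick which targetable samples to poison, reproducibly.
    rng = random.Random(seed)
    chosen = set(rng.sample(sorted(unsafe_versions), n_poison))

    # The NL description is left untouched; only the code is replaced.
    return [
        (nl, unsafe_versions[i] if i in chosen else code)
        for i, (nl, code) in enumerate(samples)
    ]


# Toy data: 100 samples, 10 of which have a vulnerable counterpart.
samples = [(f"description {i}", f"safe code {i}") for i in range(100)]
unsafe_versions = {i: f"vulnerable code {i}" for i in range(10)}

poisoned = poison(samples, unsafe_versions, rate=0.05)
changed = sum(1 for old, new in zip(samples, poisoned) if old != new)
print(changed)
```

Sweeping `rate` over increasing values and retraining on each poisoned copy mirrors, in spirit, the "increasing amounts" of poison described above. Because the descriptions are unchanged and most samples stay intact, a poisoned set built this way is hard to tell apart from a clean one by inspection.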