Pre-trained programming language (PL) models (such as CodeT5, CodeBERT, and GraphCodeBERT) have the potential to automate software engineering tasks involving code understanding and code generation. However, these models operate in the natural channel of code, i.e., they are primarily concerned with the human understanding of the code. They are not robust to changes in the input and are thus potentially susceptible to adversarial attacks in the natural channel. We propose CodeAttack, a simple yet effective black-box attack model that uses code structure to generate effective, efficient, and imperceptible adversarial code samples, and that demonstrates the vulnerabilities of state-of-the-art PL models to code-specific adversarial attacks. We evaluate the transferability of CodeAttack on several code-code (translation and repair) and code-NL (summarization) tasks across different programming languages. CodeAttack outperforms state-of-the-art adversarial NLP attack models, achieving the best overall drop in performance while being more efficient, imperceptible, consistent, and fluent. The code can be found at https://github.com/reddy-lab-code-research/CodeAttack.
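To make the idea of a black-box, structure-aware attack concrete, the following is a minimal toy sketch and not the CodeAttack algorithm itself. It assumes the attacker can only query a victim for a task score, and it restricts edits to identifier tokens found with Python's `tokenize` module. The names `toy_victim`, `attack`, `rename`, and `identifiers`, the cue-word scoring rule, and the `v<i>` replacement names are all hypothetical and chosen only for illustration.

```python
import builtins
import io
import keyword
import re
import tokenize
from typing import Callable, List


def identifiers(code: str) -> List[str]:
    """Distinct identifier names in order of first appearance (no keywords or builtins)."""
    seen: List[str] = []
    for tok in tokenize.generate_tokens(io.StringIO(code).readline):
        if (tok.type == tokenize.NAME
                and not keyword.iskeyword(tok.string)
                and not hasattr(builtins, tok.string)
                and tok.string not in seen):
            seen.append(tok.string)
    return seen


def rename(code: str, old: str, new: str) -> str:
    """Rename every NAME token equal to `old`, leaving strings and comments untouched."""
    lines = code.splitlines(keepends=True)
    hits = [t for t in tokenize.generate_tokens(io.StringIO(code).readline)
            if t.type == tokenize.NAME and t.string == old]
    for tok in reversed(hits):  # right-to-left keeps earlier column offsets valid
        row, col = tok.start
        line = lines[row - 1]
        lines[row - 1] = line[:col] + new + line[col + len(old):]
    return "".join(lines)


def toy_victim(code: str) -> float:
    """Hypothetical stand-in for a model's task score: share of cue identifiers present."""
    cues = ("total", "price", "items")
    names = set(re.findall(r"[A-Za-z_]\w*", code))
    return sum(cue in names for cue in cues) / len(cues)


def attack(code: str, victim: Callable[[str], float], max_changes: int = 2) -> str:
    """Greedy query-only attack: at each step apply the rename that lowers the score most."""
    adversarial = code
    for _ in range(max_changes):
        base = victim(adversarial)
        best = None  # (score, candidate code)
        for i, name in enumerate(identifiers(adversarial)):
            # Assumes the input does not already use names of the form v<i>.
            candidate = rename(adversarial, name, f"v{i}")
            score = victim(candidate)
            if score < base and (best is None or score < best[0]):
                best = (score, candidate)
        if best is None:  # no single rename hurts the victim any further
            break
        adversarial = best[1]
    return adversarial


SAMPLE = (
    "def total_price(items):\n"
    "    total = 0\n"
    "    for item in items:\n"
    "        total += item.price\n"
    "    return total\n"
)

if __name__ == "__main__":
    adversarial = attack(SAMPLE, toy_victim)
    print(adversarial)
    print(toy_victim(SAMPLE), toy_victim(adversarial))
```

The sketch only uses the victim's output score, which is what makes it black-box, and the `max_changes` budget is a crude proxy for imperceptibility. It does not check semantic equivalence (renaming an attribute such as `price` would change behavior), and a realistic attack would draw substitutes from a learned model rather than fixed placeholder names.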