This paper presents CyberSecEval, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants. We believe it to be the most extensive unified cybersecurity safety benchmark to date. CyberSecEval provides a thorough evaluation of LLMs in two crucial security domains: their propensity to generate insecure code and their level of compliance when asked to assist in cyberattacks. In a case study involving seven models from the Llama 2, Code Llama, and OpenAI GPT large language model families, CyberSecEval effectively pinpointed key cybersecurity risks. More importantly, it offered practical insights for refining these models. A significant observation from the study was the tendency of more advanced models to suggest insecure code, which highlights the critical need to integrate security considerations into the development of sophisticated LLMs. With its automated test case generation and evaluation pipeline, CyberSecEval covers a broad scope and equips LLM designers and researchers with a tool to measure and enhance the cybersecurity safety properties of LLMs, contributing to the development of more secure AI systems.