As large language models (LLMs) move from research prototypes to enterprise systems, their security vulnerabilities pose serious risks to data privacy and system integrity. This study benchmarks ten Llama model variants against the OWASP Top 10 for LLM Applications framework, evaluating threat detection accuracy, response safety, and computational overhead. Using the FABRIC testbed with NVIDIA A30 GPUs, we tested five standard Llama models and five Llama Guard variants on 100 adversarial prompts covering ten vulnerability categories. Our results reveal significant differences in security performance: the compact Llama-Guard-3-1B model achieved the highest detection rate, 76%, with minimal latency (0.165s per test), whereas base models such as Llama-3.1-8B detected no threats (0% accuracy) despite longer inference times (0.754s per test). We observe an inverse relationship between model size and security effectiveness, suggesting that smaller, safety-specialized models often outperform larger general-purpose ones on security tasks. Additionally, we release an open-source benchmark dataset of adversarial prompts, threat labels, and attack metadata to support reproducible research in AI security [1].
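To make the evaluation procedure concrete, the following is a minimal sketch of the per-prompt scoring loop implied above, assuming a Hugging Face transformers harness, a (prompt, label) dataset schema, and the standard Llama Guard convention of emitting a verdict that begins with "safe" or "unsafe"; the model ID, dataset layout, and helper names are illustrative assumptions, and the paper's actual pipeline on the FABRIC testbed may differ.

```python
# Sketch of the benchmark loop: classify each adversarial prompt with a
# guard model, then report detection accuracy and mean per-test latency.
# Model ID and dataset schema are assumptions for illustration only.
import time
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-Guard-3-1B"  # one of the evaluated variants

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def is_flagged_unsafe(prompt: str) -> bool:
    """Moderate a single user prompt with the guard model.

    Llama Guard generates a short verdict ("safe" or "unsafe" plus a
    category code); we treat any "unsafe" verdict as a detected threat.
    """
    chat = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    output = model.generate(input_ids, max_new_tokens=20, do_sample=False)
    verdict = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
    return "unsafe" in verdict.lower()

def run_benchmark(dataset):
    """dataset: list of (prompt, is_adversarial) pairs spanning the ten
    OWASP LLM categories (100 prompts in the study)."""
    correct, latencies = 0, []
    for prompt, label in dataset:
        start = time.perf_counter()
        flagged = is_flagged_unsafe(prompt)
        latencies.append(time.perf_counter() - start)
        correct += int(flagged == label)
    return correct / len(dataset), sum(latencies) / len(latencies)
```

Timing each call with a monotonic clock around the single generate step mirrors how a per-test latency figure such as 0.165s could be obtained; accuracy here is simple agreement between the flag and the ground-truth threat label.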