Large Language Models (LLMs) are an advanced evolution of earlier, simpler language models. They handle complex language patterns more capably and can generate coherent text, images, audio, and video. They can also be fine-tuned for specific tasks. This versatility has led to the proliferation and widespread use of numerous commercial large models. However, the rapid expansion of LLMs has raised security and ethical concerns within the academic community, underscoring the need for ongoing research into security evaluation during their development and deployment. Over the past few years, a substantial body of research has been dedicated to the security evaluation of large-scale models. This article presents an in-depth review of the most recent advancements in this field, providing a comprehensive analysis of commonly used evaluation metrics, advanced evaluation frameworks, and the routine evaluation processes for LLMs. We also discuss future directions for advancing the security evaluation of LLMs.