The recent progress of Large Language Models (LLMs) has brought great success to data-centric applications. LLMs trained on massive textual datasets have shown the ability not only to encode context but also to provide powerful comprehension for downstream tasks. Interestingly, Generative Pre-trained Transformers have leveraged this ability to bring AI a step closer to replacing humans, at least in data-centric applications. Such power can be harnessed to identify anomalies indicating cyber threats, enhance incident response, and automate routine security operations. We provide an overview of the recent activities of LLMs in cyber defence areas, along with a categorization of those areas: threat intelligence, vulnerability assessment, network security, privacy preservation, awareness and training, automation, and ethical guidelines. Fundamental concepts of the progression of LLMs, from Transformers through pre-trained Transformers to GPT, are presented. Next, the recent works in each area are surveyed together with their strengths and weaknesses. A dedicated section covers the challenges and directions of LLMs in cyber security. Finally, possible future research directions for benefiting from LLMs in cyber security are discussed.