Understanding cyber security is increasingly important for individuals and organizations. However, much of the information related to cyber security is difficult to understand for those not familiar with the topic. In this study, we investigate how large language models (LLMs) could be used for automatic text simplification (ATS) of Common Vulnerabilities and Exposures (CVE) descriptions. Automatic text simplification has been studied in several contexts, such as medical, scientific, and news texts, but it has not yet been applied to texts in the rapidly changing and complex domain of cyber security. We created a baseline for cyber security ATS and a test dataset of 40 CVE descriptions, evaluated by two groups of cyber security experts in two survey rounds. We found that while out-of-the-box LLMs can make the text appear simpler, they struggle with meaning preservation. Code and data are available at https://version.aalto.fi/gitlab/vehomav1/simplification_nmi.