KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment

Cyber threat intelligence (CTI) is a critical tool that many organizations and individuals use to protect themselves from sophisticated, organized, persistent, and weaponized cyber attacks. However, few studies have addressed quality assessment of the threat intelligence provided by intelligence platforms, and this work still relies on manual analysis by cybersecurity experts. In this paper, we propose the Knowledge Graph Verifier (KGV), a novel CTI quality assessment framework that combines knowledge graphs with Large Language Models (LLMs). Our approach uses LLMs to automatically extract the key claims to be verified from open-source CTI (OSCTI) and fact-checks them against a knowledge graph built from paragraphs. Unlike traditional approaches, which construct complex knowledge graphs with entities as nodes, KGV uses paragraphs as nodes and semantic similarity as edges, which enhances the model's semantic understanding and simplifies labeling requirements. Additionally, to fill a gap in the field, we create and publicly release a dataset for threat intelligence assessment from heterogeneous sources; to the best of our knowledge, it is the first dataset for threat intelligence reliability verification, and it provides a reference for future research. Experimental results show that KGV significantly improves the performance of LLMs in intelligence quality assessment. Compared with traditional methods, our approach requires far less data annotation while the model still exhibits strong reasoning capabilities. Finally, our method can achieve XXX accuracy in cyber threat assessment.
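To make the paragraph-as-node idea concrete, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes that a bag-of-words cosine score can stand in for the learned semantic similarity that defines edges; the function names, the example paragraphs, and the 0.3 edge threshold are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a crude stand-in for a sentence embedding (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_paragraph_graph(paragraphs, threshold=0.3):
    """Nodes are paragraph indices; an edge links two paragraphs whose
    similarity is at least `threshold` (hypothetical value)."""
    vectors = [bag_of_words(p) for p in paragraphs]
    graph = {i: {} for i in range(len(paragraphs))}
    for i in range(len(paragraphs)):
        for j in range(i + 1, len(paragraphs)):
            sim = cosine(vectors[i], vectors[j])
            if sim >= threshold:
                graph[i][j] = sim
                graph[j][i] = sim
    return graph


def retrieve_evidence(claim, paragraphs, graph):
    """Return the paragraph most similar to the claim, followed by its
    graph neighbours ordered by edge weight."""
    claim_vec = bag_of_words(claim)
    best = max(
        range(len(paragraphs)),
        key=lambda i: cosine(claim_vec, bag_of_words(paragraphs[i])),
    )
    neighbours = sorted(graph[best], key=graph[best].get, reverse=True)
    return [best] + neighbours


# Illustrative, made-up paragraphs and claim.
paragraphs = [
    "The Emotet malware spreads through phishing emails with malicious attachments.",
    "Phishing emails with malicious attachments deliver the Emotet malware to victims.",
    "The quarterly budget meeting was moved to Friday afternoon.",
]
graph = build_paragraph_graph(paragraphs)
evidence = retrieve_evidence("Emotet is delivered via phishing emails.", paragraphs, graph)
```

In this sketch, the retrieved paragraphs would be handed to an LLM together with the extracted claim for a verdict. No entity or relation labels are needed to build the graph, which illustrates the reduced annotation burden described above.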
