As the number and sophistication of cyber attacks have increased, threat hunting has become a critical aspect of active security, enabling proactive detection and mitigation of threats before they cause significant harm. Open-source cyber threat intelligence (OSCTI) is a valuable resource for threat hunters; however, it often comes in unstructured formats that require manual analysis. Previous studies aimed at automating OSCTI analysis are limited in three ways: (1) they fail to provide actionable outputs, (2) they do not take advantage of images present in OSCTI sources, and (3) they focus on on-premises environments, overlooking the growing importance of cloud environments. To address these gaps, we propose LLMCloudHunter, a novel framework that leverages large language models (LLMs) to automatically generate generic-signature detection rule candidates from textual and visual OSCTI data. We evaluated the quality of the rules generated by the proposed framework using 12 annotated real-world cloud threat reports. The results show that our framework achieved a precision of 92% and a recall of 98% for extracting API calls made by the threat actor, and a precision of 99% and a recall of 98% for extracting indicators of compromise (IoCs). Additionally, 99.18% of the generated detection rule candidates were successfully compiled and converted into Splunk queries.