This work analyzes the use of large language models (LLMs) for detecting domains produced by domain generation algorithms (DGAs). We provide a detailed evaluation of two key techniques, In-Context Learning (ICL) and Supervised Fine-Tuning (SFT), and show how each improves detection. SFT boosts performance by leveraging domain-specific data, whereas ICL lets the detector adapt quickly to new threats with minimal retraining. We use Meta's Llama3 8B model on a custom dataset covering 68 malware families and benign domains, including several hard-to-detect schemes such as recent word-based DGAs. The results show that LLM-based methods achieve competitive performance in DGA detection. In particular, the SFT-based LLM DGA detector outperforms state-of-the-art models that use attention layers, reaching 94% accuracy with a 4% false positive rate (FPR) and excelling at detecting word-based DGA domains.
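To make the ICL approach concrete, the sketch below shows how a few-shot classification prompt for a DGA detector might be assembled. This is a minimal illustration, not the paper's actual prompt: the example domains, labels, and wording are all hypothetical, and the resulting string would be sent to an instruction-tuned LLM such as Llama3 8B.

```python
# Hypothetical sketch of In-Context Learning (ICL) for DGA detection:
# a few labeled domains are placed in the prompt and the model is asked
# to classify a new domain, with no parameter updates. The example
# domains below are illustrative, not taken from the paper's dataset.

FEW_SHOT_EXAMPLES = [
    ("google.com", "benign"),
    ("xjkq3vz9mt.net", "dga"),         # random-character DGA style
    ("wikipedia.org", "benign"),
    ("brightwindowtable.com", "dga"),  # word-based DGA style
]

def build_icl_prompt(domain: str) -> str:
    """Assemble a few-shot classification prompt for an instruction-tuned LLM."""
    lines = ["Classify each domain as 'benign' or 'dga'.", ""]
    for example_domain, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Domain: {example_domain}\nLabel: {label}")
    # The query domain is appended with an open label for the model to fill in.
    lines.append(f"Domain: {domain}\nLabel:")
    return "\n".join(lines)

prompt = build_icl_prompt("qprsk8w2lx.info")
print(prompt)
```

Because the labeled examples live entirely in the prompt, swapping in domains from a newly observed malware family updates the detector immediately, which is the adaptability advantage the abstract attributes to ICL.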