Sharing knowledge about emerging threats is crucial in the rapidly advancing field of cybersecurity and forms the foundation of Cyber Threat Intelligence. In this context, Large Language Models are becoming increasingly significant and present a wide range of opportunities. This study explores the capability of chatbots such as ChatGPT, GPT4all, Dolly, Stanford Alpaca, Alpaca-LoRA, and Falcon to identify cybersecurity-related text within Open Source Intelligence. We assess existing chatbot models on two Natural Language Processing tasks: binary classification and Named Entity Recognition. The evaluation uses well-established Twitter datasets drawn from previous research. For cybersecurity binary classification, the commercial model GPT-4 achieved an acceptable F1-score of 0.94, and the open-source GPT4all model achieved an F1-score of 0.90. For cybersecurity entity recognition, however, the chatbot models have limitations and are less effective. The study thus shows that these chatbots are capable only on specific tasks, such as cybersecurity binary classification, and highlights the need for further refinement on others, such as Named Entity Recognition.