This paper examines the application of Positive-Unlabeled (PU) learning and Negative-Unlabeled (NU) learning to cybersecurity, a domain where these methods remain underexplored. Although these semi-supervised methods have been applied successfully in fields such as medicine and marketing, their potential in cybersecurity is largely untapped. We identify key areas of cybersecurity (intrusion detection, vulnerability management, malware detection, and threat intelligence) where PU/NU learning can offer significant improvements, particularly when labeled data is imbalanced or scarce. For each subfield we provide a detailed problem formulation supported by mathematical reasoning, and we highlight the challenges and research gaps in scaling these methods to real-time systems, addressing class imbalance, and adapting to evolving threats. Finally, we propose future directions for integrating PU/NU learning into cybersecurity so that emerging cyber threats can be better detected, managed, and mitigated.