Open-source ecosystems such as NPM and PyPI are increasingly targeted by supply chain attacks, yet existing detection methods rely either on fragile handcrafted rules or on data-driven features that fail to capture evolving attack semantics. We present IntelGuard, a framework based on retrieval-augmented generation (RAG) that integrates expert analytical reasoning into automated malicious package detection. IntelGuard constructs a structured knowledge base from over 8,000 threat intelligence reports, linking malicious code snippets with behavioral descriptions and expert reasoning. When analyzing a new package, it retrieves semantically similar malicious examples and applies LLM-guided reasoning to assess whether the code's behavior aligns with the package's intended functionality. Experiments on 4,027 real-world packages show that IntelGuard achieves 99% accuracy and a 0.50% false positive rate, while maintaining 96.5% accuracy on obfuscated code. Deployed on PyPI.org, it discovered 54 previously unreported malicious packages, demonstrating interpretable and robust detection guided by expert knowledge.
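The retrieve-then-reason flow described in the abstract can be illustrated with a minimal sketch. It is not IntelGuard's implementation: the knowledge base entries, the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), and the prompt wording are all hypothetical. A simple bag-of-identifiers cosine similarity stands in for whatever semantic retrieval the system actually uses, and the LLM call itself is left out.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge base: (malicious snippet, behavior description).
# The real knowledge base is built from threat intelligence reports.
KNOWLEDGE_BASE = [
    ("requests.post(url, data=os.environ)",
     "exfiltrates environment variables over HTTP"),
    ("exec(base64.b64decode(payload))",
     "decodes and executes a hidden payload"),
    ("subprocess.Popen(['curl', url, '|', 'sh'])",
     "downloads and runs a remote script"),
]


def tokenize(code):
    """Count lowercase identifier tokens (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_code, k=2):
    """Return the k knowledge base entries most similar to query_code."""
    query = tokenize(query_code)
    scored = [
        (cosine(query, tokenize(snippet)), snippet, description)
        for snippet, description in KNOWLEDGE_BASE
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def build_prompt(query_code, declared_purpose, k=2):
    """Assemble a prompt asking an LLM whether behavior matches the purpose."""
    lines = [
        f"Declared purpose of the package: {declared_purpose}",
        "Code under analysis:",
        query_code,
        "Similar known-malicious examples:",
    ]
    for score, snippet, description in retrieve(query_code, k):
        lines.append(f"- {snippet} ({description}; similarity {score:.2f})")
    lines.append("Does the code's behavior align with the declared purpose?")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("exec(base64.b64decode(blob))",
                       "a color formatting helper"))
```

The resulting prompt would then be passed to an LLM, which judges whether behavior such as decoding and executing a payload is plausible for the declared purpose. Retrieved examples give the model concrete precedent to reason against rather than relying on fixed rules.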