LLM-FS: Zero-Shot Feature Selection for Effective and Interpretable Malware Detection

Feature selection (FS) remains essential for building accurate and interpretable detection models, particularly on high-dimensional malware datasets. Conventional FS methods such as Extra Trees, Variance Threshold, Tree-based models, Chi-Squared tests, ANOVA, Random Selection, and Sequential Attention rely primarily on statistical heuristics or model-driven importance scores, and often overlook the semantic context of features. Motivated by recent progress in LLM-driven FS, we investigate whether large language models (LLMs) can guide feature selection in a zero-shot setting, using only feature names and task descriptions, as a viable alternative to traditional approaches. We evaluate multiple LLMs (GPT-5.0, GPT-4.0, Gemini-2.5, etc.) on the EMBOD dataset (a fusion of the EMBER and BODMAS benchmark datasets), comparing them against established FS methods across several classifiers, including Random Forest, Extra Trees, MLP, and KNN. Performance is assessed using accuracy, precision, recall, F1, AUC, MCC, and runtime. Our results demonstrate that LLM-guided zero-shot feature selection performs competitively with traditional FS methods while offering additional advantages in interpretability, stability, and reduced dependence on labeled data. These findings position zero-shot LLM-based FS as a promising alternative strategy for effective and interpretable malware detection, paving the way for knowledge-guided feature selection in security-critical applications.
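
To make the zero-shot setting concrete, the following is a minimal sketch of how a selector might be driven by feature names and a task description alone. It is not the authors' implementation: the prompt wording, the function names (`build_prompt`, `parse_selection`, `select_features`), the example feature names, and the `fake_llm` stand-in are all assumptions made for illustration. A real run would replace `fake_llm` with a call to an actual model.

```python
import json
from typing import Callable, List


def build_prompt(task: str, feature_names: List[str], k: int) -> str:
    """Compose a zero-shot prompt: only feature names and the task, no labels or values."""
    listing = "\n".join(f"- {name}" for name in feature_names)
    return (
        f"Task: {task}\n"
        f"Candidate features:\n{listing}\n"
        f"Select the {k} most relevant features for this task. "
        "Answer with a JSON array of feature names only."
    )


def parse_selection(reply: str, feature_names: List[str], k: int) -> List[str]:
    """Keep only names present in the candidate list, de-duplicated, at most k."""
    start, end = reply.find("["), reply.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        items = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return []
    valid = set(feature_names)
    selected: List[str] = []
    for item in items:
        # Drop anything the model invented or repeated.
        if isinstance(item, str) and item in valid and item not in selected:
            selected.append(item)
    return selected[:k]


def select_features(task: str, feature_names: List[str], k: int,
                    ask_llm: Callable[[str], str]) -> List[str]:
    """Run one zero-shot selection round using the supplied model callable."""
    prompt = build_prompt(task, feature_names, k)
    return parse_selection(ask_llm(prompt), feature_names, k)


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; the canned reply
    # deliberately includes one name that is not a candidate feature.
    return 'Sure: ["imports_count", "section_entropy", "not_a_feature"]'


# Hypothetical feature names, not taken from EMBER or BODMAS.
FEATURES = ["file_size", "section_entropy", "imports_count", "timestamp"]
SELECTED = select_features(
    "Classify Windows PE files as malware or benign.", FEATURES, 2, fake_llm
)
print(SELECTED)
```

Validating the reply against the candidate list is the main design choice here: a model may return names that do not exist in the dataset, and filtering them keeps the selected subset usable by any downstream classifier.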


