Managing the threat posed by malware requires accurate detection and classification techniques. Traditional detection strategies, such as signature scanning, rely on manual analysis of malware to extract relevant features, which is labor-intensive and requires expert knowledge. A function call graph consists of a program's functions and the inter-procedural calls between them, providing a rich source of information that can be used to classify malware without the labor-intensive feature extraction step of traditional techniques. In this research, we treat malware classification as a graph classification problem. Using Local Degree Profile features, we train a wide range of Graph Neural Network (GNN) architectures to generate embeddings, which we then classify. We find that our best GNN models outperform comparable previous research on the well-known MalNet-Tiny Android malware dataset. In addition, our GNN models do not suffer from the overfitting issues that commonly afflict non-GNN techniques, although they require longer training times.
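To make the Local Degree Profile (LDP) features concrete, the following is a minimal sketch of how they might be computed for a small call graph. It assumes the standard LDP definition (a node's degree plus the minimum, maximum, mean, and standard deviation of its neighbors' degrees) and treats the call graph as undirected. The function name `local_degree_profile` and the toy edge list are hypothetical and chosen for illustration only; this is not the pipeline used in this research.

```python
import numpy as np


def local_degree_profile(num_nodes, edges):
    """Return a (num_nodes, 5) array of LDP features.

    Columns: degree, then min, max, mean, and std of the neighbors' degrees.
    Assumption: edges are treated as undirected and self-loops are ignored.
    """
    neighbors = [set() for _ in range(num_nodes)]
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)

    degree = np.array([len(nbrs) for nbrs in neighbors], dtype=float)

    features = np.zeros((num_nodes, 5))
    features[:, 0] = degree
    for node, nbrs in enumerate(neighbors):
        if nbrs:  # isolated nodes keep zeros for the neighbor statistics
            d = degree[list(nbrs)]
            features[node, 1:] = [d.min(), d.max(), d.mean(), d.std()]
    return features


# Hypothetical toy call graph: function 0 calls 1, 2, and 3; function 1 calls 2.
edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
ldp = local_degree_profile(4, edges)
print(ldp)
```

Each row of `ldp` is the five-dimensional feature vector for one function node. Because these features depend only on graph structure, they require no manual analysis of the underlying code.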