This study independently reproduces the malware detection methodology presented by Fellicious et al. [7], which applies Random Forest classification to order-invariant API call frequencies. We used the original public dataset (250,533 training samples, 83,511 test samples) and replicated four model variants: Unigram, Bigram, Trigram, and Combined n-gram. Our reproduction validated all key findings, with F1-scores exceeding the original results by 0.99% to 2.57% across all models at the optimal API call length of 2,500. The Unigram model achieved F1=0.8717 (original: 0.8631), confirming its effectiveness as a lightweight malware detector. Across three independent experimental runs with different random seeds, results were consistent, with standard deviations below 0.5%, demonstrating high reproducibility. This study validates the robustness and scientific rigor of the original methodology and confirms the practical viability of frequency-based API call analysis for malware detection.
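To make the feature construction concrete, the following is a minimal sketch of order-invariant n-gram frequency counting over a truncated API call trace. It is an illustration under stated assumptions, not the implementation used in this study or in [7]: the function names `ngram_counts` and `to_vector` and the example API names are hypothetical, and "order-invariant" is read here as discarding where an n-gram occurs in the trace while keeping the order of calls inside each n-gram.

```python
from collections import Counter
from typing import List, Sequence, Tuple

MAX_CALLS = 2500  # API call length reported as optimal in the abstract


def ngram_counts(calls: Sequence[str], n: int, max_calls: int = MAX_CALLS) -> Counter:
    """Count n-grams over the first max_calls API calls, ignoring their positions."""
    window = list(calls[:max_calls])
    # range() is empty when the trace is shorter than n, so short traces yield no n-grams.
    return Counter(tuple(window[i:i + n]) for i in range(len(window) - n + 1))


def to_vector(counts: Counter, vocabulary: List[Tuple[str, ...]]) -> List[int]:
    """Project counts onto a fixed vocabulary so every sample has the same length."""
    return [counts.get(gram, 0) for gram in vocabulary]


# Hypothetical trace; the API names are illustrative, not taken from the dataset.
trace = ["NtOpenFile", "NtReadFile", "NtClose", "NtOpenFile", "NtReadFile"]
unigrams = ngram_counts(trace, 1)
bigrams = ngram_counts(trace, 2)

# A "Combined" feature vector could concatenate the per-n vectors (an assumption).
vocab_1 = sorted(unigrams)
vocab_2 = sorted(bigrams)
combined = to_vector(unigrams, vocab_1) + to_vector(bigrams, vocab_2)
```

Vectors built this way could then be passed to a Random Forest implementation such as scikit-learn's `RandomForestClassifier`. The vocabulary would normally be fixed from the training set only, so that test samples are projected onto the same feature columns.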