Malware sandboxes provide many benefits for security applications, but they are complex. This complexity can overwhelm new users in different research areas and make it difficult to select, configure, and use sandboxes. Even worse, using sandboxes incorrectly can have a negative impact on security applications. In this paper, we address this knowledge gap by systematizing 84 representative papers from the academic literature that use x86/64 malware sandboxes. We propose a novel framework that simplifies sandbox components and organizes the literature, and from it we derive practical guidelines for using sandboxes. We evaluate the proposed guidelines systematically on three common security applications and demonstrate that the choice of sandbox can significantly impact the results. Specifically, our results show that the proposed guidelines increase the activity observable in the sandbox by at least 1.6x and up to 11.3x. Furthermore, we observe a roughly 25% improvement in accuracy, precision, and recall when the guidelines are applied to a malware family classification task. We conclude by affirming that there is no "silver bullet" sandbox deployment that generalizes, and we recommend that users apply our framework to define a scope for their analysis, define a threat model, and establish how the sandbox artifacts will influence their intended use case. Finally, it is important that users document their experiments, limitations, and potential solutions for reproducibility.