Man-at-the-end (MATE) attackers have full control over the system on which the attacked software runs, and try to break the confidentiality or integrity of assets embedded in that software. Both companies and malware authors want to prevent such attacks. This has driven an arms race between attackers and defenders, resulting in a plethora of protection and analysis methods. However, measuring the strength of protections remains difficult, because MATE attackers can reach their goals in many different ways and no universally accepted evaluation methodology exists. This survey systematically reviews the evaluation methodologies of papers on obfuscation, a major class of protections against MATE attacks. For 571 papers, we collected 113 aspects of their evaluation methodologies, ranging from sample set types and sizes, through sample treatment, to the measurements performed. We provide detailed insights into how the academic state of the art evaluates both protections and the analyses applied to them. In summary, there is a clear need for better evaluation methodologies. We identify nine challenges for software protection evaluations, which represent threats to the validity, reproducibility, and interpretation of research results in the context of MATE attacks, and we formulate a number of concrete recommendations for improving the evaluations reported in future research papers.