PowerPeeler: A Precise and General Dynamic Deobfuscation Method for PowerShell Scripts

PowerShell is a powerful and versatile task automation tool. Unfortunately, it is also widely abused by cyber attackers. To bypass malware detection and hinder threat analysis, attackers often employ diverse techniques to obfuscate malicious PowerShell scripts. Existing deobfuscation tools are limited by their reliance on static analysis, which cannot accurately simulate the real deobfuscation process. In this paper, we propose PowerPeeler. To the best of our knowledge, it is the first dynamic PowerShell script deobfuscation approach at the instruction level. It uses expression-related Abstract Syntax Tree (AST) nodes to identify potentially obfuscated script pieces. PowerPeeler then correlates these AST nodes with their corresponding instructions and monitors the script's entire execution process. During execution, it dynamically tracks these instructions and records their results. Finally, PowerPeeler stringifies the results, substitutes them for the corresponding obfuscated script pieces, and reconstructs the deobfuscated script. To evaluate the effectiveness of PowerPeeler, we collect 1,736,669 real-world malicious PowerShell samples covering diverse obfuscation methods. We compare PowerPeeler with five state-of-the-art deobfuscation tools and GPT-4. The evaluation results demonstrate that PowerPeeler can effectively handle all well-known obfuscation methods. Additionally, the deobfuscation correctness rate of PowerPeeler reaches 95%, significantly surpassing that of the other tools. PowerPeeler not only recovers the largest amount of sensitive data but also maintains semantic consistency above 97%, the best among the compared tools. Moreover, PowerPeeler obtains the largest number of valid deobfuscated results within a limited time frame. Furthermore, PowerPeeler is extensible and can serve as a helpful building block for other cyber security solutions.
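
The pipeline described above (find expression nodes in the AST, obtain their runtime values, stringify the values, and substitute them back into the script) can be illustrated with a small toy. The sketch below is only an analogy and not PowerPeeler itself: it operates on Python source through Python's standard `ast` module rather than on PowerShell, and it evaluates each expression node directly instead of tracking engine instructions. The names `SAFE_CALLS`, `ExpressionPeeler`, and `deobfuscate` are hypothetical, and the whitelist of evaluable calls is an assumption made to keep the toy side-effect free. It requires Python 3.9 or later for `ast.unparse`.

```python
import ast

# Hypothetical whitelist of calls this toy is willing to execute.
SAFE_CALLS = {"chr": chr, "str": str}


def _is_recoverable(node):
    """True if the subtree uses only constants, operators, and whitelisted calls."""
    for child in ast.walk(node):
        if isinstance(child, ast.Call):
            func = child.func
            if not (isinstance(func, ast.Name) and func.id in SAFE_CALLS):
                return False
            if child.keywords:
                return False
        elif isinstance(child, ast.Name):
            if child.id not in SAFE_CALLS:
                return False
        elif not isinstance(
            child, (ast.Constant, ast.BinOp, ast.operator, ast.Load)
        ):
            return False
    return True


class ExpressionPeeler(ast.NodeTransformer):
    """Replace recoverable expression nodes with their stringified results."""

    def generic_visit(self, node):
        # Visit children first so inner pieces are recovered bottom-up.
        node = super().generic_visit(node)
        if isinstance(node, (ast.BinOp, ast.Call)) and _is_recoverable(node):
            try:
                code = compile(ast.Expression(body=node), "<piece>", "eval")
                value = eval(code, {"__builtins__": {}}, dict(SAFE_CALLS))
            except Exception:
                return node  # keep the original piece if evaluation fails
            # Substitute the recorded result for the obfuscated piece.
            return ast.copy_location(ast.Constant(value=value), node)
        return node


def deobfuscate(source):
    """Parse, recover expression results, and rebuild the source text."""
    tree = ExpressionPeeler().visit(ast.parse(source))
    return ast.unparse(tree)


if __name__ == "__main__":
    print(deobfuscate("cmd = 'Inv' + 'oke-' + chr(69) + 'xpression'"))
```

In this toy, pieces that cannot be evaluated safely (for example, a call to an unknown function) are left untouched, while any recoverable sub-expressions inside them are still replaced. A real dynamic deobfuscator would instead observe values as the script runs, which is what lets it handle pieces whose results depend on runtime state.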
