PowerPeeler: A Precise and General Dynamic Deobfuscation Method for PowerShell Scripts

PowerShell is a powerful and versatile task automation tool. Unfortunately, it is also widely abused by cyber attackers. To bypass malware detection and hinder threat analysis, attackers often employ diverse techniques to obfuscate malicious PowerShell scripts. Existing deobfuscation tools are limited by their reliance on static analysis, which cannot accurately simulate the real deobfuscation process. In this paper, we propose PowerPeeler. To the best of our knowledge, it is the first dynamic PowerShell script deobfuscation approach that operates at the instruction level. It uses expression-related Abstract Syntax Tree (AST) nodes to identify potentially obfuscated script pieces. PowerPeeler then correlates these AST nodes with their corresponding instructions and monitors the script's entire execution process. During execution, it dynamically tracks these instructions and records their results. Finally, PowerPeeler stringifies the results, substitutes them for the corresponding obfuscated script pieces, and reconstructs the deobfuscated script. To evaluate the effectiveness of PowerPeeler, we collect 1,736,669 real-world malicious PowerShell samples that use diverse obfuscation methods. We compare PowerPeeler with five state-of-the-art deobfuscation tools and GPT-4. The evaluation results demonstrate that PowerPeeler can effectively handle all well-known obfuscation methods. Additionally, the deobfuscation correctness rate of PowerPeeler reaches 95%, significantly surpassing that of the other tools. PowerPeeler not only recovers the largest amount of sensitive data but also maintains semantic consistency above 97%, the best among the compared tools. Moreover, PowerPeeler obtains the largest number of valid deobfuscated results within a limited time frame. Furthermore, PowerPeeler is extensible and can serve as a helpful tool for other cyber security solutions.
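The four steps described above (select expression nodes, tie them to execution, record their results, substitute the stringified results) can be illustrated with a small analogy. The sketch below is not PowerPeeler's implementation: it works on Python source with Python's standard `ast` module rather than on PowerShell instructions, and it records values by wrapping expressions in a hook call instead of tracking interpreter instructions. The names `_record`, `Instrument`, `Rebuild`, `deobfuscate`, and `SAMPLE` are hypothetical and chosen only for this illustration. Because the sketch really executes the input with `exec`, it should only ever be run on trusted toy input.

```python
import ast

RECORDER = "_record"  # hypothetical hook name injected into the script


class Instrument(ast.NodeTransformer):
    """Wrap candidate expression nodes in a call to the recording hook."""

    def __init__(self):
        self.next_id = 0

    def _wrap(self, node):
        self.generic_visit(node)  # instrument nested expressions first
        call = ast.Call(
            func=ast.Name(id=RECORDER, ctx=ast.Load()),
            args=[ast.Constant(self.next_id), node],
            keywords=[],
        )
        self.next_id += 1
        return ast.copy_location(call, node)

    # Assumption: only binary operations and calls count as candidates.
    visit_BinOp = _wrap
    visit_Call = _wrap


class Rebuild(ast.NodeTransformer):
    """Replace hook calls by recorded constants, or unwrap them again."""

    def __init__(self, results):
        self.results = results

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name) and node.func.id == RECORDER:
            idx = node.args[0].value
            if idx in self.results:
                return ast.copy_location(ast.Constant(self.results[idx]), node)
            return self.visit(node.args[1])  # keep the original expression
        return self.generic_visit(node)


def deobfuscate(source: str) -> str:
    tree = Instrument().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    seen, unstable = {}, set()

    def record(idx, value):
        # Keep only simple values that are identical on every execution.
        if not isinstance(value, (str, int)) or seen.setdefault(idx, value) != value:
            unstable.add(idx)
        return value

    # Runs the instrumented script: trusted toy input only.
    exec(compile(tree, "<sample>", "exec"), {RECORDER: record})
    results = {i: v for i, v in seen.items() if i not in unstable}
    return ast.unparse(Rebuild(results).visit(tree))


SAMPLE = (
    'cmd = "".join(["Wri", "te"]) + "-" + "tsoH"[::-1]\n'
    'args = cmd + " " + str(6 * 7)\n'
)

if __name__ == "__main__":
    print(deobfuscate(SAMPLE))
```

Running the sketch on `SAMPLE` collapses each assignment's right-hand side into a single string literal, while an expression that yields different values across loop iterations is left untouched. Replacing only values that are stable across executions is a design choice of this sketch, not something stated in the abstract. `ast.unparse` requires Python 3.9 or later.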
