We observed the Array Canary, a novel JavaScript anti-analysis technique currently used in the wild by the Phishing-as-a-Service framework Darcula. The Array Canary appears to be an advanced form of the array shuffling techniques employed by the Emotet JavaScript downloader. In practice, a series of Array Canaries is set within a string array; if they are modified, the program loops endlessly. In this paper, we demonstrate how an Array Canary works and discuss Autonomous Function Call Resolution (AFCR), a method we created to bypass Array Canaries. We also introduce Arphsy, a proof-of-concept for AFCR designed to guide Large Language Models and security researchers in the deobfuscation of "canaried" JavaScript code. We accomplish this by (i) finding and extracting all Immediately Invoked Function Expressions from a canaried file, (ii) parsing the file's Abstract Syntax Tree for any function that does not implement imported function calls, (iii) identifying the most reassigned variable and its corresponding function body, (iv) calculating the length of the largest string array and using it to determine the offset values within the canaried file, (v) aggregating all the previously identified functions into a single file, and (vi) appending driver code to the verified file and using it to deobfuscate the canaried file.
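To make the looping behaviour concrete, the following is a minimal sketch of a canary-style guard on a rotated string array. It is an illustration only, assuming a simple rotate-until-checksum-matches scheme; it is not code from Darcula or Arphsy. The function name `rotateUntilMatch`, the sample strings, the target value 33, and the `maxSpins` cap are all hypothetical. The cap exists only so the demonstration terminates; an obfuscated sample would loop unconditionally.

```javascript
// Hypothetical sketch, not taken from Darcula or Arphsy.
// Rotates `arr` left until a checksum over two fixed positions equals `target`.
// `maxSpins` is a safety cap added for this demo; real samples would use while (true).
function rotateUntilMatch(arr, target, maxSpins) {
  for (let spins = 0; spins < maxSpins; spins++) {
    // parseInt reads the leading digits of the canary strings;
    // ordinary strings yield NaN, so the comparison below fails.
    const checksum = parseInt(arr[0], 10) + parseInt(arr[2], 10) * 2;
    if (checksum === target) return spins; // array is now in its intended order
    arr.push(arr.shift()); // rotate left by one element
  }
  return -1; // never converged: the uncapped loop would spin forever
}

// "7abc" and "13xyz" play the role of canaries: 7 + 13 * 2 = 33.
const intact = ["log", "7abc", "hello", "13xyz", "world"];
// An analyst renames one canary while cleaning up the array.
const tampered = ["log", "7abc", "hello", "renamed", "world"];

const intactSpins = rotateUntilMatch(intact, 33, 50);
const tamperedSpins = rotateUntilMatch(tampered, 33, 50);

console.log(intactSpins, tamperedSpins); // prints: 1 -1
```

With the untouched array the guard settles after one rotation, whereas the edited array never satisfies the checksum. Under these assumptions, this is why executing the sample's own functions unmodified is attractive compared with editing the array by hand.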