A webshell is a type of backdoor, and web applications are widely exposed to webshell injection attacks, which makes webshell detection an important research problem. In this study, we propose a webshell detection method for PHP. We first compile PHP source code to opcodes and extract Opcode Double-Tuples (ODTs) from the opcode sequence. We then combine CodeBERT and FastText models for feature representation and classification. Because deep learning methods have difficulty with long webshell files, we introduce a sliding window attention mechanism that captures malicious behavior spread across a long file. Experimental results show that our method achieves high accuracy in webshell detection and addresses a weakness of traditional methods, which struggle with new webshell variants and anti-detection techniques.
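The two preprocessing ideas named above can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: an ODT is taken to be a pair of adjacent opcodes, and the sliding window is shown only as overlapping chunking of the ODT sequence, not as the attention mechanism itself. The function names `extract_odts` and `sliding_windows`, the `size` and `stride` parameters, and the opcode trace are all hypothetical.

```python
from typing import List, Tuple


def extract_odts(opcodes: List[str]) -> List[Tuple[str, str]]:
    """Pair each opcode with its successor (assumed ODT definition)."""
    return list(zip(opcodes, opcodes[1:]))


def sliding_windows(tokens: List[str], size: int, stride: int) -> List[List[str]]:
    """Split a sequence into overlapping windows.

    Every token is covered as long as stride <= size; the last
    window may be shorter than `size`.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if len(tokens) <= size:
        return [tokens[:]]
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
        start += stride
    return windows


# Hypothetical opcode trace for a short PHP script (illustrative only).
opcodes = ["ASSIGN", "FETCH_R", "INIT_FCALL", "SEND_VAR", "DO_FCALL", "RETURN"]

odts = extract_odts(opcodes)
# Join each pair into one token so it can be fed to an embedding model.
tokens = [f"{first}|{second}" for first, second in odts]
windows = sliding_windows(tokens, size=3, stride=2)

for window in windows:
    print(window)
```

In a full pipeline each window would be encoded separately and the per-window results aggregated, so a malicious fragment deep inside a long file still falls within some window. How the paper aggregates windows is not stated in the abstract.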