Protocol Reverse Engineering (PRE) analyzes protocols by inferring their structure and behavior. However, current PRE methods focus mainly on field identification within a single protocol and neglect Protocol State Machine (PSM) analysis in mixed-protocol environments. As a result, abnormal protocol behavior and potential vulnerabilities, both crucial for detecting and defending against new attack patterns, are insufficiently analyzed. To address these challenges, we propose an automatic PSM inference framework for unknown protocols. The framework first applies a fuzzy membership-based auto-converging DBSCAN algorithm to cluster protocol formats, then uses a session clustering algorithm based on the Needleman-Wunsch and K-Medoids algorithms to classify sessions by protocol type. Finally, we refine a probabilistic PSM algorithm to infer protocol states and the transition conditions between them. Experimental results show that, compared with existing PRE techniques, our method can infer PSMs while classifying protocols more precisely.
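The session clustering step can be illustrated with a minimal sketch. It assumes each session has already been reduced to a sequence of message-type labels (the hypothetical output of the format clustering stage) and uses a textbook Needleman-Wunsch score with a plain alternating K-Medoids loop. The scoring values, the distance normalization, the farthest-first initialization, and all function names are illustrative assumptions, not the paper's actual algorithm or parameters.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of two sequences (Needleman-Wunsch)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def session_distance(a, b):
    """Assumed normalization: map the score to [0, 1], 0 for identical sessions.

    Valid for the default scoring (match=1, mismatch=gap=-1), where the
    score always lies between -max(len) and +max(len).
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return (longest - nw_score(a, b)) / (2 * longest)


def assign(d, medoids):
    """Label every item with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: d[i][medoids[c]])
            for i in range(len(d))]


def k_medoids(items, k, dist, max_iter=100):
    """Alternating K-Medoids on a precomputed pairwise distance matrix."""
    n = len(items)
    d = [[dist(items[i], items[j]) for j in range(n)] for i in range(n)]
    # Deterministic farthest-first initialization (an illustrative choice).
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n),
                           key=lambda i: min(d[i][m] for m in medoids)))
    for _ in range(max_iter):
        labels = assign(d, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(members,
                                   key=lambda i: sum(d[i][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(d, medoids), medoids


# Hypothetical sessions from two unknown protocols, as message-type labels.
sessions = [
    ["A", "B", "C", "D"],
    ["A", "B", "C", "C", "D"],
    ["A", "B", "D"],
    ["X", "Y", "X", "Y"],
    ["X", "Y", "Y"],
    ["X", "Y", "X", "Y", "X"],
]
labels, medoids = k_medoids(sessions, k=2, dist=session_distance)
print(labels, medoids)
```

In this toy input the first three sessions share one label and the last three share the other, because sessions of the same hypothetical protocol align with few gaps while sessions of different protocols align only through mismatches. K-Medoids suits this setting because it needs only pairwise distances, so an alignment-based distance can be used without embedding sessions in a vector space.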