Audit logs of system-level events are frequently used for behavior modeling because they provide detailed insight into cyber-threat occurrences. However, mapping the low-level system events in audit logs to high-level behaviors remains a major challenge in identifying host contextual behavior for the purpose of detecting potential cyber threats, and approaches that rely on domain expert knowledge may be difficult to deploy in practice. This paper presents TapTree, an automated process-tree-based technique that extracts host behavior by compiling the semantic information of system events. After extracting behaviors as system-generated process trees, TapTree integrates event semantics as a representation of those behaviors. To further reduce the analyst's pattern-matching workload, TapTree aggregates semantically equivalent patterns and optimizes representative behaviors. In our evaluation on a recent benchmark audit log dataset (DARPA OpTC), TapTree employs tree pattern queries and sequential pattern mining to deduce the semantics of connected system events, achieving high accuracy in behavior abstraction and, subsequently, in Advanced Persistent Threat (APT) attack detection. Moreover, we illustrate how the baseline model can be updated incrementally online, allowing it to adapt to new log patterns over time.
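To make the idea of extracting process trees and aggregating equivalent patterns more concrete, the following is a minimal Python sketch. It is not TapTree's algorithm: it does not use tree pattern queries or sequential pattern mining, and it approximates "semantically equivalent" very loosely as "same process names, same set of event types, same tree shape". The event tuples, field layout, event-type names, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified audit events: (pid, parent_pid, process_name, event_type).
# Real audit records (e.g. in DARPA OpTC) carry many more fields.
EVENTS = [
    (100, 1, "explorer.exe", "PROCESS_CREATE"),
    (101, 100, "cmd.exe", "PROCESS_CREATE"),
    (101, 100, "cmd.exe", "FILE_READ"),
    (102, 101, "ping.exe", "FLOW_START"),
    (200, 1, "explorer.exe", "PROCESS_CREATE"),
    (201, 200, "cmd.exe", "FILE_READ"),
    (201, 200, "cmd.exe", "PROCESS_CREATE"),
    (202, 201, "ping.exe", "FLOW_START"),
    (300, 1, "winword.exe", "PROCESS_CREATE"),
    (301, 300, "powershell.exe", "FLOW_START"),
]


def build_forest(events):
    """Group events by process and link each process to its parent."""
    names, parents = {}, {}
    actions = defaultdict(set)
    children = defaultdict(set)
    for pid, ppid, name, event_type in events:
        names[pid] = name
        parents[pid] = ppid
        actions[pid].add(event_type)
    for pid, ppid in parents.items():
        if ppid in names:
            children[ppid].add(pid)
    # A root is a process whose parent never appears in the log.
    roots = [pid for pid, ppid in parents.items() if ppid not in names]
    return names, actions, children, roots


def signature(pid, names, actions, children):
    """Canonical string for the subtree rooted at pid.

    Event types and child subtrees are sorted, so trees that differ only
    in pids or in event ordering map to the same signature.
    """
    child_sigs = sorted(signature(c, names, actions, children) for c in children[pid])
    label = f"{names[pid]}[{','.join(sorted(actions[pid]))}]"
    return f"{label}({';'.join(child_sigs)})"


def aggregate_patterns(events):
    """Count how many process trees share each canonical signature."""
    names, actions, children, roots = build_forest(events)
    return Counter(signature(r, names, actions, children) for r in roots)


if __name__ == "__main__":
    for pattern, count in aggregate_patterns(EVENTS).most_common():
        print(count, pattern)
```

In this toy log, the two `explorer.exe` trees collapse into a single pattern even though their pids and event order differ, which is the kind of reduction in pattern-matching workload the abstract describes. A real system would need a much richer notion of event semantics than the string labels used here.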