The cyber terrain comprises devices, network services, cyber personas, and other network entities involved in network operations. Designing a method that automatically identifies the network entities that are key to network operations is challenging. However, such a method is essential for determining which cyber assets cyber defense should focus on. In this paper, we propose an approach for classifying IP addresses that belong to cyber key terrain according to their network position, using a PageRank centrality computation adjusted by machine learning. We used hill climbing and random walk algorithms to differentiate PageRank's damping factors according to the source and destination ports captured in IP flows. A one-time learning phase on a static data sample enables near-real-time, stream-based classification of key hosts from IP flow data under operational conditions, without maintaining a complete network graph. We evaluated the approach on a dataset from a cyber defense exercise and on data from a campus network. The results show that identifying cyber key terrain with the adjusted centrality computation is more precise than with the unadjusted PageRank computation.
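To make the idea of port-dependent damping factors concrete, the following Python sketch shows one way such a computation could look. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function names `port_pagerank` and `hill_climb`, the use of the destination port only, the default damping of 0.85, and the objective (total rank mass on a labeled set of key hosts) are all hypothetical choices made for this example, and the sketch builds a small in-memory edge list rather than processing a stream.

```python
import random
from collections import defaultdict

DEFAULT_DAMPING = 0.85  # assumption: conventional PageRank default


def port_pagerank(flows, damping, iterations=50):
    """Power-iteration PageRank where each flow's damping depends on its port.

    flows:   iterable of (src_ip, dst_ip, dst_port) tuples (hypothetical format)
    damping: dict mapping port -> damping factor in [0, 1]
    """
    flows = list(flows)
    nodes = sorted({ip for src, dst, _ in flows for ip in (src, dst)})
    if not nodes:
        return {}
    out_degree = defaultdict(int)
    for src, _, _ in flows:
        out_degree[src] += 1

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: 0.0 for n in nodes}
        for src, dst, port in flows:
            d = damping.get(port, DEFAULT_DAMPING)
            new_rank[dst] += d * rank[src] / out_degree[src]
        # Rank mass not passed along edges (damping loss, hosts with no
        # outgoing flows) is redistributed uniformly, so ranks sum to 1.
        leaked = 1.0 - sum(new_rank.values())
        for n in nodes:
            new_rank[n] += leaked / len(nodes)
        rank = new_rank
    return rank


def hill_climb(flows, key_hosts, ports, steps=200, step_size=0.05, seed=0):
    """Tune per-port damping so that labeled key hosts receive more rank.

    Assumed objective: sum of the ranks of the hosts in key_hosts.
    """
    flows = list(flows)
    ports = list(ports)
    rng = random.Random(seed)
    damping = {p: DEFAULT_DAMPING for p in ports}

    def score(d):
        ranks = port_pagerank(flows, d)
        return sum(ranks.get(h, 0.0) for h in key_hosts)

    best = score(damping)
    for _ in range(steps):
        port = rng.choice(ports)
        candidate = dict(damping)
        delta = rng.choice((-step_size, step_size))
        candidate[port] = min(1.0, max(0.0, candidate[port] + delta))
        s = score(candidate)
        if s > best:  # keep only strictly improving moves
            damping, best = candidate, s
    return damping, best


if __name__ == "__main__":
    # Toy flows: two clients contact a DNS server (port 53) and another host (port 6667).
    flows = [
        ("10.0.0.1", "10.0.0.9", 53),
        ("10.0.0.2", "10.0.0.9", 53),
        ("10.0.0.1", "10.0.0.8", 6667),
        ("10.0.0.2", "10.0.0.8", 6667),
    ]
    tuned, best = hill_climb(flows, key_hosts={"10.0.0.9"}, ports=[53, 6667])
    print(tuned, best)
```

In this toy setting, the learning step runs once on labeled data, after which the tuned damping factors can be reused for ranking new flows. A random walk search would differ only in accepting non-improving moves while remembering the best configuration seen.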