Defending community-owned cyberspace requires community-based efforts. Large-scale network observations that uphold the highest regard for privacy are key to protecting our shared cyberspace. Deploying the necessary network sensors requires careful sensor placement, focusing, and calibration with significant volumes of network observations. This paper demonstrates novel focusing and calibration procedures on a multi-billion-packet dataset using high-performance GraphBLAS anonymized hypersparse matrices. The run-time performance on a real-world dataset confirms previously observed real-time processing rates for high-bandwidth links while achieving significant data compression. The output of the analysis demonstrates the effectiveness of these procedures at focusing the traffic matrix and revealing the underlying stable heavy-tail statistical distributions that are necessary for anomaly detection. A simple model of the corresponding probability of detection ($p_{\rm d}$) and probability of false alarm ($p_{\rm fa}$) for these distributions highlights the criticality of network sensor focusing and calibration. Once a sensor is properly focused and calibrated, it can carry out two of the central tenets of good cybersecurity: (1) continuous observation of the network and (2) minimizing unbrokered network connections.
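To make the idea of an anonymized traffic matrix concrete, the following is a minimal sketch in plain Python, not the paper's GraphBLAS implementation. It assumes packets arrive as (source, destination) address pairs, substitutes a keyed hash for whatever anonymization scheme the authors use, stores only the nonzero matrix entries in a dictionary, and bins per-source packet counts by powers of two, which is one common way to view a heavy-tail distribution. The function names, the salt, and the sample addresses are all hypothetical.

```python
import hashlib
from collections import Counter


def anonymize(addr: str, salt: bytes) -> int:
    """Map an address to a 64-bit integer with a keyed hash (assumed scheme)."""
    digest = hashlib.blake2b(addr.encode(), key=salt, digest_size=8).digest()
    return int.from_bytes(digest, "big")


def traffic_matrix(packets, salt: bytes) -> Counter:
    """Count packets per (anonymized source, anonymized destination) pair.

    Only nonzero entries are stored, so the dictionary behaves like a
    hypersparse matrix over a 2**64 by 2**64 index space.
    """
    matrix = Counter()
    for src, dst in packets:
        matrix[(anonymize(src, salt), anonymize(dst, salt))] += 1
    return matrix


def source_packets(matrix: Counter) -> Counter:
    """Sum each row of the matrix: total packets sent by each source."""
    totals = Counter()
    for (src, _dst), count in matrix.items():
        totals[src] += count
    return totals


def log2_binned_histogram(values) -> dict:
    """Histogram of positive integers where bin k holds 2**k <= v < 2**(k+1)."""
    hist = Counter()
    for v in values:
        hist[v.bit_length() - 1] += 1
    return dict(sorted(hist.items()))


# Hypothetical sample traffic: three sources sending 5, 2, and 1 packets.
packets = (
    [("10.0.0.1", "10.0.0.9")] * 5
    + [("10.0.0.2", "10.0.0.9")] * 2
    + [("10.0.0.3", "10.0.0.8")]
)
A = traffic_matrix(packets, salt=b"demo-salt")
hist = log2_binned_histogram(source_packets(A).values())
```

In this toy input the eight packets collapse to three stored matrix entries, a small-scale picture of the compression that comes from keeping only nonzero links. A production pipeline would replace the dictionary with GraphBLAS hypersparse matrices, as the abstract describes.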