Network traffic monitoring based on IP flows is a common way to address the current challenge of analyzing encrypted network communication. However, aggregating packets into flow records inevitably loses information. This paper therefore proposes a novel flow extension with traffic features derived from time series analysis of the Single Flow Time series, i.e., the time series formed by the number of bytes in each packet and the packet's timestamp. We propose 69 universal features based on statistical analysis of the data points, time domain analysis, packet distribution within the flow timespan, time series behavior, and frequency domain analysis. We demonstrate the usability and universality of the proposed feature vector on various network traffic classification tasks using 15 well-known, publicly available datasets. Our evaluation shows that the novel feature vector achieves classification performance similar to or better than that of related works on both binary and multiclass classification tasks. In more than half of the evaluated tasks, classification performance improved by up to 5%.
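
To make the notion of a Single Flow Time series concrete, the following minimal Python sketch builds one from (timestamp, packet size) pairs and computes a handful of simple statistical and time-domain values. The function names, feature names, and sample packets are hypothetical and chosen only for illustration; they are not the 69 features proposed in the paper, and the frequency-domain part is omitted entirely.

```python
from statistics import mean, pstdev


def single_flow_time_series(packets):
    """Order (timestamp_seconds, size_bytes) pairs by time to form the series."""
    return sorted(packets, key=lambda p: p[0])


def sketch_features(series):
    """Compute a few illustrative features (hypothetical names, not the paper's)."""
    if not series:
        raise ValueError("a flow must contain at least one packet")
    times = [t for t, _ in series]
    sizes = [b for _, b in series]
    # Inter-packet gaps: differences between consecutive timestamps.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packet_count": len(series),
        "total_bytes": sum(sizes),
        "mean_size": mean(sizes),
        "std_size": pstdev(sizes),  # population standard deviation
        "duration": times[-1] - times[0],
        "mean_gap": mean(gaps) if gaps else 0.0,
    }


# Made-up packets of one flow, deliberately out of order.
packets = [(0.0, 100), (0.5, 300), (0.25, 200), (1.0, 400)]
features = sketch_features(single_flow_time_series(packets))
```

A conventional flow record would typically retain only aggregates such as `packet_count`, `total_bytes`, and `duration`; values like `std_size` and `mean_gap` depend on the per-packet series, which is the kind of information the abstract says aggregation loses.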