Predicting web service traffic has significant social value, as it can be applied to various practical scenarios, including but not limited to dynamic resource scaling, load balancing, system anomaly detection, service-level agreement compliance, and fraud detection. Web service traffic is characterized by frequent and drastic fluctuations over time and is influenced by heterogeneous web user behaviors, making accurate prediction a challenging task. Previous research has extensively explored statistical approaches and neural networks that mine features from preceding service traffic time series for prediction. However, these methods have largely overlooked the causal relationships between services. Drawing inspiration from causality in ecological systems, we empirically identify causal relationships between web services. To leverage these relationships for improved web service traffic prediction, we propose an effective neural network module, CCMPlus, designed to extract causal relationship features across services. This module can be seamlessly integrated with existing time series models to consistently enhance the performance of web service traffic prediction. We theoretically justify that the causal correlation matrix generated by the CCMPlus module captures causal relationships among services. Empirical results on real-world datasets from Microsoft Azure, Alibaba Group, and Ant Group confirm that our method surpasses state-of-the-art approaches in Mean Squared Error (MSE) and Mean Absolute Error (MAE) for predicting service traffic time series. These findings highlight the efficacy of leveraging causal relationships for improved predictions.