Traffic matrix measurement is fundamental to data center operations, but obtaining complete traffic matrices at scale remains challenging because global fine-grained measurement is prohibitively expensive and network faults leave observations incomplete. Existing matrix completion methods, which reduce measurement cost by recovering missing entries from partial observations, achieve satisfactory performance in specific scenarios, but their reliance on restrictive assumptions or black-box mappings leaves them without interpretability and unable to characterize uncertainty. In this paper, we propose Utimac, an uncertainty-aware traffic matrix completion method for data center networks. Our analysis shows that, within a locally stationary window, log-domain traffic can be decomposed into a principal statistical component and a sparse deviation component. Based on this insight, we formulate traffic matrix completion as a parameter inference problem: multiple partially observed frames within a window are used to infer shared parameters and recover missing entries. To avoid the intractability and boundary degeneracy of the original integral-form marginal likelihood, we construct a regularized surrogate objective and solve the resulting joint optimization problem with block coordinate descent. Utimac consistently outperforms all baselines on data center network datasets in both overall and burst scenarios, and its advantage becomes more pronounced as observations grow sparser. All code is publicly available in an anonymous repository: https://anonymous.4open.science/r/Utimac-0551/
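The decomposition and block coordinate descent described in the abstract can be illustrated with a deliberately simplified sketch. This is not Utimac's surrogate objective, which the abstract does not spell out; it assumes a plain masked least-squares loss with an L1 penalty on the deviations, a per-entry shared log-mean as the "principal" component, and hypothetical names (`complete_window`, `soft_threshold`, `lam`). It also produces point estimates only and does not model uncertainty.

```python
import numpy as np


def soft_threshold(x, lam):
    """Proximal operator of the L1 norm: shrink values toward zero by lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def complete_window(frames, masks, lam=0.5, n_iters=50):
    """Fill missing entries of the traffic frames in one window.

    frames: (T, N, N) array of non-negative traffic volumes.
    masks:  (T, N, N) boolean array, True where an entry was observed.
    Assumed model (illustrative only): log1p(frame_t) = mu + sparse_t + noise,
    with mu shared across the window and sparse_t mostly zero.
    """
    # Work in the log domain; unobserved entries are zeroed and never used.
    log_y = np.where(masks, np.log1p(frames), 0.0)
    counts = masks.sum(axis=0)          # observations per matrix entry
    seen = counts > 0                   # entries observed at least once
    sparse = np.zeros_like(log_y)
    mu = np.zeros(frames.shape[1:])

    for _ in range(n_iters):
        # Block 1: with deviations fixed, the shared component is the
        # per-entry mean of the observed residuals.
        resid_sum = ((log_y - sparse) * masks).sum(axis=0)
        mu = np.where(seen, resid_sum / np.maximum(counts, 1), 0.0)
        # Block 2: with mu fixed, deviations are a soft-thresholded residual
        # on observed entries and zero elsewhere.
        sparse = soft_threshold(log_y - mu, lam) * masks

    # Entries never observed in the window fall back to the average level.
    if seen.any():
        mu = np.where(seen, mu, mu[seen].mean())

    # Keep observed values; fill the rest from the shared component.
    completed = np.where(masks, frames, np.expm1(mu))
    return completed, mu, sparse
```

Each block update has a closed form here, which is what makes block coordinate descent convenient for this kind of joint objective; a larger `lam` pushes more of each frame's variation into the shared component and fewer entries into the sparse deviations.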