On the phase diagram of extensive-rank symmetric matrix denoising beyond rotational invariance

Matrix denoising is central to signal processing and machine learning. When the matrix to infer has a factorised structure with a rank growing proportionally to its dimension, its analysis remains a challenge except when the matrix is rotationally invariant. In that case the information-theoretic limits and a Bayes-optimal denoising algorithm, called the rotationally invariant estimator [1,2], are known. Beyond this setting few results are available. The reason is that the model is neither a usual spin system, because of the growing rank, nor a matrix model, because of the lack of rotational symmetry, but rather a hybrid of the two. In this paper we make progress towards understanding Bayesian matrix denoising when the hidden signal is a factored matrix $XX^\intercal$ that is not rotationally invariant. Monte Carlo simulations suggest the existence of a denoising-factorisation transition. It separates a phase where denoising with the rotationally invariant estimator remains Bayes-optimal, owing to universality properties of the same nature as in random matrix theory, from a phase where universality breaks down and better denoising is possible, though algorithmically hard, by exploiting the signal's prior and factorised structure. We also argue that it is only beyond the transition that factorisation, i.e., estimating $X$ itself, becomes possible up to sign and permutation ambiguities. On the theoretical side, we combine mean-field techniques in an interpretable multiscale fashion in order to access the minimum mean-square error and mutual information. Interestingly, our alternative method yields equations that can be reproduced using the replica approach of [3]. Using numerical insights, we then delimit the portion of the phase diagram where this mean-field theory is reliable, and correct it using universality where it is not. Our ansatz matches the numerics well once finite-size effects are accounted for.
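To make the setting concrete, the following is a minimal numerical sketch rather than the paper's own code. Everything specific in it is an assumption made for illustration: the Rademacher prior on $X$, the Gaussian Wigner noise, the normalisations, the signal-to-noise parameter `snr`, and all function names. It builds a noisy observation of $XX^\intercal$ and applies an oracle denoiser that keeps the eigenvectors of the observation and picks the best eigenvalues given the true signal. This oracle is only a benchmark for estimators that share the eigenvectors of the data; the rotationally invariant estimator of [1,2] has to estimate these eigenvalues from the observation alone.

```python
import numpy as np


def simulate(n=200, m=100, snr=2.0, seed=0):
    """Draw a factorised signal and a noisy observation (all choices are assumptions)."""
    rng = np.random.default_rng(seed)
    # Hypothetical non-rotationally-invariant prior: i.i.d. Rademacher entries.
    X = rng.choice([-1.0, 1.0], size=(n, m))
    # Assumed normalisation keeping the off-diagonal entries of S of order 1/sqrt(n).
    S = X @ X.T / np.sqrt(n * m)
    # Symmetric Gaussian (Wigner) noise with spectrum of order one.
    G = rng.standard_normal((n, n))
    Z = (G + G.T) / np.sqrt(2 * n)
    Y = np.sqrt(snr) * S + Z
    return S, Y


def oracle_eigenvector_denoiser(Y, S):
    """Keep the eigenvectors of Y and use the oracle eigenvalues xi_i = v_i^T S v_i.

    For fixed eigenvectors this choice minimises the Frobenius error to S,
    but it needs the true S, so it is a benchmark and not a usable algorithm.
    """
    _, V = np.linalg.eigh(Y)
    xi = np.einsum("ij,ij->j", V, S @ V)  # xi[j] = V[:, j]^T S V[:, j]
    return (V * xi) @ V.T                 # V diag(xi) V^T


def mse(A, B):
    """Entrywise mean-square error between two matrices."""
    return float(np.mean((A - B) ** 2))


if __name__ == "__main__":
    snr = 2.0
    S, Y = simulate(snr=snr)
    print("rescaled observation:", mse(Y / np.sqrt(snr), S))
    print("oracle denoiser:     ", mse(oracle_eigenvector_denoiser(Y, S), S))
```

Because the rescaled observation `Y / sqrt(snr)` has the same eigenvectors as `Y`, the oracle error can never exceed the error of the rescaled observation. In the phase where universality breaks down, estimators that exploit the prior on $X$ can in principle go below the error of any estimator constrained to the eigenvectors of the data.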
