On the phase diagram of extensive-rank symmetric matrix denoising beyond rotational invariance

Matrix denoising is central to signal processing and machine learning. Its statistical analysis when the matrix to infer has a factorised structure with a rank growing proportionally to its dimension remains a challenge, except when the matrix is rotationally invariant. In that case the information-theoretic limits and an efficient Bayes-optimal denoising algorithm, called the rotational invariant estimator [1,2], are known. Beyond this setting few results can be found. The reason is that the model is not a usual spin system, because of the growing rank, nor a matrix model (as appearing in high-energy physics), due to the lack of rotation symmetry, but rather a hybrid between the two. Here we make progress towards the understanding of Bayesian matrix denoising when the signal is a factored matrix $XX^\intercal$ that is not rotationally invariant. Monte Carlo simulations suggest the existence of a \emph{denoising-factorisation transition} separating two phases. In the first, denoising using the rotational invariant estimator remains Bayes-optimal due to universality properties of the same nature as in random matrix theory. In the second, universality breaks down and better denoising is possible, though algorithmically hard. We argue that it is only beyond the transition that factorisation, i.e., estimating $X$ itself, becomes possible up to irresolvable ambiguities. On the theory side, we combine mean-field techniques in an interpretable multiscale fashion in order to access the minimum mean-square error and mutual information. Interestingly, our alternative method yields equations reproducible by the replica approach of [3]. Using numerical insights, we delimit the portion of the phase diagram where we conjecture the mean-field theory to be exact, and correct it using universality where it is not. Once finite-size effects are taken into account, our complete ansatz matches the numerics well across the whole phase diagram.

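To make the setting concrete, here is a minimal numerical sketch of an extensive-rank denoising instance with a factored signal $XX^\intercal$. Several choices are assumptions made for illustration only and are not taken from the abstract: the Rademacher prior on the entries of $X$ (one example of a prior that is not rotationally invariant), the $1/\sqrt{N}$ scaling, the symmetric Gaussian noise, and the names `make_instance` and `oracle_rie`. The estimator shown is an oracle benchmark that keeps the eigenvectors of the observation and picks the best eigenvalues using the ground truth; it is not the data-driven rotational invariant estimator of [1,2], which has to estimate those eigenvalues from the observation alone.

```python
import numpy as np


def make_instance(n=200, alpha=0.5, snr=2.0, seed=0):
    """Draw a signal T = sqrt(snr) * X X^T / sqrt(n) and a noisy observation Y = T + Z.

    Assumed for illustration: Rademacher entries for X, rank m = alpha * n,
    and symmetric Gaussian noise Z with unit-variance off-diagonal entries.
    """
    rng = np.random.default_rng(seed)
    m = int(alpha * n)                          # rank proportional to the dimension
    X = rng.choice([-1.0, 1.0], size=(n, m))    # prior that is not rotationally invariant
    T = np.sqrt(snr) * (X @ X.T) / np.sqrt(n)   # factored signal to be recovered
    G = rng.standard_normal((n, n))
    Z = (G + G.T) / np.sqrt(2.0)                # symmetric Gaussian noise matrix
    return T, T + Z


def oracle_rie(Y, T):
    """Keep the eigenvectors of Y and replace its eigenvalues by the
    Frobenius-optimal ones, xi_i = v_i^T T v_i (requires the ground truth T)."""
    _, V = np.linalg.eigh(Y)                    # columns of V are eigenvectors of Y
    xi = np.sum(V * (T @ V), axis=0)            # diagonal of V^T T V
    return (V * xi) @ V.T                       # V diag(xi) V^T


def mse(A, B):
    """Mean-square error per matrix entry."""
    return float(np.mean((A - B) ** 2))


if __name__ == "__main__":
    T, Y = make_instance()
    T_hat = oracle_rie(Y, T)
    print("MSE of raw observation:", mse(Y, T))
    print("MSE of oracle RIE     :", mse(T_hat, T))
```

Because the raw observation shares its eigenvectors with the oracle estimate, the oracle error can never exceed the error of the observation itself. Any estimator restricted to the eigenvectors of the observation is bounded below by this oracle, which is why improving on it requires exploiting the non-rotationally-invariant structure of $X$.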