We consider a statistical model for symmetric matrix factorization with additive Gaussian noise in the high-dimensional regime, where the rank $M$ of the signal matrix to be inferred grows with its size $N$ as $M=\mathrm{o}(\sqrt{\ln N})$. Allowing for an $N$-dependent rank poses new challenges and requires new methods. Working in the Bayes-optimal setting, we show that whenever the signal has i.i.d. entries, the limiting mutual information between signal and data is given by a variational formula involving a rank-one replica symmetric potential. In other words, from the information-theoretic perspective, the case of a (slowly) growing rank is equivalent to the case $M=1$ (namely, the standard spiked Wigner model). The proof is primarily based on a novel multiscale cavity method that allows for growing rank, combined with information-theoretic identities on the worst-case noise for the vector Gaussian channel. We believe that the cavity method developed here will play a role in the analysis of a broader class of inference and spin models where the degrees of freedom are large arrays instead of vectors.
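To make the variational formula concrete, the following minimal sketch evaluates a rank-one replica symmetric potential numerically. The abstract does not fix a normalization or a prior, so everything specific here is an assumption: we take the common spiked Wigner convention $Y=\sqrt{\lambda/N}\,XX^\top+Z$, a Gaussian prior $\mathcal{N}(0,\rho)$ on the signal entries, and the potential $\frac{\lambda}{4}(q-\rho)^2 + I(X_0;\sqrt{\lambda q}\,X_0+Z_0)$ minimized over $q\in[0,\rho]$. The names `rs_potential` and `minimize_rs_potential` and the grid search are illustrative choices, not the paper's method.

```python
import math


def rs_potential(q, lam, rho=1.0):
    """Rank-one replica symmetric potential, ASSUMING a Gaussian prior N(0, rho).

    For a Gaussian input of variance rho sent through a scalar Gaussian channel
    with signal-to-noise ratio lam * q, the mutual information has the closed
    form 0.5 * ln(1 + lam * q * rho).
    """
    scalar_channel_mi = 0.5 * math.log1p(lam * q * rho)
    return 0.25 * lam * (q - rho) ** 2 + scalar_channel_mi


def minimize_rs_potential(lam, rho=1.0, grid=1000):
    """Grid search for the minimizer of the potential over q in [0, rho]."""
    best_q = 0.0
    best_val = rs_potential(0.0, lam, rho)
    for k in range(1, grid + 1):
        q = rho * k / grid
        val = rs_potential(q, lam, rho)
        if val < best_val:  # strict comparison keeps the smallest minimizer
            best_q, best_val = q, val
    return best_q, best_val


if __name__ == "__main__":
    for lam in (0.5, 2.0):
        q_star, value = minimize_rs_potential(lam)
        print(f"lambda={lam}: q*={q_star:.3f}, potential={value:.5f}")
```

Under these assumptions the minimizer is $q^*=0$ for $\lambda\le 1$ and $q^*=\rho-1/(\lambda\rho)$ otherwise (for $\rho=1$, $q^*=1-1/\lambda$), which reflects the familiar phase transition of the Gaussian spiked Wigner model at $\lambda=1$. Other i.i.d. priors would require replacing the closed-form scalar mutual information with a numerical integral.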