We consider a statistical model for symmetric matrix factorization with additive Gaussian noise in the high-dimensional regime where the rank $M$ of the signal matrix to infer scales with its size $N$ as $M = o(N^{1/10})$. Allowing for an $N$-dependent rank poses new challenges and requires new methods. Working in the Bayesian-optimal setting, we show that whenever the signal has i.i.d. entries, the limiting mutual information between signal and data is given by a variational formula involving a rank-one replica symmetric potential. In other words, from the information-theoretic perspective, the case of a (slowly) growing rank is the same as the case $M = 1$ (namely, the standard spiked Wigner model). The proof is primarily based on a novel multiscale cavity method that allows for growing rank, together with information-theoretic identities on the worst noise for the Gaussian vector channel. We believe that the cavity method developed here will play a role in the analysis of a broader class of inference and spin models where the degrees of freedom are large arrays instead of vectors.
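For orientation, the rank-one benchmark mentioned above can be written out explicitly. What follows is a sketch in notation that the abstract does not fix and that is assumed here: a signal-to-noise ratio $\lambda > 0$, a signal $X \in \mathbb{R}^{N \times M}$ with i.i.d. entries drawn from a prior $P_X$ with second moment $\rho$, a symmetric noise matrix $Z$ with standard Gaussian off-diagonal entries, and scalar variables $X_0 \sim P_X$ and $Z_0 \sim \mathcal{N}(0,1)$ that are independent. The scaling of the observation is likewise an assumption, chosen to match the usual spiked Wigner convention:

$$
Y = \sqrt{\frac{\lambda}{N}}\, X X^\top + Z .
$$

For $M = 1$, the replica symmetric formula for the spiked Wigner model reads, in this notation,

$$
\lim_{N \to \infty} \frac{1}{N}\, I(X; Y) = \min_{q \ge 0} \left\{ \frac{\lambda}{4}\,(q - \rho)^2 + I\!\left(X_0;\, \sqrt{\lambda q}\, X_0 + Z_0\right) \right\} .
$$

The claim of the abstract is that a variational formula built from this same rank-one potential governs the limit when $M = o(N^{1/10})$, once the mutual information is normalized appropriately for the rank. The abstract does not state that normalization, so it is left unspecified here.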