The inference of a large symmetric signal-matrix $\mathbf{S} \in \mathbb{R}^{N\times N}$ corrupted by additive Gaussian noise is considered for two regimes of growth of the rank $M$ as a function of $N$. For sub-linear ranks $M=\Theta(N^\alpha)$ with $\alpha\in(0,1)$, the mutual information and minimum mean-square error (MMSE) are derived for two classes of signal-matrices: (a) $\mathbf{S}=\mathbf{X}\mathbf{X}^\intercal$, where the entries of $\mathbf{X}\in\mathbb{R}^{N\times M}$ are independent and identically distributed; (b) $\mathbf{S}$ sampled from a rotationally invariant distribution. Surprisingly, the formulas match those of the rank-one case. Two efficient algorithms are explored and conjectured to saturate the MMSE when no statistical-to-computational gap is present: (1) Decimation Approximate Message Passing; (2) a spectral algorithm based on a Rotation Invariant Estimator. For linear ranks $M=\Theta(N)$, the mutual information is rigorously derived for signal-matrices drawn from a rotationally invariant distribution. Close connections with scalar inference in free probability are uncovered, which allow us to deduce a simple formula for the MMSE as an integral involving only the limiting spectral measure of the data matrix. An interesting question is whether the information-theoretic phase transitions known for the rank-one case, and hence also for sub-linear rank, persist at linear rank. Our analysis suggests that only a smoothed-out trace of the transitions persists. Furthermore, the change of behavior between the low-rank and truly high-rank regimes happens only at the linear scale $\alpha=1$.
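To make the sub-linear-rank setting of class (a) concrete, the following is a minimal sketch of how such an observation could be simulated. Several choices here are assumptions that the abstract does not specify: the normalization $\sqrt{\lambda/N}$ of the signal (with `snr` standing in for $\lambda$), the standard Gaussian prior on the entries of $\mathbf{X}$, the unit variance of the diagonal noise entries, and the hypothetical function name `sample_observation`. It uses only the Python standard library and is meant for small $N$.

```python
import math
import random


def sample_observation(n, alpha, snr, seed=0):
    """Draw Y = sqrt(snr / n) * X X^T + Z with rank M ~ n**alpha (assumed scaling)."""
    rng = random.Random(seed)
    # Sub-linear rank M = Theta(N^alpha); the prefactor 1 is an arbitrary choice.
    m = max(1, round(n ** alpha))
    # Class (a): X has i.i.d. entries. A standard Gaussian prior is assumed here.
    x = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]
    # Signal S = X X^T is symmetric with rank at most m.
    s = [[sum(x[i][k] * x[j][k] for k in range(m)) for j in range(n)]
         for i in range(n)]
    # Symmetric Gaussian noise: draw the upper triangle and mirror it.
    # The diagonal variance convention (1 here) is also an assumption.
    z = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            z[i][j] = z[j][i] = rng.gauss(0.0, 1.0)
    scale = math.sqrt(snr / n)
    y = [[scale * s[i][j] + z[i][j] for j in range(n)] for i in range(n)]
    return m, s, y


# Example: N = 64 and alpha = 0.5 give a rank of 8.
m, s, y = sample_observation(n=64, alpha=0.5, snr=2.0)
```

The returned `y` plays the role of the data matrix from which $\mathbf{S}$ is to be inferred. Setting `alpha` close to 1 approaches the linear-rank regime, although the appropriate signal normalization there may differ from the one assumed above.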