When can the input of a ReLU neural network be inferred from its output? In other words, when is the network injective? We consider a single layer, $x \mapsto \mathrm{ReLU}(Wx)$, with a random Gaussian $m \times n$ matrix $W$, in a high-dimensional setting where $n, m \to \infty$. Recent work connects this problem to spherical integral geometry: by studying the expected Euler characteristic of a certain random set, it gives rise to a conjectured sharp injectivity threshold for the aspect ratio $\alpha = \frac{m}{n}$. We adopt a different perspective and show that injectivity is equivalent to a property of the ground state of the spherical perceptron, an important spin glass model in statistical physics. By leveraging the (non-rigorous) replica symmetry-breaking theory, we derive analytical equations for the threshold whose solution is at odds with that from the Euler characteristic. Furthermore, we use Gordon's min--max theorem to prove that a replica-symmetric upper bound refutes the Euler characteristic prediction. Along the way we aim to give a tutorial-style introduction to key ideas from statistical physics, in an effort to make the exposition accessible to a broad audience. Our analysis establishes a connection between spin glasses and integral geometry, but leaves open the problem of explaining the discrepancies.