In this paper, we investigate the feature encoding process in a prototypical energy-based generative model, the Restricted Boltzmann Machine (RBM). We start with an analytical investigation using simplified architectures and data structures, and end with a numerical analysis of actual training runs on real data sets. Our study tracks the evolution of the model's weight matrix through its singular value decomposition, revealing a series of phase transitions associated with the progressive learning of the principal modes of the empirical probability distribution. The model first learns the center of mass of the modes and then progressively resolves all modes through a cascade of phase transitions. We first describe this process in a controlled setup in which the training dynamics can be studied analytically. We then validate our theoretical results by training the Bernoulli-Bernoulli RBM on real data sets. By using data sets of increasing dimension, we show that learning indeed leads to sharp phase transitions in the high-dimensional limit. Moreover, we propose and test a mean-field finite-size scaling hypothesis. This shows that the first phase transition is in the same universality class as the one we studied analytically, which is reminiscent of the mean-field paramagnetic-to-ferromagnetic phase transition.
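As a rough illustration of what tracking the weight matrix through its singular value decomposition can look like in practice, the sketch below trains a small Bernoulli-Bernoulli RBM and records the singular values of the weight matrix after every epoch. Everything specific here is an assumption made for the example and is not taken from the paper: the one-step contrastive divergence (CD-1) update, the synthetic two-cluster binary data, the hyperparameters, and the helper names `train_rbm_track_svd` and `make_two_cluster_data`.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_rbm_track_svd(data, n_hidden=8, epochs=50, lr=0.05,
                        batch_size=50, seed=0):
    """Train a Bernoulli-Bernoulli RBM with CD-1 (an assumed training scheme)
    and return the weights plus the singular values of W after each epoch."""
    rng = np.random.default_rng(seed)
    n_samples, n_visible = data.shape
    W = 1e-3 * rng.standard_normal((n_visible, n_hidden))  # small random init
    b = np.zeros(n_visible)  # visible biases
    c = np.zeros(n_hidden)   # hidden biases
    history = []
    for _ in range(epochs):
        perm = rng.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            v0 = data[perm[start:start + batch_size]]
            # Positive phase: hidden activations conditioned on the data.
            ph0 = sigmoid(v0 @ W + c)
            h0 = (rng.random(ph0.shape) < ph0).astype(float)
            # Negative phase: a single Gibbs step back to the visible layer.
            pv1 = sigmoid(h0 @ W.T + b)
            v1 = (rng.random(pv1.shape) < pv1).astype(float)
            ph1 = sigmoid(v1 @ W + c)
            m = v0.shape[0]
            W += lr * (v0.T @ ph0 - v1.T @ ph1) / m
            b += lr * (v0 - v1).mean(axis=0)
            c += lr * (ph0 - ph1).mean(axis=0)
        # Singular values of W, returned by NumPy in descending order.
        history.append(np.linalg.svd(W, compute_uv=False))
    return W, np.array(history)


def make_two_cluster_data(n_samples=400, n_visible=20, flip=0.1, seed=1):
    """Hypothetical toy data: noisy copies of a binary pattern and its complement."""
    rng = np.random.default_rng(seed)
    pattern = rng.integers(0, 2, size=n_visible)
    labels = rng.integers(0, 2, size=n_samples)
    clean = np.where(labels[:, None] == 1, pattern, 1 - pattern)
    noise = rng.random((n_samples, n_visible)) < flip
    return np.where(noise, 1 - clean, clean).astype(float)


data = make_two_cluster_data()
W, history = train_rbm_track_svd(data)
# Leading three singular values after the first and the last epoch.
print(history[[0, -1], :3])
```

Plotting each column of `history` against the epoch index gives one curve per singular value. In a toy setting like this one, a mode being learned would be expected to show up as a singular value detaching from the bulk, although such a small example is not meant to reproduce the sharp transitions of the high-dimensional limit.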