Restricted Boltzmann Machines (RBMs) are generative models designed to learn from data with a rich underlying structure. In this work, we explore a teacher-student setting where a student RBM learns from examples generated by a teacher RBM, with a focus on the effect of the unit priors on learning efficiency. We consider a parametric class of priors that interpolate between continuous (Gaussian) and binary variables. This approach models various possible choices of visible units, hidden units, and weights for both the teacher and student RBMs. By analyzing the phase diagram of the posterior distribution in both the Bayes optimal and mismatched regimes, we demonstrate the existence of a triple point that defines the critical dataset size necessary for learning through generalization. The critical size is strongly influenced by the properties of the teacher, and thus the data, but is unaffected by the properties of the student RBM. Nevertheless, a prudent choice of student priors can facilitate training by expanding the so-called signal retrieval region, where the machine generalizes effectively.
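The abstract's parametric prior family, interpolating between continuous (Gaussian) and binary units, can be illustrated with a minimal sketch. The parametrization below (a simple linear mix controlled by `lam`) is a hypothetical choice for illustration, not necessarily the one used in the paper:

```python
import numpy as np

def sample_interpolating_prior(n, lam, rng):
    """Draw n unit values from an illustrative prior that interpolates
    between a standard Gaussian (lam=0) and binary +/-1 spins (lam=1).
    Intermediate lam mixes the two regimes."""
    g = rng.standard_normal(n)
    return (1.0 - lam) * g + lam * np.sign(g)

rng = np.random.default_rng(0)
x_gauss = sample_interpolating_prior(5, 0.0, rng)   # continuous Gaussian values
x_binary = sample_interpolating_prior(5, 1.0, rng)  # values in {-1, +1}
```

In a teacher-student experiment, such a prior would be applied independently to the visible units, hidden units, and weights of both machines, letting one scan the continuous-to-binary axis for each component.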