Generative autoregressive neural networks (ARNNs) have recently demonstrated exceptional results in image and language generation tasks, contributing to the growing popularity of generative models in both scientific and commercial applications. This work presents an exact mapping of the Boltzmann distribution of binary pairwise interacting systems into autoregressive form. In the resulting ARNN architecture, the weights and biases of the first layer correspond to the Hamiltonian's couplings and external fields, and widely used structures such as residual connections and a recurrent architecture appear with clear physical meanings. Moreover, the explicit formulation of the architecture enables the use of statistical physics techniques to derive new ARNNs for specific systems. As examples, new effective ARNN architectures are derived from two well-known mean-field systems, the Curie-Weiss and Sherrington-Kirkpatrick models; they outperform other commonly used architectures in approximating the Boltzmann distributions of the corresponding physical models. The connection established between the physics of the system and the neural network architecture provides a means to derive new architectures for different interacting systems and to interpret existing ones from a physical perspective.
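To make the notion of an autoregressive form concrete, the following minimal sketch checks numerically that the Boltzmann distribution of a small binary pairwise system factorizes exactly into a product of conditionals, P(s) = ∏_i P(s_i | s_1, ..., s_{i-1}). It is not the ARNN architecture derived in this work: the conditionals are obtained by brute-force marginalization over all 2^n configurations, which is feasible only for very small n. The spin convention s_i ∈ {-1, +1}, the sign convention H(s) = -Σ_{i<j} J_ij s_i s_j - Σ_i h_i s_i, the random couplings, and all function names are assumptions made for illustration.

```python
import itertools
import numpy as np

def energy(s, J, h):
    # Assumed convention: H(s) = -sum_{i<j} J_ij s_i s_j - sum_i h_i s_i,
    # with J symmetric and zero on the diagonal (hence the factor 0.5).
    return -0.5 * s @ J @ s - h @ s

def boltzmann_table(J, h, beta):
    """Enumerate all 2^n spin configurations and their Boltzmann probabilities."""
    n = len(h)
    states = np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)
    energies = np.array([energy(s, J, h) for s in states])
    weights = np.exp(-beta * energies)
    return states, weights / weights.sum()

def conditional_up(i, prefix, states, probs):
    """P(s_i = +1 | s_0..s_{i-1} = prefix), by brute-force marginalization."""
    mask = np.ones(len(states), dtype=bool)
    for k, v in enumerate(prefix):
        mask &= states[:, k] == v
    denom = probs[mask].sum()
    numer = probs[mask & (states[:, i] == 1)].sum()
    return numer / denom

def autoregressive_prob(s, states, probs):
    """Product of the conditionals along the chain s_0, s_1, ..., s_{n-1}."""
    p = 1.0
    for i in range(len(s)):
        p_up = conditional_up(i, s[:i], states, probs)
        p *= p_up if s[i] == 1 else 1.0 - p_up
    return p

# Hypothetical small instance with random symmetric couplings and fields.
rng = np.random.default_rng(0)
n, beta = 4, 0.7
J = np.triu(rng.normal(size=(n, n)), 1)
J = J + J.T
h = rng.normal(size=n)

states, probs = boltzmann_table(J, h, beta)
max_err = max(
    abs(autoregressive_prob(s, states, probs) - p)
    for s, p in zip(states, probs)
)
print("largest deviation between the two forms:", max_err)
```

The deviation should vanish up to floating-point rounding, since the chain rule of probability holds for any distribution. The exponential cost of computing each conditional by enumeration is what an ARNN is meant to avoid, by representing the conditionals with a parameterized network.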