Recent generalizations of the Hopfield model of associative memory can store a number $P$ of random patterns that grows exponentially with the number $N$ of neurons, $P=\exp(\alpha N)$. Besides this huge storage capacity, another interesting feature of these networks is their connection to the attention mechanism that is part of the Transformer architectures widely used in deep learning. In this work, we consider a generic family of pattern ensembles and, through the statistical mechanics analysis of an auxiliary Random Energy Model, provide exact asymptotic thresholds $\alpha_1$ for the retrieval of a typical pattern, and lower bounds on $\alpha_c$, the maximum load $\alpha$ for which all patterns can be retrieved. Additionally, we characterize the size of the basins of attraction. We discuss in detail the cases of Gaussian and spherical patterns, and show that they display rich and qualitatively different phase diagrams.