Recent generalizations of the Hopfield model of associative memory are able to store a number $P$ of random patterns that grows exponentially with the number $N$ of neurons, $P=\exp(\alpha N)$. Besides this huge storage capacity, another interesting feature of these networks is their connection to the attention mechanism at the core of the Transformer architectures widely used in deep learning. In this work, we study a generic family of pattern ensembles using a statistical mechanics analysis that yields exact asymptotic thresholds for the retrieval of a typical pattern, $\alpha_1$, lower bounds on the maximal load $\alpha$ for which all patterns can be retrieved, $\alpha_c$, and the sizes of the attraction basins. We discuss in detail the cases of Gaussian and spherical patterns, and show that they display rich and qualitatively different phase diagrams.
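To make the connection to attention concrete, a minimal sketch of the retrieval dynamics commonly used for these exponential-capacity models is the following; the symbols $\sigma^{t}$, $\Xi$, and $\beta$ are notation introduced here for illustration and do not appear in the abstract above:
$$
\sigma^{t+1} \;=\; \sum_{\mu=1}^{P} \xi^{\mu}\,
\frac{e^{\beta\, \xi^{\mu}\cdot \sigma^{t}}}{\sum_{\nu=1}^{P} e^{\beta\, \xi^{\nu}\cdot \sigma^{t}}}
\;=\; \Xi\, \operatorname{softmax}\!\left(\beta\, \Xi^{\top} \sigma^{t}\right),
$$
where $\Xi=(\xi^{1},\dots,\xi^{P})$ is the $N\times P$ matrix of stored patterns and $\sigma^{t}$ is the network state. Reading $\sigma^{t}$ as a query and the columns of $\Xi$ as keys and values, one update step has the same form as a single attention head, with the inverse temperature $\beta$ playing the role of the usual $1/\sqrt{d}$ scaling.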