The modern Hopfield network generalizes the classical Hopfield network by allowing sharper interaction functions. This increases the network's capacity as an autoassociative memory because nearby learned attractors do not interfere with one another. However, implementing the network requires raising the dot product of memory vectors and probe vectors to large exponents. When the data dimension is large, these powers can exceed the range of floating-point numbers, causing overflow in practical implementations. We describe this problem in detail, modify the original network description to mitigate it, and show that the modification does not alter the network's dynamics during update or training. We also show that our modification greatly improves hyperparameter selection for the modern Hopfield network, removing the hyperparameters' dependence on the interaction vertex and yielding an optimal hyperparameter region that, unlike in the original network, does not shift significantly with the interaction vertex.
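To make the overflow concrete, the following is a minimal NumPy sketch (our own illustration, not the paper's code). It assumes a rectified power interaction function F(z) = max(z, 0)^n, ±1-valued stored patterns, and rescaling by the largest dot-product magnitude; the paper's exact formulation and choice of scaling constant may differ. Because the interaction function is a pure power, dividing every dot product by the same positive constant leaves the normalized pattern weights unchanged while keeping each term within floating-point range.

```python
import numpy as np

# Illustrative setup (hypothetical parameters): K stored +/-1 patterns
# of dimension d, probed with one of the stored patterns.
rng = np.random.default_rng(0)
d, K, n = 10_000, 5, 100                 # large dimension, high interaction vertex
X = rng.choice([-1.0, 1.0], size=(K, d))
p = X[0].copy()

z = X @ p                                # dot products; z[0] == d == 1e4

# Naive evaluation overflows float64 (max ~1.8e308):
# (1e4)**100 == 1e400 -> inf, with a RuntimeWarning.
naive = np.maximum(z, 0.0) ** n

# Rescaling every dot product by the same positive constant c leaves the
# normalized weights unchanged, because
#   (z_i / c)**n / sum_j (z_j / c)**n == z_i**n / sum_j z_j**n,
# while keeping each term representable.
c = np.max(np.abs(z))
scaled = np.maximum(z / c, 0.0) ** n
weights = scaled / scaled.sum()          # finite for any d and n
print(weights)                           # mass concentrates on pattern 0
```

Since the rescaling is a strictly monotone transformation applied uniformly to all dot products, it preserves which pattern dominates, which is the sense in which such a modification can leave the update dynamics unchanged.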