Parallel Learning by Multitasking Neural Networks

A modern challenge of Artificial Intelligence is learning multiple patterns at once (i.e., parallel learning). While this cannot be accomplished by standard Hebbian associative neural networks, in this paper we show how the Multitasking Hebbian Network (a variation on the theme of the Hopfield model, working on sparse data-sets) is naturally able to perform this complex task. We focus on systems processing in parallel a finite number of patterns (up to logarithmic growth in the size of the network), mirroring the low-storage level of standard associative neural networks at work with pattern recognition. For mild dilution in the patterns, the network handles them hierarchically, distributing the amplitudes of their signals as power-laws w.r.t. their information content (hierarchical regime), while, for strong dilution, all the signals pertaining to all the patterns are raised with the same strength (parallel regime). Further, confined to the low-storage setting (i.e., far from the spin glass limit), the presence of a teacher neither alters the multitasking performance nor changes the thresholds for learning: the latter are the same whether the training protocol is supervised or unsupervised. Results obtained through statistical mechanics, the signal-to-noise technique and Monte Carlo simulations are overall in perfect agreement and carry interesting insights on learning multiple patterns at once: for instance, whenever the cost function of the model is minimized in parallel on several patterns (in its description via Statistical Mechanics), the same happens to the standard sum-squared error Loss function (typically used in Machine Learning).
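To make the hierarchical regime concrete, here is a minimal Python sketch of how diluted patterns can be retrieved simultaneously. The abstract does not spell out the model, so everything below is an assumption made for illustration: pattern entries are taken in {-1, 0, +1} with a blank (0) appearing with probability `d`, the couplings are the usual Hebbian ones, and the helper names (`sample_diluted_patterns`, `hierarchical_state`, `mattis_overlaps`, `aligned_fraction`) are hypothetical. The state is built by hand rather than learned: each neuron copies the first pattern that is not blank at its site. By construction, the overlap with the k-th pattern (counting from 0) is then close to (1 - d) d^k, so all patterns carry a nonzero signal at once, with decaying amplitudes.

```python
import random

def sample_diluted_patterns(n_patterns, n_neurons, dilution, rng):
    """Entries are 0 with probability `dilution`, else +1 or -1 with equal odds."""
    return [
        [0 if rng.random() < dilution else rng.choice((-1, 1))
         for _ in range(n_neurons)]
        for _ in range(n_patterns)
    ]

def hierarchical_state(patterns, rng):
    """Each neuron copies the first pattern that is non-blank at its site.
    Neurons that are blank in every pattern get a random sign."""
    state = []
    for i in range(len(patterns[0])):
        entry = next((p[i] for p in patterns if p[i] != 0), 0)
        state.append(entry if entry != 0 else rng.choice((-1, 1)))
    return state

def mattis_overlaps(patterns, state):
    """Overlap m_mu = (1/N) * sum_i xi_i^mu * sigma_i for each pattern mu."""
    n = len(state)
    return [sum(x * s for x, s in zip(p, state)) / n for p in patterns]

def aligned_fraction(patterns, state):
    """Fraction of neurons whose local field does not oppose their state.
    Assumed Hebbian couplings J_ij = (1/N) * sum_mu xi_i^mu * xi_j^mu
    (self-coupling included), so that h_i = sum_mu xi_i^mu * m_mu."""
    overlaps = mattis_overlaps(patterns, state)
    aligned = 0
    for i, s in enumerate(state):
        field = sum(p[i] * m for p, m in zip(patterns, overlaps))
        if field * s >= 0:
            aligned += 1
    return aligned / len(state)

rng = random.Random(0)   # fixed seed, for reproducibility only
d = 0.3                  # assumed mild dilution
patterns = sample_diluted_patterns(3, 2000, d, rng)
state = hierarchical_state(patterns, rng)

overlaps = mattis_overlaps(patterns, state)
expected = [(1 - d) * d**k for k in range(len(patterns))]
for k, (m, e) in enumerate(zip(overlaps, expected)):
    print(f"pattern {k}: overlap {m:.3f}, construction predicts about {e:.3f}")
print(f"neurons aligned with their field: {aligned_fraction(patterns, state):.3f}")
```

The alignment check is only a zero-noise consistency test for this hand-built configuration. It does not reproduce the statistical-mechanics analysis, the learning thresholds, or the Monte Carlo simulations mentioned above.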

