SimSiam is a prominent self-supervised learning method that achieves impressive results on various vision tasks in static environments. However, it has two critical issues: high sensitivity to hyperparameters, especially weight decay, and unsatisfactory performance in online and continual learning, settings in which, neuroscientists believe, powerful memory functions like those of the brain are necessary. In this paper, we propose PhiNet, inspired by a hippocampal model based on the temporal prediction hypothesis. Unlike SimSiam, which aligns two augmented views of the original image, PhiNet integrates an additional predictor block that estimates the representation of the original image, imitating the CA1 region of the hippocampus. Moreover, inspired by Complementary Learning Systems theory, we model the neocortex with a momentum encoder block that acts as a slow learner and serves as long-term memory. By analysing the learning dynamics, we show that the additional predictor helps PhiNet avoid the complete collapse of learned representations, a notorious challenge in non-contrastive learning. This dynamics analysis may partially corroborate the biological plausibility of this hippocampal model. Experimental results demonstrate that PhiNet is more robust to weight decay than SimSiam and outperforms it in memory-intensive tasks such as online and continual learning.
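To make the architecture described above concrete, the following is a minimal, dependency-free sketch of how the two prediction paths and the slow learner might fit together. It is an illustration under stated assumptions, not the implementation used in the paper: the names `phinet_style_loss`, `pred_h`, `pred_g`, `enc_slow`, and `ema_update` are hypothetical, the encoder and predictors are reduced to single linear maps, the second term is assumed to use negative cosine similarity, and the two terms are assumed to be simply added. Only the forward computation is shown; in actual training, the targets would be treated as constants (stop-gradient), and the SimSiam term is usually symmetrised over the two views.

```python
import math

def neg_cosine(p, z):
    """Negative cosine similarity; -1.0 means the two vectors are perfectly aligned."""
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)

def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def ema_update(slow, fast, momentum=0.99):
    """Move the slow (momentum) encoder weights a small step toward the fast encoder."""
    return [[momentum * s + (1.0 - momentum) * f for s, f in zip(s_row, f_row)]
            for s_row, f_row in zip(slow, fast)]

def phinet_style_loss(x, x1, x2, enc, enc_slow, pred_h, pred_g):
    """Forward pass only. x is the original input; x1 and x2 are its augmented views."""
    z1, z2 = linear(enc, x1), linear(enc, x2)
    p1 = linear(pred_h, z1)
    # SimSiam-style term: the prediction from view 1 should match the encoding of view 2.
    align_views = neg_cosine(p1, z2)
    # Assumed additional term: a second predictor estimates the representation of
    # the original input, here produced by the slow (momentum) encoder.
    target = linear(enc_slow, x)
    predict_original = neg_cosine(linear(pred_g, p1), target)
    return align_views + predict_original
```

After each optimisation step on the fast encoder, a call such as `enc_slow = ema_update(enc_slow, enc)` would let the slow encoder track it gradually, which is the sense in which it acts as a slow learner.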