As a replacement for data augmentation, this paper proposes SLAP, a method that intensifies experience in order to speed up machine learning and reduce the sample size. SLAP is a model-independent protocol/function that produces the same output for different transformation variants of an input. In experiments with Gomoku game states, SLAP improved the convergence speed of convolutional neural network learning by 83% while using only one eighth of the sample size required by data augmentation. In reinforcement learning for Gomoku, with the AlphaGo Zero/AlphaZero algorithm with data augmentation as the baseline, SLAP reduced the number of training samples by a factor of 8 and achieved a similar winning rate against the same evaluator, although it was not yet evident that it could speed up reinforcement learning. The benefits should apply at least to domains that are invariant to symmetry or to certain transformations. As future work, SLAP may aid more explainable learning and transfer learning in domains that are not invariant to symmetry, as a small step towards artificial general intelligence.
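The idea of producing the same output for different transformation variants can be illustrated with a minimal sketch. The abstract does not spell out how SLAP selects its output, so the rule used here (keep the variant whose flattened cells sort largest) is an assumption chosen only for illustration, and the names `variants` and `canonical` are hypothetical. The sketch also assumes a square board encoded as a NumPy integer array and the 8 symmetries given by rotations and reflections.

```python
import numpy as np


def variants(board):
    """Return the 8 rotation/reflection variants of a square board."""
    out = []
    for k in range(4):
        rotated = np.rot90(board, k)      # rotate by k * 90 degrees
        out.append(rotated)
        out.append(np.fliplr(rotated))    # mirrored copy of that rotation
    return out


def canonical(board):
    """Map every variant of a board to one shared representative.

    Assumed rule (not taken from the paper): pick the variant whose
    flattened cell values are lexicographically largest.
    """
    return max(variants(board), key=lambda v: v.ravel().tolist())


# Hypothetical 5x5 position: 0 = empty, 1 and -1 = the two players.
rng = np.random.default_rng(0)
board = rng.integers(-1, 2, size=(5, 5))
reference = canonical(board)
same = all(np.array_equal(canonical(v), reference) for v in variants(board))
```

Under these assumptions, data augmentation would feed all 8 variants of a state to the learner, whereas a canonicalizing function feeds a single representative, which is consistent with the one-eighth sample size reported above.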