Humans learn quickly even in tasks that contain complex visual information. This is due in part to the efficient formation of compressed representations of visual information, which allow for better generalization and robustness. However, compressed representations alone are insufficient to explain the high speed of human learning. Reinforcement learning (RL) models that seek to replicate this efficiency may do so by using factored representations of tasks. These informationally simple task representations share their motivation with compressed representations of visual information. Recent studies have connected biological visual perception to disentangled and compressed representations. This raises the question of how humans learn to represent visual information efficiently in a manner useful for learning tasks. In this paper we present a model of human factored representation learning based on an altered form of a $\beta$-Variational Auto-encoder, applied to a visual learning task. Modelling results demonstrate a trade-off, governed by the informational complexity of the model's latent space, between the speed of learning and the accuracy of reconstructions.
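For reference, the standard (unaltered) $\beta$-VAE objective that the model above builds on can be written as follows. This is the commonly used textbook form, not the altered form used in this paper; the symbols $q_\phi$ (encoder), $p_\theta$ (decoder), $p(z)$ (prior), $x$ (input image), and $z$ (latent code) are notation assumed here for illustration.

$$
\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \beta \, D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
$$

In this form, a larger $\beta$ penalizes the information carried by the latent code more strongly, typically at some cost to reconstruction accuracy, which is the kind of trade-off the modelling results concern.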