Predictive coding (PC) accounts of perception are now among the dominant computational theories of the brain, prescribing a general algorithm for inference and learning over hierarchical latent probabilistic models. Despite this, they have seen little uptake in the broader field of machine learning, where comparable generative modelling techniques have flourished. In part, this is due to the poor performance of models trained with PC, as measured by both sample quality and marginal likelihood. Viewing PC as a variational Bayes algorithm under the Laplace approximation, we trace these deficits to the exclusion of an associated Hessian term from the PC objective function. This term would otherwise regularise the sharpness of the probability landscape and prevent over-certainty in the approximate posterior. To remedy this, we make three primary contributions. First, we propose a simple Monte Carlo estimate of the evidence lower bound that relies on sampling from the Hessian-parameterised variational posterior. Second, we derive a novel block-diagonal approximation to the full Hessian matrix that has lower memory requirements and favourable mathematical properties. Third, we present an algorithm that combines our method with standard PC to reduce memory complexity further. We evaluate models trained with our approach against the standard PC framework on image benchmark datasets. Our approach produces higher log-likelihoods and qualitatively better samples that more closely capture the diversity of the data-generating distribution.
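To make the role of the Hessian term concrete, the following is a minimal sketch of a Monte Carlo evidence lower bound under a Laplace (Gaussian) posterior whose covariance is the inverse negative Hessian of the log joint at its mode. It is not the authors' implementation: the single-layer linear-Gaussian model, the function names (`log_joint`, `laplace_posterior`, `mc_elbo`, `log_marginal`) and all parameter values are assumptions chosen for illustration, and the block-diagonal approximation is not shown. In this toy model the log joint is quadratic in the latent, so the Laplace posterior is exact and the bound should match the true log marginal likelihood.

```python
import numpy as np

def log_joint(x, z, W, sigma2):
    """log p(x, z) for the assumed toy model z ~ N(0, I), x | z ~ N(W z, sigma2 I).

    z may be a single latent vector or an (n, d_z) batch; returns an (n,) array.
    """
    z = np.atleast_2d(z)
    d_x, d_z = W.shape
    resid = x - z @ W.T
    log_lik = (-0.5 * np.sum(resid ** 2, axis=1) / sigma2
               - 0.5 * d_x * np.log(2 * np.pi * sigma2))
    log_prior = -0.5 * np.sum(z ** 2, axis=1) - 0.5 * d_z * np.log(2 * np.pi)
    return log_lik + log_prior

def laplace_posterior(x, W, sigma2):
    """Mode of log p(x, z) in z, and covariance = inverse of the negative Hessian."""
    d_z = W.shape[1]
    neg_hessian = np.eye(d_z) + W.T @ W / sigma2  # closed form for this toy model
    cov = np.linalg.inv(neg_hessian)
    mode = cov @ W.T @ x / sigma2
    return mode, cov

def mc_elbo(x, W, sigma2, n_samples=20000, seed=0):
    """Monte Carlo ELBO: mean of log p(x, z) over z ~ q, plus the entropy of q."""
    rng = np.random.default_rng(seed)
    mode, cov = laplace_posterior(x, W, sigma2)
    d_z = mode.shape[0]
    chol = np.linalg.cholesky(cov)
    z = mode + rng.standard_normal((n_samples, d_z)) @ chol.T  # z ~ N(mode, cov)
    entropy = 0.5 * d_z * np.log(2 * np.pi * np.e) + 0.5 * np.linalg.slogdet(cov)[1]
    return np.mean(log_joint(x, z, W, sigma2)) + entropy

def log_marginal(x, W, sigma2):
    """Exact log p(x) = log N(x; 0, W W^T + sigma2 I), used as a reference value."""
    d_x = W.shape[0]
    S = W @ W.T + sigma2 * np.eye(d_x)
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * (x @ np.linalg.solve(S, x) + logdet + d_x * np.log(2 * np.pi))

if __name__ == "__main__":
    W = np.array([[1.0, 0.5], [0.2, -1.0], [0.3, 0.8]])  # arbitrary example weights
    x = np.array([0.5, -1.0, 2.0])                        # arbitrary example datum
    sigma2 = 0.5
    mode, cov = laplace_posterior(x, W, sigma2)
    print("point objective log p(x, mode):", log_joint(x, mode, W, sigma2)[0])
    print("Monte Carlo ELBO:", mc_elbo(x, W, sigma2))
    print("exact log p(x):", log_marginal(x, W, sigma2))
```

An objective that evaluates only `log_joint` at the mode, which is how the abstract characterises standard PC, carries no dependence on `cov`. The entropy term in `mc_elbo` is where the log-determinant of the Hessian enters, so a sharper (more confident) posterior is penalised.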