Variational autoencoders (VAEs) employ Bayesian inference to interpret sensory inputs, mirroring processes that occur in primate vision across both ventral (Higgins et al., 2021) and dorsal (Vafaii et al., 2023) pathways. Despite their success, traditional VAEs rely on continuous latent variables, which deviates sharply from the discrete nature of biological neurons. Here, we developed the Poisson VAE (P-VAE), a novel architecture that combines principles of predictive coding with a VAE that encodes inputs into discrete spike counts. Combining Poisson-distributed latent variables with predictive coding introduces a metabolic cost term in the model loss function, suggesting a relationship with sparse coding, which we verify empirically. Additionally, we analyze the geometry of learned representations, comparing the P-VAE with alternative VAE models. We find that the P-VAE encodes its inputs in relatively higher dimensions, facilitating linear separability of categories in a downstream classification task with 5x better sample efficiency. Our work provides an interpretable computational framework to study brain-like sensory processing and paves the way for a deeper understanding of perception as an inferential process.
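To make the metabolic cost term concrete, the following is a minimal sketch rather than the implementation used in this work. It relies on the standard closed-form KL divergence between two Poisson distributions. The parameterization is an assumption made for illustration: the posterior rate is written as a prior rate times a multiplicative residual, and the names `prior_rate`, `residual`, and all function names are hypothetical. Under that assumption the KL term factorizes into the prior rate times a function of the residual alone, so it acts like a penalty that grows with firing rate.

```python
import math


def poisson_kl(rate_q: float, rate_p: float) -> float:
    """Closed-form KL( Poisson(rate_q) || Poisson(rate_p) ) in nats (rates > 0)."""
    return rate_q * math.log(rate_q / rate_p) - rate_q + rate_p


def rate_scaled_penalty(prior_rate: float, residual: float) -> float:
    """Same KL, rewritten under the ASSUMED parameterization
    rate_q = prior_rate * residual and rate_p = prior_rate.

    The result is prior_rate * f(residual), with
    f(y) = 1 - y + y * log(y), which is zero only at y = 1.
    """
    return prior_rate * (1.0 - residual + residual * math.log(residual))


def poisson_kl_numeric(rate_q: float, rate_p: float, max_count: int = 200) -> float:
    """Brute-force check: sum q(k) * (log q(k) - log p(k)) over spike counts k."""
    total = 0.0
    for k in range(max_count):
        log_q = k * math.log(rate_q) - rate_q - math.lgamma(k + 1)
        log_p = k * math.log(rate_p) - rate_p - math.lgamma(k + 1)
        total += math.exp(log_q) * (log_q - log_p)
    return total


if __name__ == "__main__":
    prior_rate, residual = 1.5, 2.0  # hypothetical values for illustration
    print(poisson_kl(prior_rate * residual, prior_rate))
    print(rate_scaled_penalty(prior_rate, residual))
    print(poisson_kl_numeric(prior_rate * residual, prior_rate))
```

All three printed values should agree up to floating-point error. For a fixed residual, doubling the prior rate doubles the penalty, which is the sense in which the term resembles a cost on spiking activity.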