Second Sight: Using brain-optimized encoding models to align image distributions with human brain activity

Two recent developments have accelerated progress in image reconstruction from human brain activity: large datasets that offer samples of brain activity in response to many thousands of natural scenes, and the open-sourcing of powerful stochastic image-generators that accept both low- and high-level guidance. Most work in this space has focused on obtaining point estimates of the target image, with the ultimate goal of approximating literal pixel-wise reconstructions of target images from the brain activity patterns they evoke. This emphasis belies the fact that there is always a family of images that are equally compatible with any evoked brain activity pattern, and the fact that many image-generators are inherently stochastic and do not by themselves offer a method for selecting the single best reconstruction from among the samples they generate. We introduce a novel reconstruction procedure (Second Sight) that iteratively refines an image distribution to explicitly maximize the alignment between the predictions of a voxel-wise encoding model and the brain activity patterns evoked by any target image. We show that our process converges on a distribution of high-quality reconstructions by refining both semantic content and low-level image details across iterations. Images sampled from these converged image distributions are competitive with state-of-the-art reconstruction algorithms. Interestingly, the time-to-convergence varies systematically across visual cortex, with earlier visual areas generally taking longer and converging on narrower image distributions, relative to higher-level brain areas. Second Sight thus offers a succinct and novel method for exploring the diversity of representations across visual brain areas.
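The iterative refinement described in the abstract can be illustrated with a toy search loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: "images" are stand-in vectors, the encoding model is a hypothetical random linear map, the stochastic generator is replaced by Gaussian sampling around the current best sample, and alignment is measured with Pearson correlation. The names `refine_distribution`, `encode`, and `sample_around` are illustrative and do not come from the paper.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation between two 1-D arrays (0.0 if either is constant)."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def refine_distribution(target_activity, encode, sample_around, seed,
                        n_iter=6, n_samples=20, noise0=1.0, decay=0.6, rng=None):
    """Iteratively narrow a sampling distribution toward samples whose
    predicted activity best matches the measured activity.

    Assumption: the distribution is summarized by a center and a noise
    scale; the real method uses a guided image generator instead.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    center = seed
    best_score = pearson(encode(center), target_activity)
    history = [best_score]
    noise = noise0
    for _ in range(n_iter):
        # Draw candidates from the current distribution.
        samples = [sample_around(center, noise, rng) for _ in range(n_samples)]
        # Score each candidate by encoding-model alignment with the brain data.
        scores = [pearson(encode(s), target_activity) for s in samples]
        i = int(np.argmax(scores))
        # Keep the best candidate only if it improves alignment.
        if scores[i] > best_score:
            center, best_score = samples[i], scores[i]
        history.append(best_score)
        noise *= decay  # narrow the distribution on each iteration
    return center, history

# Toy setup (all hypothetical): 16-d "images", 50 "voxels".
rng = np.random.default_rng(42)
W = rng.normal(size=(50, 16))

def encode(x):
    # Hypothetical linear encoding model: image vector -> predicted voxel activity.
    return W @ x

def sample_around(center, scale, rng):
    # Stand-in for a stochastic image generator guided by the current center.
    return center + scale * rng.normal(size=center.shape)

target_image = rng.normal(size=16)
target_activity = encode(target_image) + 0.1 * rng.normal(size=50)

best, history = refine_distribution(target_activity, encode, sample_around,
                                    seed=np.zeros(16), rng=rng)
print([round(h, 3) for h in history])
```

By construction the alignment score never decreases across iterations, and the shrinking noise scale mirrors the narrowing of the image distribution that the abstract describes. How quickly the score plateaus in this toy depends entirely on the assumed sampler and decay rate, so it says nothing about the convergence behavior reported for real visual areas.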
