Reconstructing the images that subjects observe from fMRI data recorded during visual stimulation has made significant strides in the past decade, thanks to the availability of extensive fMRI datasets and advances in generative models for image generation. However, applications of visual reconstruction have remained limited. Reconstructing visual imagination presents a greater challenge, with potentially revolutionary applications ranging from aiding individuals with disabilities to verifying witness accounts in court. The primary hurdles in this field are the absence of data collection protocols for visual imagery and the lack of datasets on the subject. Traditionally, fMRI-to-image models rely on data collected from subjects exposed to visual stimuli, which is problematic for reconstructing visual imagery because brain activity differs between visual stimulation and visual imagery. For the first time, we have compiled a substantial dataset (around 6 hours of scans) on visual imagery, along with a proposed data collection protocol. We then train a modified version of an fMRI-to-image model and demonstrate the feasibility of reconstructing images from two modes of imagination: from memory and from pure imagination. This marks an important step towards creating a technology that allows direct reconstruction of visual imagery.