This work investigates the soundness and utility of a neural network-based approach as a framework for exploring how image enhancement techniques affect visual cortex activation. In a preliminary study, we prepare a set of state-of-the-art brain encoding models, selected from the top 10 methods that participated in The Algonauts Project 2023 Challenge [16], and analyze their ability to make valid predictions about the effects of various image enhancement techniques on neural responses. Because acquiring actual brain responses is infeasible given the high cost of brain imaging procedures, our investigation instead builds on a series of experiments with the models' predictions. Specifically, we assess how well brain encoders estimate the brain's response to various augmentations by evaluating their predictions for augmentations that target objects (i.e., faces and words) with a known impact on specific areas. Moreover, we study the predicted activation in response to objects unseen during training, exploring the impact of semantically out-of-distribution stimuli. We provide evidence for the generalization ability of the models that form the proposed framework, which appears promising for identifying the optimal visual augmentation filter for a given task, for model-driven design strategies, and for AR and VR applications.