While some convolutional neural networks (CNNs) have achieved great success in object recognition, they struggle to identify objects in images corrupted with common types of noise. Recently, it was shown that simulating computations in early visual areas at the front of CNNs improves robustness to image corruptions. Here, we further explore this result and show that the neuronal representations that emerge from precisely matching the distribution of receptive field (RF) properties found in primate V1 are key to this improvement in robustness. We built two variants of a model with a front-end modeling the primate primary visual cortex (V1): one sampling RF properties uniformly and the other sampling from empirical biological distributions. The model with biological sampling is considerably more robust to image corruptions than the uniform variant (relative difference of 8.72%). While similar neuronal sub-populations across the two variants have similar response properties and learn similar downstream weights, their impact on downstream processing is strikingly different. This result sheds light on the origin of the improvements in robustness observed in some biologically inspired models, pointing to the need to precisely mimic the neuronal representations found in the primate brain.
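The difference between the two sampling schemes can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the choice of spatial frequency as the RF property, and the histogram values are all hypothetical placeholders, not measured V1 data. The sketch assumes only that the empirical distribution is available as a histogram (bin edges and bin probabilities).

```python
import numpy as np


def sample_uniform(rng, n, low, high):
    """Draw n values of one RF property uniformly over [low, high)."""
    return rng.uniform(low, high, size=n)


def sample_empirical(rng, n, bin_edges, bin_probs):
    """Draw n values from a histogram-shaped distribution.

    A bin is chosen according to its probability, then a value is drawn
    uniformly inside that bin.
    """
    bin_edges = np.asarray(bin_edges, dtype=float)
    bin_probs = np.asarray(bin_probs, dtype=float)
    bin_probs = bin_probs / bin_probs.sum()  # normalize to sum to 1
    idx = rng.choice(len(bin_probs), size=n, p=bin_probs)
    return rng.uniform(bin_edges[idx], bin_edges[idx + 1])


# Hypothetical placeholder histogram for preferred spatial frequency
# (cycles/deg). These numbers are illustrative only, not empirical data.
SF_EDGES = [0.5, 1.0, 2.0, 4.0, 8.0]
SF_PROBS = [0.1, 0.3, 0.4, 0.2]

rng = np.random.default_rng(0)
sf_uniform = sample_uniform(rng, 256, SF_EDGES[0], SF_EDGES[-1])
sf_empirical = sample_empirical(rng, 256, SF_EDGES, SF_PROBS)
```

Both variants cover the same range of values. Only the relative frequency of each value differs, which is the factor the comparison between the two model variants isolates.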