For the last decade, convolutional neural networks (CNNs) have vastly outperformed their predecessors in nearly all vision tasks in artificial intelligence, including object recognition. However, despite abundant advancements, they continue to pale in comparison to biological vision. This gap has prompted the development of biologically inspired models that attempt to mimic the human visual system, primarily at the neural level, and that are evaluated using standard dataset benchmarks. However, more work is needed to understand how these models perceive the visual world. This article proposes a state-of-the-art procedure that generates a new metric, Psychophysical-Score, which is grounded in visual psychophysics and is capable of reliably estimating perceptual responses across numerous models spanning a wide range of complexity and biological inspiration. We perform the procedure on twelve models that vary in degree of biological inspiration and complexity, and we compare the results against the aggregated results of 2,390 Amazon Mechanical Turk workers, who together provided ~2.7 million perceptual responses. Each model's Psychophysical-Score is compared against the state-of-the-art neural activity-based metric, Brain-Score. Our study indicates that models with a high correlation to human perceptual behavior also have a high correlation with the corresponding neural activity.