Deep neural networks (DNNs) are machine learning algorithms that have revolutionised computer vision through their remarkable success in tasks such as object classification and segmentation. This success has led to the suggestion that DNNs may also be good models of human visual perception. Here we review the evidence on whether current DNNs are adequate behavioural models of human core object recognition. To this end, we argue that it is important to distinguish between statistical tools and computational models, and to understand model quality as a multidimensional concept in which clarity about modelling goals is key. Reviewing a large number of psychophysical and computational explorations of core object recognition performance in humans and DNNs, we argue that DNNs are highly valuable scientific tools, but that as of today they should be regarded only as promising, not yet adequate, computational models of human core object recognition behaviour. Along the way, we dispel a number of myths surrounding DNNs in vision science.